Page10/20
Convolutional Neural Networks (CNN) β Image Processing Β· Page 2 of 2
Pooling, Flattening & CNN Architecture
Pooling (Dimensionality Reduction)
After convolution, feature maps are still large. Pooling reduces size while keeping important info.
Max Pooling
Take the maximum value in each window:
Input (4Γ4):
[1 2 | 3 4]
[5 6 | 7 8]
-----+-----
[9 10| 11 12]
[13 14| 15 16]
Max Pool (2Γ2 window):
[6 8] β Max of [1,2,5,6], [3,4,7,8], etc.
[14 16]
Why? Keeps strongest activation (most relevant feature).
Average Pooling
Take average instead of max:
[1 2 | 3 4] Average Pool (2Γ2):
[5 6 | 7 8] [3.5 6.5]
-----+-----
[9 10| 11 12] [11.5 14.5]
[13 14| 15 16]
Full CNN Architecture
Input Image (224Γ224Γ3)
β
Conv (64 filters, 3Γ3, ReLU) β 224Γ224Γ64
β
MaxPool (2Γ2) β 112Γ112Γ64 (reduced!)
β
Conv (128 filters) β 112Γ112Γ128
β
MaxPool (2Γ2) β 56Γ56Γ128
β
Conv (256 filters) β 56Γ56Γ256
β
MaxPool (2Γ2) β 28Γ28Γ256
β
Flatten β 200,704 values
β
Dense (512, ReLU) β 512
β
Dropout (0.5) β Drop half randomly
β
Dense (1000, Softmax) β Final output
Common CNN Architectures
| Architecture | Year | Key Innovation |
|---|---|---|
| LeNet | 1998 | First CNN (MNIST) |
| AlexNet | 2012 | Deep CNN + GPUs |
| VGG | 2014 | Showed depth matters |
| ResNet | 2015 | Residual connections (skip) |
| Inception | 2015 | Multi-scale convolutions |
| MobileNet | 2017 | Lightweight for phones |
Use pre-trained: Don't train CNNs from scratch! Load pre-trained weights (ImageNet).
Why CNNs Work
- Local connectivity: Only nearby pixels connected
- Weight sharing: Same filter across all positions
- Hierarchical learning: Build up from edges to objects
- Translation invariance: Same object detected regardless of position
main.py
Loading...
OUTPUT
βΆClick "Run Code" to executeβ¦