Convolutional Neural Networks (CNN) - Image Processing · Page 2 of 2

Pooling, Flattening & CNN Architecture

Pooling (Dimensionality Reduction)

After convolution, feature maps are still large. Pooling shrinks their height and width while keeping the most important information in each region.

Max Pooling

Take the maximum value in each window:

Input (4×4):
[1 2 | 3 4]
[5 6 | 7 8]
-----+-----
[9 10| 11 12]
[13 14| 15 16]

Max Pool (2×2 window):
[6  8]     ← Max of [1,2,5,6], [3,4,7,8], etc.
[14 16]

Why? It keeps the strongest activation in each window, which is the clearest sign that a feature is present there.
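The 4×4 example above can be reproduced with a few lines of plain Python. This is a minimal sketch, not a library implementation: the function name `max_pool2d` and its `size` and `stride` parameters are chosen here for illustration, and it assumes a non-overlapping 2×2 window (stride 2), as in the diagram.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Max-pool a 2D list of numbers with a square window (illustrative helper)."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            # Collect every value inside the current window.
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 8], [14, 16]]
```

Each output value replaces four input values, so the 4×4 map becomes 2×2.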

Average Pooling

Take average instead of max:

Input (4×4):
[1 2 | 3 4]
[5 6 | 7 8]
-----+-----
[9 10| 11 12]
[13 14| 15 16]

Average Pool (2×2 window):
[3.5  5.5]     ← Average of [1,2,5,6], [3,4,7,8], etc.
[11.5 13.5]
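Average pooling on the same 4×4 input can be checked by hand or with a short sketch like the one below. The helper name `avg_pool2d` is an assumption made for this example, and it again assumes a non-overlapping 2×2 window.

```python
def avg_pool2d(feature_map, size=2, stride=2):
    """Average-pool a 2D list of numbers with a square window (illustrative helper)."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            # Mean of the window instead of its maximum.
            pooled_row.append(sum(window) / len(window))
        pooled.append(pooled_row)
    return pooled


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
averaged = avg_pool2d(feature_map)
print(averaged)  # [[3.5, 5.5], [11.5, 13.5]]
```

For example, the top-right window is [3, 4, 7, 8], whose mean is 22 / 4 = 5.5.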

Full CNN Architecture

Input Image (224×224×3)
      ↓
Conv (64 filters, 3×3, ReLU) → 224×224×64
      ↓
MaxPool (2×2) → 112×112×64  (reduced!)
      ↓
Conv (128 filters) → 112×112×128
      ↓
MaxPool (2×2) → 56×56×128
      ↓
Conv (256 filters) → 56×56×256
      ↓
MaxPool (2×2) → 28×28×256
      ↓
Flatten → 200,704 values
      ↓
Dense (512, ReLU) → 512
      ↓
Dropout (0.5) → Drop half randomly
      ↓
Dense (1000, Softmax) → Final output
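The shapes in the diagram above can be traced with simple arithmetic. The sketch below is not a trainable network; it only tracks (height, width, channels). It assumes every Conv layer uses "same" padding with stride 1 (so height and width are unchanged) and every MaxPool uses a 2×2 window with stride 2. The diagram implies this but does not state it. The helper names `conv_same` and `max_pool` are made up for this example.

```python
def conv_same(shape, filters):
    """Conv with 'same' padding and stride 1 (assumed): only the channel count changes."""
    height, width, _ = shape
    return (height, width, filters)


def max_pool(shape, size=2):
    """Non-overlapping pooling: height and width shrink by the window size."""
    height, width, channels = shape
    return (height // size, width // size, channels)


shape = (224, 224, 3)          # input image
shapes = [shape]
for filters in (64, 128, 256):
    shape = conv_same(shape, filters)
    shapes.append(shape)
    shape = max_pool(shape)
    shapes.append(shape)

flat = shape[0] * shape[1] * shape[2]   # Flatten: 28 * 28 * 256
dense_params = flat * 512 + 512         # weights + biases of Dense (512)

for s in shapes:
    print(s)
print(flat)          # 200704
print(dense_params)  # 102760960
```

Under these assumptions, the first Dense layer alone holds about 103 million parameters, because every one of the 200,704 flattened values connects to each of the 512 units.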

Common CNN Architectures

Architecture   Year   Key Innovation
LeNet          1998   First CNN (MNIST)
AlexNet        2012   Deep CNN + GPUs
VGG            2014   Showed depth matters
ResNet         2015   Residual connections (skip)
Inception      2015   Multi-scale convolutions
MobileNet      2017   Lightweight for phones

Use pre-trained: In most cases, don't train a CNN from scratch. Load weights pre-trained on ImageNet and fine-tune them on your own data.

Why CNNs Work

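  1. Local connectivity: Each neuron looks only at a small patch of nearby pixels
  2. Weight sharing: The same filter is applied at every position
  3. Hierarchical learning: Layers build up from edges to textures to whole objects
  4. Translation invariance: The same object is detected regardless of position (convolution shifts its response along with the object; pooling then makes the result roughly position-independent)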
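Weight sharing and translation invariance can be seen in a toy example. In this rough sketch, one small filter slides over two images that contain the same pattern at different positions. The peak response moves with the pattern, but its strength stays the same, so taking the maximum over the whole feature map gives a position-independent detection. All names here (`conv2d_valid`, `place`, `strongest`) and the 2×2 diagonal pattern are hypothetical, chosen only for illustration.

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def place(pattern, size, top, left):
    """Put a small pattern into an otherwise empty size x size image."""
    image = [[0] * size for _ in range(size)]
    for i, row in enumerate(pattern):
        for j, value in enumerate(row):
            image[top + i][left + j] = value
    return image


def strongest(feature_map):
    """Return (value, (row, col)) of the largest response in the feature map."""
    return max((value, (r, c))
               for r, row in enumerate(feature_map)
               for c, value in enumerate(row))


pattern = [[1, 0], [0, 1]]    # a tiny diagonal stroke
kernel = [[1, -1], [-1, 1]]   # filter that responds to that diagonal

score_a, pos_a = strongest(conv2d_valid(place(pattern, 6, 1, 1), kernel))
score_b, pos_b = strongest(conv2d_valid(place(pattern, 6, 3, 2), kernel))

print(score_a, pos_a)  # 2 (1, 1)
print(score_b, pos_b)  # 2 (3, 2)
```

The same four weights are reused at every position (weight sharing), each output depends on only a 2×2 patch (local connectivity), and the maximum response is 2 in both images even though the pattern moved.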