Module

Deep Learning & Neural Networks

Convolutional Neural Networks (CNN) — Image Processing · Page 2 of 2

Pooling, Flattening & CNN Architecture

32 min Intermediate

Pooling (Dimensionality Reduction)

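After convolution, feature maps are still large. Pooling shrinks their height and width (the number of channels is unchanged) while keeping a summary of each local region, which cuts computation in later layers.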

Max Pooling

Take the maximum value in each window:

Input (4×4):
[1 2 | 3 4]
[5 6 | 7 8]
-----+-----
[9 10| 11 12]
[13 14| 15 16]

Max Pool (2×2 window, stride 2):
[6  8]     ← Max of [1,2,5,6], [3,4,7,8], etc.
[14 16]

Why? The maximum keeps the strongest activation in each window, so a feature detected anywhere in the window survives the downsampling.
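The 4×4 example above can be reproduced with a short plain-Python sketch. The function name `max_pool_2d` and its `size`/`stride` parameters are illustrative choices, not part of any library; in practice a framework's pooling layer would do this work.

```python
def max_pool_2d(feature_map, size=2, stride=2):
    """Max-pool a 2D list of numbers using a square window.

    `size` is the window side length and `stride` is how far the
    window moves between outputs (both are illustrative parameters).
    """
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        pooled_row = []
        for c in range(0, cols - size + 1, stride):
            # Collect every value covered by the window at (r, c).
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

print(max_pool_2d(feature_map))  # [[6, 8], [14, 16]]
```

With `size=2` and `stride=2` the windows do not overlap, so a 4×4 map becomes 2×2. A smaller stride makes the windows overlap and shrinks the map less.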

Average Pooling

Take the average of each window instead of the maximum:

Input (4×4):
[1 2 | 3 4]
[5 6 | 7 8]
-----+-----
[9 10| 11 12]
[13 14| 15 16]

Average Pool (2×2 window):
[3.5  5.5]     ← Average of [1,2,5,6], [3,4,7,8], etc.
[11.5 13.5]
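As a quick check of the arithmetic, here is a minimal plain-Python sketch of average pooling on the same 4×4 input. The name `avg_pool_2d` is hypothetical. For instance, the top-right window [3, 4, 7, 8] sums to 22, so its average is 5.5.

```python
def avg_pool_2d(feature_map, size=2, stride=2):
    """Average-pool a 2D list of numbers using a square window."""
    rows, cols = len(feature_map), len(feature_map[0])
    area = size * size
    return [
        [
            # Mean of the size x size window whose top-left corner is (r, c).
            sum(feature_map[r + i][c + j]
                for i in range(size) for j in range(size)) / area
            for c in range(0, cols - size + 1, stride)
        ]
        for r in range(0, rows - size + 1, stride)
    ]


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

print(avg_pool_2d(feature_map))  # [[3.5, 5.5], [11.5, 13.5]]
```

Unlike max pooling, every value in the window contributes to the output, so average pooling smooths the feature map rather than keeping only the peak response.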

Full CNN Architecture

Input Image (224×224×3)
      ↓
Conv (64 filters, 3×3, same padding, ReLU) → 224×224×64
      ↓
MaxPool (2×2) → 112×112×64  (reduced!)
      ↓
Conv (128 filters) → 112×112×128
      ↓
MaxPool (2×2) → 56×56×128
      ↓
Conv (256 filters) → 56×56×256
      ↓
MaxPool (2×2) → 28×28×256
      ↓
Flatten → 200,704 values
      ↓
Dense (512, ReLU) → 512
      ↓
Dropout (0.5) → Randomly zero half the activations (training only)
      ↓
Dense (1000, Softmax) → Final output
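The shapes in the diagram follow from the standard output-size formula, floor((n + 2p − k) / s) + 1. The sketch below traces them and counts parameters per layer. It assumes every convolution uses a 3×3 kernel with stride 1 and padding 1 (the diagram only states 3×3 for the first one), and the helper names `conv_out`, `pool_out`, and `trace_shapes` are made up for illustration.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Output height/width of a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window=2, stride=2):
    """Output height/width of a pooling layer without padding."""
    return (size - window) // stride + 1


def trace_shapes(height=224, width=224, channels=3,
                 conv_filters=(64, 128, 256), dense_units=(512, 1000)):
    """Return (layer name, output shape, parameter count) for each layer."""
    shape = (height, width, channels)
    layers = [("Input", shape, 0)]
    for filters in conv_filters:
        h, w, c_in = shape
        # Assumed 3x3 kernels: weights per filter = 3 * 3 * c_in, plus one bias.
        params = 3 * 3 * c_in * filters + filters
        shape = (conv_out(h), conv_out(w), filters)
        layers.append((f"Conv {filters}", shape, params))
        shape = (pool_out(shape[0]), pool_out(shape[1]), filters)
        layers.append(("MaxPool", shape, 0))  # pooling has no weights
    units_in = shape[0] * shape[1] * shape[2]
    layers.append(("Flatten", (units_in,), 0))
    for units in dense_units:
        # Fully connected: one weight per input-output pair, plus biases.
        layers.append((f"Dense {units}", (units,), units_in * units + units))
        units_in = units
    return layers


for name, shape, params in trace_shapes():
    print(f"{name:<12}{str(shape):<18}{params:>12,}")
```

Under these assumptions, the first dense layer holds the large majority of the weights (about 102.8 million out of roughly 103.6 million), because it connects all 200,704 flattened values to 512 units. Dropout is omitted from the trace since it changes neither shape nor parameter count.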

Common CNN Architectures

Architecture	Year	Key Innovation
LeNet	1998	Early CNN for digit recognition (MNIST)
AlexNet	2012	Deep CNN + GPUs
VGG	2014	Showed depth matters
Inception (GoogLeNet)	2014	Multi-scale convolutions
ResNet	2015	Residual connections (skip)
MobileNet	2017	Lightweight for phones

Use pre-trained weights: You rarely need to train a CNN from scratch. For most tasks, load weights pre-trained on ImageNet and fine-tune them on your own data.

Why CNNs Work

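Local connectivity: Each neuron connects only to a small patch of nearby pixels (its receptive field)
Weight sharing: The same filter is applied at every position, so the parameter count does not grow with image size
Hierarchical learning: Early layers detect edges, deeper layers combine them into textures, parts, and objects
Translation invariance: A filter responds the same way wherever a pattern appears, and pooling makes the result largely insensitive to small shifts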
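Two of these properties are easy to check numerically. The sketch below is a simplified 1D illustration (the signals, the `edge_filter` values, and the `correlate_1d` helper are all made up for the example): it compares parameter counts for a dense layer and a convolutional layer on a 224×224×3 input, then shows that shifting a pattern shifts the filter response without changing its peak.

```python
# Weight sharing: parameters needed to produce 64 outputs/feature maps.
inputs = 224 * 224 * 3
dense_params = inputs * 64 + 64       # every pixel wired to every unit
conv_params = 3 * 3 * 3 * 64 + 64     # 64 shared 3x3 filters over 3 channels
print(f"dense: {dense_params:,}  conv: {conv_params:,}")


def correlate_1d(signal, kernel):
    """Slide one shared kernel across the signal (valid positions only)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


edge_filter = [-1, 1]  # positive response where the signal steps up

signal = [0, 0, 1, 1, 0, 0, 0, 0]
shifted = [0, 0, 0, 0, 0, 1, 1, 0]  # same bump, moved 3 positions right

response = correlate_1d(signal, edge_filter)
response_shifted = correlate_1d(shifted, edge_filter)

print(response)          # [0, 1, 0, -1, 0, 0, 0]
print(response_shifted)  # [0, 0, 0, 0, 1, 0, -1]

# Taking the maximum over all positions (global max pooling) removes
# the position entirely: both inputs give the same peak response.
print(max(response), max(response_shifted))  # 1 1
```

The filter output moves along with the input, and only the pooling step discards where the pattern was found. That combination is what lets a CNN recognize the same object at different positions.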
