Forward & Backpropagation — How Networks Learn

Backpropagation & Gradient Descent

Backpropagation (The Learning Algorithm)

Goal: Find weights that minimize loss.

Strategy: Compute gradients (how much to adjust each weight) and update:

1. Forward pass: compute loss
2. Backward pass: compute dL/dW for each weight
3. Update: W_new = W_old - learning_rate × dL/dW
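
As a first taste, here is that three-step recipe for a single weight and a single training example. The numbers are made up and the "network" is just a plain linear prediction (no activation) so the arithmetic stays readable; it is a sketch, not a full implementation.

# One training step for a single weight w on one example (made-up numbers)
x, y_true = 2.0, 10.0            # input and target
w = 3.0                          # current weight
learning_rate = 0.01

# 1. Forward pass: prediction and loss
y_pred = w * x                   # 6.0
loss = (y_pred - y_true) ** 2    # 16.0

# 2. Backward pass: dL/dw via the chain rule
dL_dw = 2 * (y_pred - y_true) * x   # 2 * (-4.0) * 2.0 = -16.0

# 3. Update: step against the gradient
w = w - learning_rate * dL_dw    # 3.0 - 0.01 * (-16.0) = 3.16, closer to the ideal 5.0
print(w)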

The Chain Rule (Calculus)

To find dL/dW2 (the gradient of the loss with respect to the output-layer weights), use the chain rule:

dL/dW2 = (dL/da2) × (da2/dz2) × (dz2/dW2)

Where:

  • dL/da2 = how much does the loss depend on the output a2?
  • da2/dz2 = how much does the output depend on the pre-activation z2?
  • dz2/dW2 = how much does the pre-activation depend on the weights W2?
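
To see the chain rule in action, here is a minimal numeric sketch for a single output neuron. The sigmoid activation, squared-error loss, and all the numbers are assumptions made purely for this example (use whatever your network actually uses); it computes dL/dW2 factor by factor and checks the result against a finite-difference estimate.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

a1 = np.array([0.5, 0.8])          # activations from the previous layer (made up)
W2 = np.array([0.3, -0.2])         # output-layer weights (made up)
y  = 1.0                           # target

def loss(W):
    z2 = W @ a1                    # pre-activation
    a2 = sigmoid(z2)               # output
    return (a2 - y) ** 2           # squared-error loss (assumed for the sketch)

# Chain rule: dL/dW2 = dL/da2 × da2/dz2 × dz2/dW2
z2 = W2 @ a1
a2 = sigmoid(z2)
dL_da2  = 2 * (a2 - y)             # derivative of the loss w.r.t. the output
da2_dz2 = a2 * (1 - a2)            # derivative of the sigmoid
dz2_dW2 = a1                       # pre-activation is linear in the weights
grad = dL_da2 * da2_dz2 * dz2_dW2

# Finite-difference check: nudge each weight a little and watch the loss change
eps = 1e-6
numeric = np.array([
    (loss(W2 + eps * np.eye(2)[i]) - loss(W2 - eps * np.eye(2)[i])) / (2 * eps)
    for i in range(2)
])
print(grad)     # analytic gradient from the chain rule
print(numeric)  # nearly identical numeric estimate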

Gradient Descent

Update rule:

W := W - α × ∇W Loss

Where:
- α = learning rate (how big a step to take)
- ∇W Loss = gradient of the loss with respect to W (computed by backprop)

Learning rate choices (illustrated in the sketch after this list):

  • Too high (e.g., α = 1.0): overshoots, may diverge, unstable
  • Too low (e.g., α = 0.00001): learns very slowly
  • Just right (e.g., α = 0.01): stable, fast learning
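
The effect is easy to see on a toy problem. The quadratic loss below is an assumption made purely for illustration (it is not the network's loss); gradient descent on it blows up with a large step size and crawls with a tiny one.

# Toy loss L(w) = 2 * (w - 3)^2 with gradient dL/dw = 4 * (w - 3); minimum at w = 3
def grad(w):
    return 4.0 * (w - 3.0)

for lr in (1.0, 0.00001, 0.01):        # too high, too low, reasonable
    w = 0.0
    for step in range(100):
        w = w - lr * grad(w)           # gradient descent update
    print(f"lr={lr}: w after 100 steps = {w:.4g}")
# lr=1.0 blows up, lr=0.00001 barely moves, lr=0.01 ends near 3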

Example Update

dL/dW1 = 0.05   (gradient for weight 1)
α = 0.01        (learning rate)

W1_old = 0.3
W1_new = 0.3 - 0.01 × 0.05 = 0.3 - 0.0005 = 0.2995

W1 moved slightly in the direction that reduces the loss!
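
The same update works element-wise on a whole weight matrix at once. This NumPy sketch uses made-up values for the weights and gradients; the top-left entry reproduces the arithmetic above.

import numpy as np

learning_rate = 0.01
W1 = np.array([[0.3, -0.1],
               [0.2,  0.4]])          # example weight matrix (made-up values)
dL_dW1 = np.array([[0.05, -0.02],
                   [0.01,  0.03]])    # gradients from backprop (made-up values)

W1 = W1 - learning_rate * dL_dW1      # every weight takes a small step downhill
print(W1)                             # top-left entry is 0.2995, as in the example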

Repeat This Process

for epoch in range(1000):
    # Forward: compute loss
    predictions = network.forward(X)
    loss = compute_loss(y, predictions)

    # Backward: compute gradients
    gradients = network.backward()

    # Update: move in the negative gradient direction
    network.update_weights(gradients, learning_rate=0.01)

# After ~1000 iterations: weights converge to good values!

This is how all neural networks learn! 🧠
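
The loop above is pseudocode: network, compute_loss, X, and y stand in for whatever your framework provides. Below is a minimal self-contained sketch of the whole page in NumPy. The 2-4-1 architecture, sigmoid activations, mean-squared-error loss, toy OR-style dataset, and learning rate are all assumptions chosen just to keep the example short and runnable.

import numpy as np

rng = np.random.default_rng(0)

# Made-up toy data: 2 binary inputs -> OR of the inputs
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [1.]])

# Tiny 2-layer network: 2 -> 4 -> 1, sigmoid activations (assumed for the sketch)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

learning_rate = 0.5   # works for this tiny toy problem; good values are problem-dependent

for epoch in range(10000):
    # Forward pass: compute predictions and the loss
    z1 = X @ W1 + b1
    a1 = sigmoid(z1)
    z2 = a1 @ W2 + b2
    a2 = sigmoid(z2)
    loss = np.mean((a2 - y) ** 2)

    # Backward pass: chain rule, layer by layer
    dL_da2 = 2 * (a2 - y) / len(X)        # dL/da2
    dL_dz2 = dL_da2 * a2 * (1 - a2)       # × da2/dz2 (sigmoid derivative)
    dL_dW2 = a1.T @ dL_dz2                # × dz2/dW2
    dL_db2 = dL_dz2.sum(axis=0)
    dL_da1 = dL_dz2 @ W2.T                # push the gradient back to layer 1
    dL_dz1 = dL_da1 * a1 * (1 - a1)
    dL_dW1 = X.T @ dL_dz1
    dL_db1 = dL_dz1.sum(axis=0)

    # Update: step each weight against its gradient
    W1 -= learning_rate * dL_dW1
    b1 -= learning_rate * dL_db1
    W2 -= learning_rate * dL_dW2
    b2 -= learning_rate * dL_db2

    if epoch % 2000 == 0:
        print(f"epoch {epoch}: loss = {loss:.4f}")

# Predictions should now be close to y
print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2), 2))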
