Introduction to Neural Networks
The Perceptron & Layers
Neural Networks Basics
When Deep Learning > Classical ML
| Dataset Type | Typical Strong Choice |
|---|---|
| Tabular (< 1M rows) | XGBoost, Random Forest |
| Tabular (> 1M rows) | Deep Neural Network |
| Images | Convolutional Neural Network (CNN) |
| Text | Transformer, RNN |
| Time Series | LSTM, Transformer |
Neural Network Advantages:
- Handles unstructured data (images, text)
- Finds complex non-linear patterns
- Scales well with data
Disadvantages:
- Usually needs a lot of data (as a rough rule of thumb, 10,000+ samples)
- Slow to train
- Hard to interpret ("black box")
- Hyperparameter tuning is complex
The Perceptron
The simplest neural network: a single neuron.
Input: X = [x1, x2, x3]
Weights: W = [w1, w2, w3]
Bias: b
Output = Activation(X·W + b)
The activation function (e.g. sigmoid or ReLU) introduces non-linearity. Without it, any stack of layers would collapse into a single linear transformation.
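As a minimal sketch of the formula above, the single neuron can be written in plain Python. The function name `perceptron` and the example values for `x`, `w`, and `b` are illustrative assumptions, not part of the lesson.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def perceptron(x, w, b, activation=sigmoid):
    """Compute Activation(X·W + b) for one neuron."""
    # Weighted sum of the inputs plus the bias
    z = sum(xi * wi for xi, wi in zip(x, w)) + b
    return activation(z)

# Hypothetical example values, chosen only for illustration
x = [1.0, 2.0, 3.0]      # inputs x1, x2, x3
w = [0.5, -0.25, 0.1]    # weights w1, w2, w3
b = 0.2                  # bias

output = perceptron(x, w, b)
print(output)
```

Swapping `activation` for another function (for instance ReLU) changes the neuron's behaviour without touching the weighted sum.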
Layers & Architecture
Input Layer (10 features)
↓
Hidden Layer 1 (64 neurons)
↓
Hidden Layer 2 (32 neurons)
↓
Output Layer (1 neuron → probability)
Each layer transforms its input, and deeper layers tend to learn increasingly abstract features. In an image network, for example:
- Early layers: Simple patterns (edges)
- Middle layers: Combinations (shapes)
- Later layers: Complex concepts (objects)
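The 10 → 64 → 32 → 1 architecture above can be sketched as a forward pass through three fully connected layers. This is only an illustration: the helper names `make_layer` and `dense`, the random initial weights, and the choice of ReLU in the hidden layers are assumptions, and the network is untrained.

```python
import math
import random

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_layer(n_in, n_out, rng):
    """Create one layer: n_out neurons, each with n_in small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(inputs, layer, activation):
    """Apply every neuron in the layer to the same input vector."""
    weights, biases = layer
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

rng = random.Random(0)                 # fixed seed so runs are repeatable
hidden1 = make_layer(10, 64, rng)      # Input (10 features) -> Hidden Layer 1
hidden2 = make_layer(64, 32, rng)      # Hidden Layer 1 -> Hidden Layer 2
out_layer = make_layer(32, 1, rng)     # Hidden Layer 2 -> Output (1 neuron)

features = [rng.random() for _ in range(10)]   # made-up input sample
h1 = dense(features, hidden1, relu)
h2 = dense(h1, hidden2, relu)
prob = dense(h2, out_layer, sigmoid)[0]        # sigmoid keeps it in (0, 1)

print(len(h1), len(h2), prob)
```

Each call to `dense` is the perceptron formula repeated once per neuron, so the whole network is a chain of weighted sums and activations.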
Backpropagation
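How neural networks learn:
- Forward pass: Predict the output
- Calculate loss: How wrong was the prediction?
- Backward pass: Compute the gradient of the loss with respect to every weight using the chain rule
- Update weights: Move each weight a small step against its gradient (gradient descent)
Backpropagation is the efficient way to compute those gradients; gradient descent then uses them to update every weight in the network.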
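The four steps can be sketched on the smallest possible case: a single sigmoid neuron trained with a squared-error loss. The OR-gate dataset, the learning rate `lr`, and the epoch count are assumptions made for this illustration; a real network repeats the same chain-rule step for every layer.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy dataset (assumed for illustration): the logical OR of two inputs
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

w = [0.0, 0.0]   # weights start at zero
b = 0.0          # bias
lr = 0.5         # learning rate (hypothetical value)

def mean_loss(weights, bias):
    """Average squared error over the dataset."""
    total = 0.0
    for x, y in data:
        p = sigmoid(weights[0] * x[0] + weights[1] * x[1] + bias)
        total += (p - y) ** 2
    return total / len(data)

initial_loss = mean_loss(w, b)

for epoch in range(2000):
    for x, y in data:
        # 1. Forward pass: predict output
        z = w[0] * x[0] + w[1] * x[1] + b
        p = sigmoid(z)
        # 2. Loss is (p - y)^2; its derivative with respect to p:
        dloss_dp = 2.0 * (p - y)
        # 3. Backward pass: chain rule through the sigmoid
        dp_dz = p * (1.0 - p)
        dloss_dz = dloss_dp * dp_dz
        # 4. Update: dz/dw_i = x_i and dz/db = 1
        w = [wi - lr * dloss_dz * xi for wi, xi in zip(w, x)]
        b = b - lr * dloss_dz

final_loss = mean_loss(w, b)
print(initial_loss, final_loss)
```

Comparing `initial_loss` with `final_loss` shows whether the updates actually reduced the error.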
Activation Functions
ReLU (Rectified Linear Unit)
f(x) = max(0, x)
- Pros: Cheap to compute, mitigates the vanishing gradient problem (gradient is 1 for positive inputs)
- Cons: Dead neurons (a neuron whose output is stuck at 0 gets zero gradient and stops learning)
- Use: Hidden layers
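A short sketch of ReLU and its gradient makes the "dead neuron" issue concrete. The helper name `relu_grad` and the convention of using gradient 0 at exactly x = 0 are assumptions for this example.

```python
def relu(x):
    """f(x) = max(0, x)"""
    return max(0.0, x)

def relu_grad(x):
    """Gradient of ReLU: 1 for positive inputs, 0 otherwise.
    Using 0 at x == 0 is a convention assumed here."""
    return 1.0 if x > 0 else 0.0

inputs = [-2.0, -0.5, 0.0, 0.5, 2.0]
outputs = [relu(x) for x in inputs]
grads = [relu_grad(x) for x in inputs]

print(outputs)
print(grads)
```

Wherever the gradient is 0, no error signal flows back through that neuron, so a neuron that only ever sees negative inputs receives no updates.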
Sigmoid
f(x) = 1 / (1 + e^(-x))   # Output between 0 and 1
- Pros: Probabilistic output
- Cons: Costlier to compute than ReLU, and saturates for large |x| so gradients vanish
- Use: Output layer for binary classification
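To illustrate why sigmoid gradients vanish, the sketch below evaluates the function and its derivative s(x) · (1 − s(x)) at a few points. The branch on the sign of `x` is a common trick to avoid overflow for very negative inputs; the helper name `sigmoid_grad` and the sample points are assumptions for this example.

```python
import math

def sigmoid(x):
    """1 / (1 + e^(-x)), written to avoid overflow for very negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def sigmoid_grad(x):
    """Derivative of sigmoid: s * (1 - s), largest at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)

for x in [0.0, 2.0, 5.0, 10.0]:
    print(x, sigmoid(x), sigmoid_grad(x))
```

The gradient peaks at 0.25 when x = 0 and shrinks rapidly as |x| grows, which is why deep stacks of sigmoid layers can learn slowly.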
Softmax
f(x_i) = e^(x_i) / Σ_j e^(x_j)
Converts a vector of scores into a probability distribution (values sum to 1)
- Use: Output layer for multi-class classification
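A minimal softmax sketch, assuming the scores arrive as a plain Python list. Subtracting the maximum score before exponentiating is a standard numerical-stability trick that leaves the result unchanged; the example scores are made up.

```python
import math

def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(scores)                              # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])   # hypothetical class scores
print(probs, sum(probs))
```

The largest score always receives the largest probability, so picking the most likely class is the same as picking the highest score.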