Neural Network Basics
Neural networks are the foundational building block of deep learning. Understand layers, activation functions, backpropagation, and gradient descent before moving on to specialized architectures.
Core Concepts
| Concept | Plain-English Meaning |
|---|---|
| Neuron / Node | Takes weighted inputs, applies activation, outputs a value |
| Layer | A group of neurons: input, hidden, or output layers |
| Activation Function | Adds non-linearity (e.g. ReLU, Sigmoid, Tanh, Softmax) |
| Forward Pass | Data flows from input → layers → output prediction |
| Loss Function | Measures how wrong the prediction is (MSE, Cross-Entropy) |
| Backpropagation | Computes gradients of loss w.r.t. each weight |
| Gradient Descent | Updates weights in the direction that reduces loss |
| Batch Normalization | Normalizes layer inputs; stabilizes and speeds training |
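
To make the forward pass, loss, backpropagation, and gradient descent rows concrete, here is a minimal pure-Python sketch for a single sigmoid neuron. The function name `train_neuron`, the single training pair, and the learning rate are illustrative assumptions, not part of the original material.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(x, y, w=0.5, b=0.0, lr=0.5, steps=200):
    """Fit one sigmoid neuron to a single (x, y) pair (hypothetical helper)."""
    losses = []
    for _ in range(steps):
        # Forward pass: weighted input, then activation
        z = w * x + b
        y_hat = sigmoid(z)

        # Loss function: squared error for this one example
        loss = (y_hat - y) ** 2
        losses.append(loss)

        # Backpropagation: chain rule from the loss back to w and b
        dloss_dyhat = 2.0 * (y_hat - y)
        dyhat_dz = y_hat * (1.0 - y_hat)   # derivative of the sigmoid
        dloss_dz = dloss_dyhat * dyhat_dz
        grad_w = dloss_dz * x
        grad_b = dloss_dz

        # Gradient descent: step against the gradient
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, losses


w, b, losses = train_neuron(x=2.0, y=1.0)
print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
```

Frameworks such as PyTorch automate the backpropagation step, but the update rule is the same idea applied to every weight.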
Code Example
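```python
import torch
import torch.nn as nn


class MLP(nn.Module):
    """Multilayer perceptron with one hidden layer."""

    def __init__(self, in_dim, hidden, out_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.BatchNorm1d(hidden),      # normalize hidden activations
            nn.ReLU(),
            nn.Dropout(0.3),             # zero 30% of activations during training
            nn.Linear(hidden, out_dim),  # outputs raw logits (no softmax here)
        )

    def forward(self, x):
        return self.net(x)


# 784 inputs (e.g. flattened 28x28 images), 256 hidden units, 10 classes
model = MLP(784, 256, 10)
optim = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()  # applies log-softmax internally; expects logits
```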
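
A sketch of how the model, optimizer, and loss above might be combined into a training loop. The random batch, the batch size of 32, and the 30 steps are placeholder assumptions for illustration; a real run would iterate over a dataset (for example with a `DataLoader`).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class MLP(nn.Module):
    def __init__(self, in_dim, hidden, out_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.BatchNorm1d(hidden),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(hidden, out_dim),
        )

    def forward(self, x):
        return self.net(x)


model = MLP(784, 256, 10)
optim = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Placeholder batch (assumption): 32 random inputs with random class labels
x = torch.randn(32, 784)
y = torch.randint(0, 10, (32,))

model.train()                      # enable dropout and batch-stat BatchNorm
losses = []
for step in range(30):
    optim.zero_grad()              # clear gradients from the previous step
    logits = model(x)              # forward pass
    loss = loss_fn(logits, y)      # cross-entropy on raw logits
    loss.backward()                # backpropagation
    optim.step()                   # gradient-based weight update (Adam)
    losses.append(loss.item())

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

Repeating the step on one fixed batch only shows that the loss goes down; it says nothing about generalization.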
Watch-outs
- Vanishing gradients in deep networks: prefer ReLU over Sigmoid for hidden layers
- Normalize input features (e.g. to zero mean and unit variance) before feeding them into a neural network
- Dropout is only active during training: call model.eval() before inference (this also switches BatchNorm to its running statistics)
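
The three watch-outs can be checked with a short sketch. Everything here is illustrative: the helper `input_grad`, the placeholder data, and the small model are assumptions, and the gradient comparison stacks bare activations without weights, so it shows only the activation's contribution to vanishing gradients.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


# 1. Vanishing gradients: chain 10 activations and inspect the input gradient
def input_grad(act, depth=10):
    """Gradient of `depth` stacked activations w.r.t. the input (hypothetical helper)."""
    x = torch.tensor(1.0, requires_grad=True)
    h = x
    for _ in range(depth):
        h = act(h)
    h.backward()
    return x.grad.item()


sig_grad = input_grad(torch.sigmoid)  # each sigmoid derivative is at most 0.25
relu_grad = input_grad(torch.relu)    # ReLU derivative is 1 for positive inputs

# 2. Input normalization: standardize with statistics from the training data
x_train = torch.randn(100, 4) * 50.0 + 10.0   # placeholder data on a large scale
mean = x_train.mean(dim=0, keepdim=True)
std = x_train.std(dim=0, keepdim=True)
x_norm = (x_train - mean) / (std + 1e-8)       # reuse mean/std at inference time

# 3. Dropout: random in train mode, identity in eval mode
drop = nn.Dropout(0.3)
ones = torch.ones(1, 1000)

drop.train()
train_out = drop(ones)   # some entries zeroed, survivors scaled by 1 / (1 - 0.3)

drop.eval()
eval_out = drop(ones)    # passes through unchanged

# Inference on a small model: eval mode plus no_grad gives repeatable outputs
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Dropout(0.3), nn.Linear(8, 2))
model.eval()
with torch.no_grad():
    pred_a = model(x_norm[:5])
    pred_b = model(x_norm[:5])

print(f"sigmoid grad {sig_grad:.2e}, relu grad {relu_grad:.1f}")
print("eval outputs identical:", torch.equal(pred_a, pred_b))
```

Call `model.train()` again before resuming training, since `model.eval()` stays in effect until it is switched back.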