Phase 1: The Atomic Era (Primitives)

Goal: Master numerical stability and gradient flow.

These are the building blocks from which every neural network is made:

Modules

| Module | Status | Description |
| --- | --- | --- |
| Linear Layer | 🔲 | Weight initialization, forward pass |
| Activations | 🔲 | ReLU, GELU, SiLU, SwiGLU |
| Loss Functions | 🔲 | MSE, CrossEntropy, LogSumExp trick |
| Normalization | 🔲 | BatchNorm, LayerNorm, RMSNorm, GroupNorm |
| Regularization | 🔲 | Dropout, L1/L2 penalty |
| Optimizers | 🔲 | SGD, Momentum, Adam, AdamW |
| LR Schedulers | 🔲 | Warmup, StepLR, CosineAnnealing |
| Gradient Clipping | 🔲 | Norm and value clipping |
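The LogSumExp trick mentioned in the Loss Functions row is the key numerical-stability idea for this phase: shifting logits by their maximum before exponentiating keeps `exp()` from overflowing without changing the result. A minimal NumPy sketch (function names here are illustrative, not part of any existing module):

```python
import numpy as np

def log_softmax(logits):
    # LogSumExp trick: log(sum exp(x)) = m + log(sum exp(x - m)), with m = max(x).
    # Subtracting the max makes every exponent <= 0, so exp() cannot overflow.
    m = logits.max(axis=-1, keepdims=True)
    shifted = logits - m
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def cross_entropy(logits, targets):
    # Mean negative log-likelihood of the target classes.
    log_probs = log_softmax(logits)
    return -log_probs[np.arange(len(targets)), targets].mean()

# Logits this large would overflow a naive softmax (exp(1000) = inf).
logits = np.array([[1000.0, 1000.0], [0.0, 1000.0]])
loss = cross_entropy(logits, np.array([0, 1]))
```

A naive `np.exp(logits) / np.exp(logits).sum()` on these inputs returns `nan`; the shifted version stays finite, which is exactly what the capstone's numerical comparison should confirm.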

🎯 Capstone

Train an MLP on MNIST using only your own implementations. Compare outputs and gradients numerically against the torch.nn equivalents.
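Before the torch.nn comparison, a cheap way to validate a hand-written backward pass is a central finite-difference gradient check. The sketch below assumes a plain NumPy linear layer with a toy quadratic loss; the analytic gradient and the setup are illustrative, not a prescribed API:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3))   # toy batch: 4 samples, 3 features
W = rng.standard_normal((3, 2))   # linear-layer weights

def loss_fn(W):
    # Toy scalar loss: mean of squared outputs of the linear layer.
    return ((x @ W) ** 2).mean()

# Analytic gradient: d/dW mean((xW)^2) = 2 x^T (xW) / N, with N = output count.
grad = 2 * x.T @ (x @ W) / (x @ W).size

# Central finite differences as the numerical reference.
eps = 1e-6
num_grad = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        Wp, Wm = W.copy(), W.copy()
        Wp[i, j] += eps
        Wm[i, j] -= eps
        num_grad[i, j] = (loss_fn(Wp) - loss_fn(Wm)) / (2 * eps)

max_err = np.abs(grad - num_grad).max()
```

If `max_err` is not tiny, the backward pass is wrong; running the same check against each module listed above catches most bugs before any MNIST training.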