A from-scratch NumPy implementation of a multilayer perceptron trained on MNIST, built to understand backpropagation at the mathematical level.
Accuracy: ~93% on the MNIST test set
The neural network is trained using backpropagation and batch gradient descent, without relying on high-level frameworks that hide the math. TensorFlow is used only to load the MNIST dataset. The code prioritizes readability over performance.
This series by 3b1b on neural networks was the inspiration for this project.
| Layer | Details |
|---|---|
| Input | 784 neurons (28×28 flattened image) |
| Hidden | 3 hidden layers, 20 neurons each |
| Output | 10 neurons, output values in range [0,1] |
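To make the table concrete, here is a minimal forward-pass sketch for this 784 → 20 → 20 → 20 → 10 architecture. The variable names and the random initialization scheme are illustrative, not necessarily those used in the repository.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Layer sizes from the table above: 784 -> 20 -> 20 -> 20 -> 10.
layer_sizes = [784, 20, 20, 20, 10]

# Illustrative random initialization (the repository's exact scheme may differ).
rng = np.random.default_rng(0)
weights = [rng.standard_normal((n_out, n_in))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros((n_out, 1)) for n_out in layer_sizes[1:]]

def forward(x):
    """Propagate one flattened 784x1 image through all layers."""
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)
    return a  # 10x1 vector of sigmoid outputs in [0, 1]
```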
Make a virtual environment, activate it, and install the requirements listed in requirements.txt:
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
Run train.py. It trains the network and saves the learned parameters (weights and biases) to brain.npz
(the MNIST dataset is downloaded automatically via TensorFlow).
After training, run usage.py to compute the test accuracy, a confusion matrix, and sample predictions.
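As a rough illustration of what usage.py does, the sketch below loads the saved parameters and classifies one image. The archive key names ("W0", "b0", ...) are assumptions; the actual names are whatever train.py passes to np.savez.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Load saved parameters; key names here are assumed, check train.py for the real ones.
params = np.load("brain.npz")
n_layers = len(params.files) // 2
weights = [params[f"W{i}"] for i in range(n_layers)]
biases = [params[f"b{i}"] for i in range(n_layers)]

def predict(x):
    """Return the predicted digit for one flattened, normalized 784x1 image."""
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)
    return int(np.argmax(a))
```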
Weights and biases are randomly initialized. The sigmoid function is used as the activation for all layers. The MNIST data is loaded as NumPy arrays, then flattened and normalized. The cost is computed with Mean Squared Error (MSE).
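A short sketch of the preprocessing and cost described above, assuming the standard TensorFlow/Keras MNIST loader; the repository's actual code may organize these steps differently.

```python
import numpy as np
from tensorflow.keras.datasets import mnist

# Load MNIST via TensorFlow (its only role here), then flatten each 28x28 image
# into a 784-vector and normalize pixel values to [0, 1].
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(-1, 784).astype(np.float64) / 255.0
x_test = x_test.reshape(-1, 784).astype(np.float64) / 255.0

# One-hot encode labels so they line up with the 10 output neurons.
y_train_onehot = np.eye(10)[y_train]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    """Derivative of the sigmoid, needed in the backward pass."""
    s = sigmoid(z)
    return s * (1.0 - s)

def mse(prediction, target):
    """Mean Squared Error cost over the 10 output neurons."""
    return np.mean((prediction - target) ** 2)
```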
- Uses sigmoid activations instead of ReLU.
- Uses MSE loss instead of cross-entropy.
- Not optimized for performance.
For notes on the calculus (derivatives, chain rule, and cost functions) used in this project, please refer to the docs/ folder.
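As a pointer to what those notes cover, the standard backpropagation equations for an MSE cost with sigmoid activations are summarized below (notation is mine, assuming $z^l = W^l a^{l-1} + b^l$ and $a^l = \sigma(z^l)$); see docs/ for the full derivations.

```latex
% Output-layer error, propagated error, and parameter gradients (chain rule):
\delta^L = \frac{2}{n}\,(a^L - y) \odot \sigma'(z^L), \qquad
\delta^l = \big((W^{l+1})^\top \delta^{l+1}\big) \odot \sigma'(z^l),
\qquad
\frac{\partial C}{\partial W^l} = \delta^l \,(a^{l-1})^\top, \qquad
\frac{\partial C}{\partial b^l} = \delta^l .
```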


