This project is a practical comparison of two different neural network architectures for the task of classifying handwritten digits using the MNIST dataset. It demonstrates the evolution from a standard dense network (Multilayer Perceptron) to a more advanced Convolutional Neural Network (CNN).
The goal of this project is to demonstrate the practical differences in implementation, data preparation, and performance between standard dense layers and convolutional layers using TensorFlow/Keras.
The initial approach uses a classic feed-forward structure.
- Preprocessing: 2D images are "flattened" into 1D vectors of 784 pixels ($28 \times 28$).
- Architecture: Two fully connected (`Dense`) hidden layers with 512 and 256 neurons using `ReLU` activation.
- Optimizer: `RMSprop`.
- Output: 10 neurons with `Softmax` activation representing digits 0-9.
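The layer stack above could be sketched in Keras roughly as follows (a minimal illustration, not necessarily identical to the code in `MNIST-nn.py`):

```python
from tensorflow import keras

def build_mlp():
    """Dense (MLP) baseline: flattened 784-pixel input -> 512 -> 256 -> 10."""
    model = keras.Sequential([
        keras.layers.Input(shape=(784,)),              # 28*28 flattened pixels
        keras.layers.Dense(512, activation="relu"),
        keras.layers.Dense(256, activation="relu"),
        keras.layers.Dense(10, activation="softmax"),  # one neuron per digit 0-9
    ])
    model.compile(optimizer="rmsprop",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```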
The second part implements a spatial-aware architecture that preserves the 2D nature of images.
- Preprocessing: Data is reshaped to `(28, 28, 1)` to maintain spatial hierarchy.
- Architecture:
  - 3 Convolutional layers (`Conv2D`) with 32 and 64 filters to extract visual features (edges, curves).
  - Max Pooling layers (`MaxPooling2D`) for spatial reduction and shift invariance.
  - A `Flatten` operation followed by a Dense classifier head.
- Optimizer: `Adam`.
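A minimal sketch of this convolutional stack in Keras (kernel sizes and the width of the dense head are illustrative assumptions; the actual script may use different values):

```python
from tensorflow import keras

def build_cnn():
    """Conv/pool stack on (28, 28, 1) inputs, then Flatten + dense classifier."""
    model = keras.Sequential([
        keras.layers.Input(shape=(28, 28, 1)),           # keep the 2D structure
        keras.layers.Conv2D(32, (3, 3), activation="relu"),
        keras.layers.MaxPooling2D((2, 2)),               # halve spatial size
        keras.layers.Conv2D(64, (3, 3), activation="relu"),
        keras.layers.MaxPooling2D((2, 2)),
        keras.layers.Conv2D(64, (3, 3), activation="relu"),
        keras.layers.Flatten(),                          # 2D feature maps -> 1D vector
        keras.layers.Dense(64, activation="relu"),       # classifier head (width assumed)
        keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```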
| Model Architecture | Epochs | Batch Size | Test Accuracy |
|---|---|---|---|
| Dense (MLP) | 10 | 128 | ~98.1% |
| Convolutional (CNN) | 10 | 64 | ~99.2% |
Key Finding: The CNN consistently outperforms the MLP because it exploits the spatial relationships between pixels, making it more robust to distortions.
- Python 3
- TensorFlow / Keras - Deep Learning framework.
- NumPy - Mathematical operations on arrays.
- Clone the repository:
  ```bash
  git clone https://github.com/Mr-TwisT/CNN-MNIST.git
  ```
- Install dependencies:
  ```bash
  pip install tensorflow numpy
  ```
- Run the script:
  ```bash
  python MNIST-nn.py
  ```
- Data Normalization: Scaling pixel values to the [0, 1] range.
- Convolutions & Pooling: How "filters" extract features like edges and curves.
- Softmax Activation: Turning raw network outputs into interpretable percentages.
- Flattening: Bridging the gap between 2D feature maps and 1D classifiers.
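The first and third of these concepts can be demonstrated with plain NumPy, independently of the models (illustrative values only):

```python
import numpy as np

# Data normalization: raw MNIST pixels are uint8 values in [0, 255];
# dividing by 255 scales them into the [0, 1] range.
pixels = np.array([0, 51, 255], dtype=np.float32)
scaled = pixels / 255.0  # -> approximately [0.0, 0.2, 1.0]

# Softmax: turns raw output scores ("logits") into probabilities that
# sum to 1, i.e. interpretable percentages per digit class.
def softmax(logits):
    shifted = logits - logits.max()  # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))  # highest logit -> highest probability
```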

