This project implements a neural network that classifies handwritten digits from the MNIST dataset. The network is trained with backpropagation, using SciPy's L-BFGS-B optimizer to minimize a regularized cost function.
- `main.py`: The main script that loads the data, trains the neural network, and evaluates its performance.
- `mnist-original.mat`: The dataset file containing the MNIST data.
- `Model.py`: The implementation of the neural network and its cost function.
- `Prediction.py`: The function for making predictions with the trained network.
- `RandInitialise.py`: The function for randomly initializing the network's weights.
- `Theta1.txt` and `Theta2.txt`: Files in which the trained weights are saved.
- Clone the repository:

  ```bash
  git clone https://github.com/vijayn7/Digit_Classifier.git
  cd Digit_Classifier
  ```
- Install the required dependencies:

  ```bash
  pip install numpy scipy matplotlib
  ```
- Load the MNIST dataset and preprocess the data:

  ```python
  from scipy.io import loadmat

  data = loadmat('mnist-original.mat')
  X = data['data'].transpose() / 255  # one row per image, pixels scaled to [0, 1]
  y = data['label'].flatten()
  ```
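  A quick sanity check after loading (MNIST contains 70,000 images of 28 × 28 = 784 pixels, so these are the expected shapes):

  ```python
  print(X.shape)           # expected: (70000, 784)
  print(y.shape)           # expected: (70000,)
  print(X.min(), X.max())  # pixel values scaled to [0, 1]
  ```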
- Split the data into training and testing sets:

  ```python
  X_train = X[:60000, :]
  y_train = y[:60000]
  X_test = X[60000:, :]
  y_test = y[60000:]
  ```
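  If `mnist-original.mat` stores its examples in a fixed order, slicing it directly can bias the two sets; a minimal sketch of shuffling before taking the split above (the seed is arbitrary):

  ```python
  import numpy as np

  # Shuffle the full dataset with a fixed seed, then re-apply the 60k/10k split
  rng = np.random.default_rng(seed=0)
  perm = rng.permutation(X.shape[0])
  X, y = X[perm], y[perm]
  X_train, y_train = X[:60000], y[:60000]
  X_test, y_test = X[60000:], y[60000:]
  ```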
- Initialize the neural network parameters:

  ```python
  import numpy as np
  from RandInitialise import initialise

  input_layer_size = 784   # 28 x 28 input images
  hidden_layer_size = 100
  num_labels = 10          # digits 0-9

  initial_Theta1 = initialise(hidden_layer_size, input_layer_size)
  initial_Theta2 = initialise(num_labels, hidden_layer_size)

  # Unroll both weight matrices into a single parameter vector
  initial_nn_params = np.concatenate((initial_Theta1.flatten(),
                                      initial_Theta2.flatten()))
  ```
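  `RandInitialise.py` is not reproduced in this README; the sketch below shows one plausible implementation of `initialise`. The uniform-interval scheme and the epsilon value are assumptions; the `(rows, columns + 1)` shape, with the extra column for the bias weights, is implied by the reshape calls in the evaluation step:

  ```python
  import numpy as np

  def initialise(fan_out, fan_in, epsilon=0.15):
      # Uniform random weights in [-epsilon, epsilon]; the extra column holds
      # the bias weights. Small random values break the symmetry that an
      # all-zero initialization would leave between hidden units.
      return np.random.rand(fan_out, fan_in + 1) * 2 * epsilon - epsilon
  ```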
- Train the neural network:

  ```python
  from scipy.optimize import minimize
  from Model import neural_network

  lambda_reg = 0.1  # regularization strength
  maxiter = 100

  myargs = (input_layer_size, hidden_layer_size, num_labels,
            X_train, y_train, lambda_reg)
  results = minimize(neural_network, x0=initial_nn_params, args=myargs,
                     options={'disp': True, 'maxiter': maxiter},
                     method="L-BFGS-B", jac=True)
  nn_params = results["x"]
  ```
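  Because `minimize` is called with `jac=True`, `neural_network` must return both the cost and its gradient as a single flat vector. `Model.py` is not reproduced here; the following is a minimal sketch of such a cost function, assuming sigmoid activations, one-hot encoded labels, and L2 regularization that skips the bias weights (the argument order matches `myargs` above):

  ```python
  import numpy as np

  def sigmoid(z):
      return 1 / (1 + np.exp(-z))

  def neural_network(nn_params, input_layer_size, hidden_layer_size,
                     num_labels, X, y, lamb):
      # Recover the two weight matrices from the flat parameter vector
      split = hidden_layer_size * (input_layer_size + 1)
      Theta1 = nn_params[:split].reshape(hidden_layer_size, input_layer_size + 1)
      Theta2 = nn_params[split:].reshape(num_labels, hidden_layer_size + 1)
      m = X.shape[0]

      # Forward propagation, prepending a bias column at each layer
      a1 = np.hstack([np.ones((m, 1)), X])
      z2 = a1 @ Theta1.T
      a2 = np.hstack([np.ones((m, 1)), sigmoid(z2)])
      a3 = sigmoid(a2 @ Theta2.T)

      # One-hot encode the labels (digits 0-9)
      Y = np.eye(num_labels)[y.astype(int)]

      # Regularized cross-entropy cost; bias weights are not penalized
      eps = 1e-10  # guard against log(0)
      cost = (-1 / m) * np.sum(Y * np.log(a3 + eps) + (1 - Y) * np.log(1 - a3 + eps))
      cost += (lamb / (2 * m)) * (np.sum(Theta1[:, 1:] ** 2)
                                  + np.sum(Theta2[:, 1:] ** 2))

      # Backpropagation: output error, then error pushed back to the hidden layer
      delta3 = a3 - Y
      delta2 = (delta3 @ Theta2[:, 1:]) * (sigmoid(z2) * (1 - sigmoid(z2)))
      Theta1_grad = delta2.T @ a1 / m
      Theta2_grad = delta3.T @ a2 / m
      Theta1_grad[:, 1:] += (lamb / m) * Theta1[:, 1:]
      Theta2_grad[:, 1:] += (lamb / m) * Theta2[:, 1:]

      # minimize(..., jac=True) expects (cost, flattened gradient)
      return cost, np.concatenate([Theta1_grad.flatten(), Theta2_grad.flatten()])
  ```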
- Evaluate the model:

  ```python
  from Prediction import predict

  # Recover the two weight matrices from the flat parameter vector
  Theta1 = np.reshape(nn_params[:hidden_layer_size * (input_layer_size + 1)],
                      (hidden_layer_size, input_layer_size + 1))
  Theta2 = np.reshape(nn_params[hidden_layer_size * (input_layer_size + 1):],
                      (num_labels, hidden_layer_size + 1))

  pred = predict(Theta1, Theta2, X_test)
  print('Test Set Accuracy: {:f}'.format(np.mean(pred == y_test) * 100))
  ```
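  The file list above names `Theta1.txt` and `Theta2.txt` as the files holding the trained weights; one straightforward way to write them (plain whitespace-delimited text is an assumption, the repository may use another format):

  ```python
  # Save the trained weight matrices as plain text
  np.savetxt('Theta1.txt', Theta1, delimiter=' ')
  np.savetxt('Theta2.txt', Theta2, delimiter=' ')
  ```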
- Visualize the learned weights and activations:

  ```python
  import matplotlib.pyplot as plt

  def visualize_activations(X, Theta1, Theta2):
      m = X.shape[0]
      # Forward pass with sigmoid activations, keeping each layer's output
      a1 = np.hstack([np.ones((m, 1)), X])
      z2 = a1.dot(Theta1.T)
      a2 = np.hstack([np.ones((m, 1)), 1 / (1 + np.exp(-z2))])
      z3 = a2.dot(Theta2.T)
      a3 = 1 / (1 + np.exp(-z3))

      fig, ax = plt.subplots(1, 3, figsize=(15, 5))
      ax[0].imshow(a1[0, 1:].reshape(28, 28), cmap='gray')
      ax[0].set_title('Input Layer')
      ax[1].imshow(a2[0, 1:].reshape(10, 10), cmap='gray')
      ax[1].set_title('Hidden Layer')
      ax[2].imshow(a3[0, :].reshape(1, 10), cmap='gray')
      ax[2].set_title('Output Layer')
      plt.show()

  visualize_activations(X_test[:1], Theta1, Theta2)
  ```
This project is licensed under the MIT License.