A modernized AlexNet implementation for CIFAR-10 classification with strong regularization, stable training, and real-world deployment via Hugging Face Spaces.
This project presents a research-oriented deep learning pipeline for image classification using a modified AlexNet architecture implemented in PyTorch. The model is trained and evaluated on the CIFAR-10 dataset.
Unlike the original AlexNet, which was designed for ImageNet-scale inputs, this implementation is adapted for small images using:
- Resize (70×70)
- Random Crop (64×64)
- Strong regularization techniques
- Modern training improvements
- Training Accuracy: ~99.6%
- Validation Accuracy: ~89.48%
- Test Accuracy: ~88.63%
- Early Stopping: Epoch 46/90
You can directly download and reuse the trained model weights from the Hugging Face Model Hub:
🔗 https://huggingface.co/Mudassir-08/alexnet-cifar10
This repository contains the trained PyTorch checkpoint (alexnet_cifar10.pth) which includes:
- model_state_dict
- optimizer_state_dict
- number of classes
✅ You can load it directly for inference without retraining.
You can test the model in real-time using the deployed Hugging Face Space:
🔗 https://huggingface.co/spaces/Mudassir-08/alexnet-cifar10-demo
This demo allows:
- Image upload
- Real-time classification
- Top-3 probability predictions
⚠️ Important Note: The input image must follow CIFAR-10 style preprocessing assumptions:
- RGB image
- Proper object-centered framing
- Similar distribution to CIFAR-10 dataset (airplane, ship, cat, etc.)
- Avoid unrelated or out-of-domain images for best performance
- CIFAR-10 dataset
- 60,000 RGB images (32×32)
- 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck
- 50,000 training images
- 10,000 test images
Training:
Resize(70, 70)
RandomCrop(64, 64)
ToTensor()
Normalize(mean=0.5, std=0.5)

Evaluation:
Resize(70, 70)
CenterCrop(64, 64)
ToTensor()
Normalize(mean=0.5, std=0.5)
- Resize (70×70): Improves feature richness
- RandomCrop: Adds spatial invariance and augmentation
- CenterCrop: Ensures deterministic evaluation
- Normalization: Stabilizes gradient flow
Conv2D(3 → 64) + BatchNorm + ReLU
MaxPool
Conv2D(64 → 192) + BatchNorm + ReLU
MaxPool
Conv2D(192 → 384) + BatchNorm + ReLU
Conv2D(384 → 256) + BatchNorm + ReLU
Conv2D(256 → 256) + BatchNorm + ReLU
MaxPool
AdaptiveAvgPool2D → (4×4)
Flatten (4096)
Linear → 512 + ReLU + Dropout(0.4)
Linear → 256 + ReLU + Dropout(0.4)
Linear → 10 classes
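The stack above can be sketched as a PyTorch module. Channel counts, pooling, and the classifier widths follow the listing; kernel sizes, strides, and padding are assumptions, since the README does not specify them:

```python
import torch
import torch.nn as nn

class AlexNetCIFAR(nn.Module):
    """Sketch of the modified AlexNet (kernel/stride/padding are assumed)."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.BatchNorm2d(64), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 192, kernel_size=3, padding=1), nn.BatchNorm2d(192), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1), nn.BatchNorm2d(384), nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.BatchNorm2d(256), nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1), nn.BatchNorm2d(256), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        )
        self.avgpool = nn.AdaptiveAvgPool2d((4, 4))  # 256 channels * 4 * 4 = 4096
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(4096, 512), nn.ReLU(inplace=True), nn.Dropout(0.4),
            nn.Linear(512, 256), nn.ReLU(inplace=True), nn.Dropout(0.4),
            nn.Linear(256, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.avgpool(self.features(x)))
```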
Framework: PyTorch
Optimizer: SGD
Momentum: 0.9
Learning Rate: 0.1
Scheduler: ReduceLROnPlateau
Batch Size: 256
Epochs: 90
Loss Function: CrossEntropy + Label Smoothing (0.1)
Device: CUDA (if available)
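The configuration above translates to roughly the following setup. The scheduler's `mode`, `factor`, and `patience` are assumptions (the README only names `ReduceLROnPlateau`), and the `nn.Linear` stands in for the real model:

```python
import torch
import torch.nn as nn

model = nn.Linear(4096, 10)  # placeholder; the real model is the modified AlexNet

# Cross-entropy with label smoothing, as stated in the training setup.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)

# SGD with momentum at the stated learning rate.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# mode="max" matches monitoring validation accuracy; factor/patience assumed.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="max", factor=0.1, patience=5
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
```

At the end of each epoch, `scheduler.step(val_accuracy)` lowers the learning rate once validation accuracy plateaus.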
Prevents overconfidence and improves generalization.
torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
Prevents gradient explosion.
Stops training when validation accuracy stops improving.
Automatically reduces LR on plateau.
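Putting the pieces together, one epoch of the loop might look like the sketch below, with the gradient-clipping call from above applied between backward and step. `train_one_epoch` is an illustrative helper, not a function from this repo:

```python
import torch

def train_one_epoch(model, loader, criterion, optimizer, device="cpu"):
    """One training epoch with gradient clipping at max-norm 1.0."""
    model.train()
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        # Clip before the update to prevent gradient explosion.
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
        optimizer.step()
```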
- Fast convergence in early epochs (1–10)
- Stable learning phase (10–30)
- Plateau at ~88–89% validation accuracy
- Early stopping at epoch 46
Strong generalization achieved due to:
- Dropout
- Label smoothing
- Stable optimization
Training Accuracy: ~99.6%
Validation Accuracy: ~89.48%
Test Accuracy: ~88.63%
Strong classes:
- airplane
- ship
- truck
Confusions:
- cat ↔ dog
- deer ↔ horse
This reflects visual similarity in CIFAR-10.
Input: 3 × 64 × 64 image
Feature Extractor: Conv → BN → ReLU → Pool × multiple layers
Adaptive Pool: → 4 × 4 feature map
Classifier: FC(4096 → 512 → 256 → 10)
saved_trained_model/alexnet_cifar10.pth
Image → Resize → Crop → Normalize → Model → Softmax → Prediction
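The softmax and top-3 steps of this pipeline can be sketched as follows. `predict_top3` is an illustrative helper assuming a preprocessed (3, 64, 64) tensor from the evaluation transforms:

```python
import torch

CIFAR10_CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
                   "dog", "frog", "horse", "ship", "truck"]

@torch.no_grad()
def predict_top3(model, image_tensor):
    """Normalized tensor -> model -> softmax -> top-3 (label, probability) pairs."""
    logits = model(image_tensor.unsqueeze(0))        # add batch dimension
    probs = torch.softmax(logits, dim=1).squeeze(0)  # class probabilities
    top_p, top_i = probs.topk(3)                     # sorted descending
    return [(CIFAR10_CLASSES[i], float(p)) for i, p in zip(top_i, top_p)]
```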
- PyTorch inference pipeline
- Gradio web interface (app.py)
- Hugging Face Spaces support
Features:
- Image upload
- Real-time prediction
- Top-3 probabilities
AlexNet/
├── src/
├── notebooks/
├── data/
├── saved_trained_model/
├── app.py
├── main.py
├── requirements.txt
└── README.md
- Modified AlexNet for CIFAR-10
- Stable training pipeline
- Full evaluation system
- Confusion matrix analysis
- Deployment-ready system
- CIFAR-10 images are resized from 32×32 → 64×64
- Results depend on seed and hardware
- This is a research implementation, not production-grade
Malik Muhammad Mudassir Iqbal
Deep Learning Research Engineer
Computer Vision • PyTorch • CNN Architectures