
AlexNet for CIFAR-10 Classification

AlexNet CIFAR-10 Classifier (Research-Grade PyTorch Implementation)



A modernized AlexNet implementation for CIFAR-10 classification with strong regularization, stable training, and real-world deployment via Hugging Face Spaces.


Abstract

This project presents a research-oriented deep learning pipeline for image classification using a modified AlexNet architecture implemented in PyTorch. The model is trained and evaluated on the CIFAR-10 dataset.

Unlike the original AlexNet, which was designed for ImageNet-scale inputs, this implementation is adapted to CIFAR-10's small images using:

  • Resize (70×70)
  • Random Crop (64×64)
  • Strong regularization techniques
  • Modern training improvements

Final Performance

  • Training Accuracy: ~99.6%
  • Validation Accuracy: ~89.48%
  • Test Accuracy: ~88.63%
  • Early Stopping: Epoch 46/90

🔗 Live Model & Deployment Links (Hugging Face)

Trained Model Repository

You can directly download and reuse the trained model weights from the Hugging Face Model Hub:

👉 https://huggingface.co/Mudassir-08/alexnet-cifar10

This repository contains the trained PyTorch checkpoint (alexnet_cifar10.pth) which includes:

  • model_state_dict
  • optimizer_state_dict
  • number of classes

✔ You can load it directly for inference without retraining.
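The checkpoint format above can be sketched as follows. This is a minimal illustration using a tiny stand-in module rather than the actual AlexNet, and the exact key name for the class count (`num_classes`) is an assumption:

```python
import torch
import torch.nn as nn

# Sketch of the checkpoint format described above, demonstrated with a
# stand-in module (nn.Linear) instead of the trained AlexNet.
model = nn.Linear(8, 10)  # stand-in for the trained network
ckpt = {
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": torch.optim.SGD(model.parameters(), lr=0.1).state_dict(),
    "num_classes": 10,  # key name is an assumption
}
torch.save(ckpt, "alexnet_cifar10.pth")

# Loading for inference: restore the weights, then switch to eval mode
restored = nn.Linear(8, 10)
loaded = torch.load("alexnet_cifar10.pth", map_location="cpu")
restored.load_state_dict(loaded["model_state_dict"])
restored.eval()  # disables dropout / freezes batch-norm statistics
```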


Live Inference Demo (Gradio App)

You can test the model in real-time using the deployed Hugging Face Space:

👉 https://huggingface.co/spaces/Mudassir-08/alexnet-cifar10-demo

This demo allows:

  • Image upload
  • Real-time classification
  • Top-3 probability predictions

⚠ Important Note: For best results, input images must follow CIFAR-10-style preprocessing assumptions:

  • RGB image
  • Proper object-centered framing
  • Similar distribution to CIFAR-10 dataset (airplane, ship, cat, etc.)
  • Avoid unrelated or out-of-domain images for best performance

Dataset

  • CIFAR-10 dataset
  • 60,000 RGB images (32Γ—32)
  • 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck
  • 50,000 training images
  • 10,000 test images

Data Preprocessing

Training Transformations

Resize(70, 70)
RandomCrop(64, 64)
ToTensor()
Normalize(mean=0.5, std=0.5)

Testing Transformations

Resize(70, 70)
CenterCrop(64, 64)
ToTensor()
Normalize(mean=0.5, std=0.5)

Design Rationale

  • Resize (70Γ—70): Improves feature richness
  • RandomCrop: Adds spatial invariance and augmentation
  • CenterCrop: Ensures deterministic evaluation
  • Normalization: Stabilizes gradient flow

Model Architecture (Modified AlexNet)

🔷 Feature Extractor

Conv2D(3 → 64) + BatchNorm + ReLU
MaxPool
Conv2D(64 → 192) + BatchNorm + ReLU
MaxPool
Conv2D(192 → 384) + BatchNorm + ReLU
Conv2D(384 → 256) + BatchNorm + ReLU
Conv2D(256 → 256) + BatchNorm + ReLU
MaxPool

🔷 Adaptive Pooling

AdaptiveAvgPool2D → (4×4)

🔷 Classifier

Flatten (256 × 4 × 4 = 4096)
Linear → 512 + ReLU + Dropout(0.4)
Linear → 256 + ReLU + Dropout(0.4)
Linear → 10 classes
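The architecture above can be sketched as a PyTorch module. Layer widths, dropout, and pooling follow the README; kernel sizes, strides, and padding are assumptions, since the README does not state them:

```python
import torch
import torch.nn as nn

class AlexNetCifar(nn.Module):
    """Sketch of the modified AlexNet described above (conv hyperparameters assumed)."""
    def __init__(self, num_classes=10):
        super().__init__()
        def block(cin, cout):
            # Conv → BatchNorm → ReLU, preserving spatial size (3×3, pad 1)
            return [nn.Conv2d(cin, cout, kernel_size=3, padding=1),
                    nn.BatchNorm2d(cout), nn.ReLU(inplace=True)]
        self.features = nn.Sequential(
            *block(3, 64),    nn.MaxPool2d(2),
            *block(64, 192),  nn.MaxPool2d(2),
            *block(192, 384),
            *block(384, 256),
            *block(256, 256), nn.MaxPool2d(2),
        )
        self.avgpool = nn.AdaptiveAvgPool2d((4, 4))
        self.classifier = nn.Sequential(
            nn.Flatten(),  # 256 × 4 × 4 = 4096
            nn.Linear(4096, 512), nn.ReLU(inplace=True), nn.Dropout(0.4),
            nn.Linear(512, 256),  nn.ReLU(inplace=True), nn.Dropout(0.4),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.avgpool(self.features(x)))

model = AlexNetCifar()
out = model(torch.randn(2, 3, 64, 64))  # batch of two 64×64 RGB images
print(out.shape)  # torch.Size([2, 10])
```

`AdaptiveAvgPool2d((4, 4))` makes the classifier input size independent of the exact feature-map resolution, which is what allows the 64×64 inputs to feed a fixed 4096-wide flatten.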


Training Configuration

Framework: PyTorch
Optimizer: SGD
Momentum: 0.9
Learning Rate: 0.1
Scheduler: ReduceLROnPlateau
Batch Size: 256
Epochs: 90
Loss Function: CrossEntropy + Label Smoothing (0.1)
Device: CUDA (if available)
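The configuration above translates into the following setup sketch. A stand-in linear model replaces the actual network, and the scheduler's `mode="max"` (monitoring validation accuracy) is an assumption:

```python
import torch
import torch.nn as nn

model = nn.Linear(4096, 10)  # stand-in for the modified AlexNet
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
# mode="max" because the metric being monitored (accuracy) should increase
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="max")

# One illustrative training step with a batch of 256
x, y = torch.randn(256, 4096), torch.randint(0, 10, (256,))
loss = criterion(model(x), y)
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)  # gradient clipping
optimizer.step()
optimizer.zero_grad()
scheduler.step(0.89)  # called once per epoch with validation accuracy
```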


Training Strategy

Label Smoothing

Prevents overconfidence and improves generalization.

Gradient Clipping

torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
Prevents gradient explosion.

Early Stopping

Stops training when validation accuracy stops improving.

Learning Rate Scheduling

Automatically reduces LR on plateau.
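The early-stopping strategy above can be sketched as a simple patience loop. The patience value and the `validate` stub are assumptions; the README only reports that training stopped at epoch 46 of 90:

```python
best_acc, patience, bad_epochs = 0.0, 10, 0

def validate(epoch):
    # Stand-in for real validation: accuracy plateaus at 0.89
    return min(0.89, 0.80 + 0.02 * epoch)

for epoch in range(90):
    val_acc = validate(epoch)
    if val_acc > best_acc:
        best_acc, bad_epochs = val_acc, 0  # improvement: reset the counter
        # a real loop would checkpoint the best model here
    else:
        bad_epochs += 1
        if bad_epochs >= patience:  # no improvement for `patience` epochs
            print(f"early stop at epoch {epoch}")
            break
```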


Training Dynamics

  • Fast convergence in early epochs (1–10)
  • Stable learning phase (10–30)
  • Plateau at ~88–89% validation accuracy
  • Early stopping at epoch 46

Key Insight

Strong generalization achieved due to:

  • Dropout
  • Label smoothing
  • Stable optimization

Final Results

Training Accuracy: ~99.6%
Validation Accuracy: ~89.48%
Test Accuracy: ~88.63%


Confusion Matrix Insights

Strong classes:

  • airplane
  • ship
  • truck

Confusions:

  • cat ↔ dog
  • deer ↔ horse

This reflects visual similarity in CIFAR-10.


Model Summary

Input: 3 × 64 × 64 image

Feature Extractor: Conv → BN → ReLU → Pool × multiple layers

Adaptive Pool: → 4 × 4 feature map

Classifier: FC(4096 → 512 → 256 → 10)


Model Checkpoint

saved_trained_model/alexnet_cifar10.pth


Inference Pipeline

Image → Resize → Crop → Normalize → Model → Softmax → Prediction


Deployment

  • PyTorch inference pipeline
  • Gradio web interface (app.py)
  • Hugging Face Spaces support

Features:

  • Image upload
  • Real-time prediction
  • Top-3 probabilities

Project Structure

AlexNet/
├── src/
├── notebooks/
├── data/
├── saved_trained_model/
├── app.py
├── main.py
├── requirements.txt
└── README.md


Key Contributions

  • Modified AlexNet for CIFAR-10
  • Stable training pipeline
  • Full evaluation system
  • Confusion matrix analysis
  • Deployment-ready system

Notes

  • CIFAR-10 images are resized from 32Γ—32 β†’ 64Γ—64
  • Results depend on seed and hardware
  • This is a research implementation, not production-grade

Author

Malik Muhammad Mudassir Iqbal
Deep Learning Research Engineer
Computer Vision • PyTorch • CNN Architectures