A deep learning project that implements a ResNet-style Convolutional Neural Network with residual connections for CIFAR-10 image classification. It combines residual blocks with several regularization techniques and reaches a peak validation accuracy of 92.35%, with a train-validation gap of about 7.4%.
- Features
- Architecture
- Dataset Overview
- Installation
- Usage
- Results & Performance
- Technical Details
- Visualizations
- Contributing
- License
- Dataset: CIFAR-10 (60,000 32×32 color images, 10 classes)
- Advanced Architecture: ResNet-inspired CNN with residual blocks
- Outstanding Performance:
  - 92.35% peak validation accuracy
  - 91.96% final validation accuracy
  - 11.17M parameters (efficiently designed)
  - Well-controlled overfitting (7.4% train-val gap)
- State-of-the-Art Techniques:
  - Residual connections for deeper networks
  - Comprehensive data augmentation
  - Batch normalization for training stability
  - Advanced regularization (Dropout, Weight Decay, Label Smoothing)
  - OneCycleLR learning rate scheduling
  - Early stopping with model checkpointing
  - Gradient clipping for training stability
- Professional ML Engineering:
  - Production-ready code structure
  - Comprehensive evaluation and analysis
  - Advanced visualization and interpretability
  - Proper train/validation/test splits
- Technologies: PyTorch, Python, Scikit-learn, Matplotlib, Seaborn, NumPy
Input (32×32×3)
↓
Initial Conv2d(3→64) + BatchNorm + ReLU
↓
Residual Layer 1: 2×ResidualBlock(64→64)
↓
Residual Layer 2: 2×ResidualBlock(64→128, stride=2)
↓
Residual Layer 3: 2×ResidualBlock(128→256, stride=2)
↓
Residual Layer 4: 2×ResidualBlock(256→512, stride=2)
↓
AdaptiveAvgPool2d(1×1) + Dropout(0.5)
↓
Linear(512→10)
- Residual Blocks: Skip connections prevent vanishing gradients
- Batch Normalization: Stabilizes training and enables higher learning rates
- Global Average Pooling: Reduces overfitting compared to fully connected layers
- Proper Weight Initialization: Kaiming initialization for optimal gradient flow
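
The diagram and components above can be sketched in PyTorch roughly as follows. This is a minimal sketch, not the notebook's actual code: the class names `ResidualBlock` and `ResNetSketch`, the 3×3 kernels, the bias-free convolutions, and the 1×1 projection shortcut are assumptions taken from the common ResNet-18 layout for CIFAR-sized inputs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResidualBlock(nn.Module):
    """Two 3x3 conv + BatchNorm layers wrapped by a skip connection."""

    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                               stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3,
                               stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        # Assumption: identity shortcut, or a 1x1 projection when the shape changes.
        self.shortcut = nn.Sequential()
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1,
                          stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + self.shortcut(x))  # add the skip path, then activate


class ResNetSketch(nn.Module):
    """Stem, four residual stages, global average pooling, dropout, classifier."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
        )
        self.layers = nn.Sequential(
            self._stage(64, 64, stride=1),
            self._stage(64, 128, stride=2),
            self._stage(128, 256, stride=2),
            self._stage(256, 512, stride=2),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.dropout = nn.Dropout(0.5)
        self.fc = nn.Linear(512, num_classes)
        # Kaiming initialization for every convolution.
        for module in self.modules():
            if isinstance(module, nn.Conv2d):
                nn.init.kaiming_normal_(module.weight, mode="fan_out",
                                        nonlinearity="relu")

    @staticmethod
    def _stage(in_channels, out_channels, stride):
        # Each stage holds two residual blocks; only the first one downsamples.
        return nn.Sequential(
            ResidualBlock(in_channels, out_channels, stride),
            ResidualBlock(out_channels, out_channels, 1),
        )

    def forward(self, x):
        x = self.layers(self.stem(x))
        x = torch.flatten(self.pool(x), 1)
        return self.fc(self.dropout(x))


model = ResNetSketch()
logits = model(torch.randn(2, 3, 32, 32))  # dummy batch of two CIFAR-sized images
print(logits.shape)
```

The dummy forward pass is only a shape check: a batch of two 32×32 RGB images should produce one row of 10 class logits per image.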
CIFAR-10 contains 60,000 32×32 color images across 10 classes:
| Class | Examples | Description |
|---|---|---|
| Airplane | 6,000 | Various aircraft types |
| Automobile | 6,000 | Cars, SUVs, and other passenger vehicles |
| Bird | 6,000 | Different bird species |
| Cat | 6,000 | Domestic cats |
| Deer | 6,000 | Wild deer |
| Dog | 6,000 | Various dog breeds |
| Frog | 6,000 | Amphibians |
| Horse | 6,000 | Horses in various poses |
| Ship | 6,000 | Maritime vessels |
| Truck | 6,000 | Large vehicles |
Data Split:
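- Training: 40,000 images (80% of the 50,000-image CIFAR-10 training set)
- Validation: 10,000 images (the remaining 20% of the training set)
- Test: 10,000 images (the separate CIFAR-10 test set, kept independent)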
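
One way to produce such a split is a seeded shuffle of the 50,000 training indices. The sketch below is illustrative only: the function name `split_indices` and the seed value are hypothetical, and the notebook may split the data differently.

```python
import random


def split_indices(num_samples=50_000, val_fraction=0.2, seed=42):
    """Shuffle sample indices and split them into train and validation lists."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    num_val = round(num_samples * val_fraction)
    return indices[num_val:], indices[:num_val]


train_idx, val_idx = split_indices()
print(len(train_idx), len(val_idx))  # 40000 10000
```

The two index lists can then be passed to `torch.utils.data.Subset` to build the training and validation datasets, while the 10,000-image test set stays untouched.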
# 1. Clone the repository
git clone https://github.com/X-XENDROME-X/Advanced-ResNet-Image-Classification.git
cd Advanced-ResNet-Image-Classification
# 2. Set up virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# 3. Install dependencies
pip install -r requirements.txt
# 4. Launch Jupyter Notebook
jupyter notebook image_classification.ipynb

Run through the notebook sections:
- Setup & Data Loading
- Data Preprocessing
- Architecture Design
- Training Configuration
- Model Training
- Evaluation
- Visualization
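import torch
from torchvision import transforms
from PIL import Image

# Load the trained model. This assumes best_model.pth stores the whole pickled model;
# if it holds only a state_dict, build the model first and call load_state_dict().
model = torch.load('best_model.pth', map_location='cpu', weights_only=False)
model.eval()

# Define preprocessing: resize to 32x32 and normalize with the CIFAR-10 channel statistics
transform = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.4914, 0.4822, 0.4465],
                         std=[0.2023, 0.1994, 0.2010])
])
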
# Make a prediction for a single image file
def predict_image(image_path):
    # Force three channels so grayscale or RGBA files match the model input
    image = Image.open(image_path).convert('RGB')
    image_tensor = transform(image).unsqueeze(0)  # add the batch dimension
    with torch.no_grad():
        output = model(image_tensor)
        prediction = torch.nn.functional.softmax(output, dim=1)
        predicted_class = torch.argmax(prediction, dim=1)
    class_names = ['Airplane', 'Automobile', 'Bird', 'Cat', 'Deer',
                   'Dog', 'Frog', 'Horse', 'Ship', 'Truck']
    return class_names[predicted_class.item()], prediction.max().item()

# Example usage
predicted_class, confidence = predict_image('your_image.jpg')
print(f"Prediction: {predicted_class} (Confidence: {confidence:.2%})")

| Metric | Value | Status |
|---|---|---|
| Peak Validation Accuracy | 92.35% | Excellent |
| Final Validation Accuracy | 91.96% | Excellent |
| Training Accuracy | 99.32% | Strong |
| Overfitting Gap | 7.36% | Well-controlled |
| Training Time | ~44 minutes | Efficient |
| Parameters | 11.17M | Optimized |
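
The 11.17M parameter figure can be cross-checked by hand. The arithmetic below is a sketch under stated assumptions that the source does not confirm: 3×3 convolutions without bias, BatchNorm layers with a learnable scale and shift, and a 1×1 convolution plus BatchNorm on the shortcut wherever a block changes the channel count or stride. The helper names are hypothetical.

```python
def conv_params(c_in, c_out, kernel):
    """Weights of a bias-free convolution (assumed, since BatchNorm follows)."""
    return c_in * c_out * kernel * kernel


def bn_params(channels):
    """BatchNorm learns one scale and one shift per channel."""
    return 2 * channels


def block_params(c_in, c_out, stride):
    """Two 3x3 conv + BN pairs, plus a 1x1 projection when the shape changes."""
    total = conv_params(c_in, c_out, 3) + bn_params(c_out)
    total += conv_params(c_out, c_out, 3) + bn_params(c_out)
    if stride != 1 or c_in != c_out:
        total += conv_params(c_in, c_out, 1) + bn_params(c_out)
    return total


# Initial Conv2d(3 -> 64) + BatchNorm
total = conv_params(3, 64, 3) + bn_params(64)
# Four residual layers with two blocks each; only the first block changes shape
for c_in, c_out, stride in [(64, 64, 1), (64, 128, 2), (128, 256, 2), (256, 512, 2)]:
    total += block_params(c_in, c_out, stride) + block_params(c_out, c_out, 1)
# Linear(512 -> 10) with bias
total += 512 * 10 + 10

print(f"{total:,}")  # 11,173,962
```

Under these assumptions the total rounds to 11.17M, which is consistent with the value in the table.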
- Best performing: Ship (94.90%)
- Most challenging: Bird vs. Airplane confusion
- Balanced accuracy: No significant class bias
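
Per-class figures like these are typically read off a confusion matrix. The sketch below shows the calculation on a made-up 3-class matrix; the counts are purely illustrative and are not results from this project, and both function names are hypothetical.

```python
def per_class_accuracy(confusion, class_names):
    """confusion[i][j] counts samples of true class i predicted as class j."""
    return {
        name: confusion[i][i] / sum(confusion[i])
        for i, name in enumerate(class_names)
    }


def most_confused_pair(confusion, class_names):
    """Return (true class, predicted class, count) for the largest off-diagonal cell."""
    n = len(class_names)
    count, i, j = max(
        (confusion[i][j], i, j) for i in range(n) for j in range(n) if i != j
    )
    return class_names[i], class_names[j], count


# Illustrative counts only (100 samples per class), not the project's results.
names = ["Airplane", "Bird", "Ship"]
toy_confusion = [
    [90, 8, 2],   # true Airplane
    [12, 85, 3],  # true Bird
    [4, 1, 95],   # true Ship
]

accuracy = per_class_accuracy(toy_confusion, names)
pair = most_confused_pair(toy_confusion, names)
print(accuracy)
print(pair)
```

In this toy matrix the largest off-diagonal cell is Bird predicted as Airplane, which is the kind of pattern the "most challenging" bullet describes.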
- Production-Ready Performance
- Overfitting Control
- Training Stability
- Generalization
- Efficiency
- Data Augmentation: Random flips, rotations, affine transforms, color jitter
- Batch Normalization: Stabilizes training and enables higher learning rates
- Dropout (0.5): Applied before the final linear classifier to reduce overfitting
- Weight Decay (0.01): Decoupled weight decay via AdamW, which acts as an L2-style penalty on the weights
- Label Smoothing (0.1): Prevents overconfident predictions
- Early Stopping: Monitors validation loss with patience=10
- OneCycleLR Scheduler
- AdamW Optimizer
- Gradient Clipping (max_norm=1.0)
- Mixed Precision Training
- Residual Connections
- Global Average Pooling
- Kaiming Initialization
- Efficient Parameter Design (11M)
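
The training-side settings listed above fit together roughly as in the sketch below. It is not the notebook's training loop: the tiny linear model, the random data, the learning rates, and the epoch and step counts are placeholders chosen so the example runs in a moment, and mixed precision is left out for brevity. Only weight decay 0.01, label smoothing 0.1, `max_norm=1.0`, and `patience=10` come from the list.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Placeholder model and data; the real project trains the residual CNN on CIFAR-10.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
inputs = torch.randn(8, 3, 32, 32)
targets = torch.randint(0, 10, (8,))

epochs, steps_per_epoch = 2, 5  # hypothetical sizes for the demo
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=1e-2, epochs=epochs, steps_per_epoch=steps_per_epoch
)

best_val_loss, best_state = float("inf"), None
patience, bad_epochs = 10, 0

for epoch in range(epochs):
    model.train()
    for _ in range(steps_per_epoch):
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        # Gradient clipping keeps the global gradient norm at or below 1.0.
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()
        scheduler.step()  # OneCycleLR is stepped once per batch

    # Validation (the same random batch stands in for a validation set here).
    model.eval()
    with torch.no_grad():
        val_loss = criterion(model(inputs), targets).item()

    # Early stopping with checkpointing of the best weights.
    if val_loss < best_val_loss:
        best_val_loss, bad_epochs = val_loss, 0
        best_state = {k: v.clone() for k, v in model.state_dict().items()}
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break

print(f"best validation loss: {best_val_loss:.4f}")
```

Stepping the scheduler per batch rather than per epoch matters here, because OneCycleLR sizes its warm-up and annealing phases from `epochs * steps_per_epoch`.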
- Training Curves (loss, accuracy)
- Confusion Matrix
- Sample Predictions with Confidence
- Learning Rate Schedule
- Overfitting Analysis
Contributions are welcome! Here's how:
# Fork the repository
# Create a feature branch
git checkout -b feature-name
# Make changes, add tests, and commit
git commit -m "Add feature"
# Push and open a Pull Request
git push origin feature-name

- Architecture Improvements
- Hyperparameter Optimization
- Model Interpretability (Grad-CAM, etc.)
- Deployment (API/Flask/Streamlit)
- Documentation & Tutorials
- Model Quantization & Optimization
This project is licensed under the MIT License. See the LICENSE file for details.
