A production-ready AI framework for verifying institutional ID cards using deep learning, dual OCR engines, and advanced fraud detection.
The KU ID Verification System is an end-to-end solution that combines computer vision, optical character recognition, and security analytics to provide reliable identity document verification. Originally developed for Kathmandu University, this adaptable framework can be customized for various institutional ID systems, including employee badges, membership cards, access-control cards, and government-issued identification.
Built with security-first principles and enterprise-ready architecture, the system processes ID cards in real time with comprehensive fraud detection and detailed confidence scoring.
| Feature Category | Capabilities |
|---|---|
| Document Processing | Visual classification, dual OCR extraction, text validation |
| Security Features | Image manipulation detection, duplicate prevention, temporal analysis |
| Deployment | Streamlit web interface, hardware optimization, cross-platform support |
| Adaptability | Trainable on custom datasets, configurable validation rules |
- Custom Model Training: Train on your specific ID card datasets - not limited to any particular institution
- Flexible Data Requirements: Works with 50-200 images per class (100+ per class recommended for robust training)
- Privacy-First Approach: No pre-trained models included to protect institutional data
- Dual-Engine Architecture: Combines EasyOCR and Tesseract for maximum text extraction accuracy
- Intelligent Text Validation: Fuzzy string matching against expected patterns and keywords
- Configurable Thresholds: Adjustable confidence levels for different use cases
- Hardware Intelligence: Automatic detection and optimization for Apple Silicon (MPS), NVIDIA CUDA, or CPU fallback
- Memory Efficient: Batch processing with automatic resource monitoring
- Production Ready: Designed for high-throughput deployment scenarios
- Streamlit-Powered: Clean, dark-themed interface with real-time processing capabilities
- User Experience Focused: Intuitive workflow with clear status indicators and results presentation
- Responsive Design: Adapts to different devices and screen sizes
- Multi-Layer Fraud Detection: Image hashing, manipulation analysis, and temporal pattern monitoring
- Duplicate Prevention: Perceptual hashing (dhash, phash, whash) to identify duplicate submissions
- In-Memory Processing: Images processed without permanent disk storage for enhanced privacy
- Multi-Signal Confidence Scoring: Combines visual, textual, and security features for final verdict
- Uncertainty Quantification: Provides calibrated confidence scores for each verification
- Detailed Reporting: Comprehensive results with explainable decision factors
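
To make the multi-signal confidence scoring more concrete, here is a minimal sketch of one way to combine the visual, textual, and security signals with a weighted average. The `Signals` class, the `WEIGHTS` values, the `fuse` function, and the 0.7 acceptance threshold are illustrative assumptions, not the project's actual implementation.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Signals:
    visual: float    # classifier probability that the card is genuine, 0..1
    text: float      # share of expected keywords found by OCR, 0..1
    security: float  # 1.0 means no fraud indicators were raised, 0..1


# Hypothetical weights; a real deployment would tune these on validation data.
WEIGHTS = {"visual": 0.5, "text": 0.3, "security": 0.2}


def fuse(signals: Signals, accept_threshold: float = 0.7) -> Tuple[float, str]:
    """Return (combined score, verdict) from a weighted average of the signals."""
    score = (
        WEIGHTS["visual"] * signals.visual
        + WEIGHTS["text"] * signals.text
        + WEIGHTS["security"] * signals.security
    )
    verdict = "genuine" if score >= accept_threshold else "needs review"
    return score, verdict
```

With these assumed weights, `fuse(Signals(visual=0.9, text=0.8, security=1.0))` gives a score of about 0.89 and the verdict `"genuine"`.
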
- Python 3.8+ with pip package manager
- Tesseract OCR (system dependency for text extraction)
- 50-200 images of your target ID cards for training
- Optional: GPU for faster training (CUDA/MPS supported)
# Clone the repository
git clone https://github.com/shuv-amp/ku-id-verifier.git
cd ku-id-verifier
# Create and activate virtual environment (recommended)
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install Python dependencies
pip install -r requirements.txt
# Install Tesseract OCR (system dependency)
# macOS:
brew install tesseract
# Ubuntu/Debian:
sudo apt-get install tesseract-ocr tesseract-ocr-eng
# Windows:
# Download from https://github.com/UB-Mannheim/tesseract/wiki

# Launch the Streamlit web application
streamlit run app/streamlit_app.py

Once running, visit http://localhost:8501 in your browser to access the verification interface.
- Prepare Your ID Images: Collect clear images of the ID cards you want to verify
- Access Web Interface: Open the Streamlit application in your browser
- Upload ID Image: Use the file uploader to submit an ID card image
- View Results: Examine the comprehensive verification report including:
- Visual classification result
- Extracted text content
- Security analysis findings
- Overall confidence score
The system features an intuitive web interface designed for a seamless user experience:
- Clean Layout: Organized sections with clear visual hierarchy
- Real-time Processing: Live status updates during verification
- Results Visualization: Color-coded results with detailed breakdown
- Responsive Components: Adapts to tablet and mobile screens
- Touch-Friendly Controls: Optimized for mobile device interaction
- Fast Performance: Minimal loading times for better user retention
- File Upload: Drag-and-drop or traditional file selection
- Configuration Options: Adjustable settings for advanced users
- Results Export: Ability to download verification reports
| Layer | Technology | Purpose |
|---|---|---|
| Frontend | Streamlit | Web interface and user interaction |
| Machine Learning | PyTorch + EfficientNet-B0 | Visual classification and feature extraction |
| OCR Processing | EasyOCR + Tesseract | Dual-engine text extraction and validation |
| Security | Custom algorithms | Fraud detection and image authentication |
| Storage | In-memory processing | Temporary data handling with privacy protection |
graph TD
A[Input Image] --> B[Image Preprocessing]
B --> C[Deep Learning Analysis]
B --> D[Dual OCR Extraction]
C --> E[Visual Classification]
D --> F[Text Validation]
E --> G[Security Analysis]
F --> G
G --> H[Confidence Scoring]
H --> I[Final Verdict]
- Image Preprocessing
  - Resize and normalization to 224x224 pixels
  - Color space standardization
  - Quality assessment and optimization
- Deep Learning Analysis
  - EfficientNet-B0 backbone for visual feature extraction
  - Binary classification (genuine vs fraudulent)
  - Confidence score generation with uncertainty estimation
- Dual OCR Processing
  - EasyOCR Engine: Primary text region detection and extraction
  - Tesseract Engine: Secondary validation and fine-grained text analysis
  - Fuzzy String Matching: Text validation against expected patterns and keywords
- Security Assessment
  - Image Hash Analysis: Duplicate submission detection using perceptual hashing
  - Manipulation Detection: Statistical analysis for edited or doctored images
  - Temporal Analysis: Submission pattern monitoring for suspicious behavior
- Decision Fusion
  - Multi-signal confidence integration
  - Weighted scoring based on feature importance
  - Final verdict with comprehensive risk assessment
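
The fuzzy string matching step in the OCR stage can be illustrated with the standard library alone. The sketch below scores OCR output by the fraction of expected keywords that approximately match some extracted token. The `keyword_score` function and the 0.8 similarity cutoff are assumptions made for illustration; the project may use a different matching library or metric.

```python
from difflib import SequenceMatcher
from typing import List


def keyword_score(ocr_text: str, keywords: List[str], min_ratio: float = 0.8) -> float:
    """Return the fraction of keywords that approximately match an OCR token."""
    if not keywords:
        return 0.0
    tokens = ocr_text.lower().split()
    matched = 0
    for keyword in keywords:
        # Best similarity between this keyword and any single OCR token.
        best = max(
            (SequenceMatcher(None, keyword.lower(), tok).ratio() for tok in tokens),
            default=0.0,
        )
        if best >= min_ratio:
            matched += 1
    return matched / len(keywords)
```

Because matching is approximate, a common OCR slip such as "Universty" still counts as a hit for the keyword "university", while unrelated text scores zero.
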
Customize the system behavior through src/config.py:
MODEL_CONFIG = {
    "model_name": "EnhancedEfficientNetOCRModel",
    "backbone": "efficientnet_b0",  # EfficientNet backbone
    "num_classes": 2,               # Binary classification
    "ocr_feature_dim": 25,          # OCR feature dimensions
    "dropout_rate": 0.4,            # Regularization
    "input_size": 224               # Input image size
}

TRAINING_CONFIG = {
    "epochs": 50,              # Training epochs
    "batch_size": 8,           # Batch size (adjust based on GPU memory)
    "learning_rate": 0.001,    # Learning rate
    "weight_decay": 1e-4,      # L2 regularization
    "patience": 10,            # Early stopping patience
    "data_augmentation": True  # Use data augmentation
}

OCR_CONFIG = {
    "engines": ["easyocr", "tesseract"],  # OCR engines to use
    "confidence_threshold": 0.6,          # Minimum confidence score
    "fuzzy_matching": True,               # Enable fuzzy text matching
    "keywords": ["kathmandu", "university", "ku", "student"]  # Expected text patterns
}

FRAUD_CONFIG = {
    "hash_algorithms": ["dhash", "phash", "whash"],  # Duplicate detection
    "manipulation_threshold": 0.8,                   # Image manipulation sensitivity
    "temporal_window": 3600,                         # 1 hour analysis window
    "max_requests_per_hour": 100                     # Rate limiting threshold
}

The system implements multiple layers of security to ensure reliable verification while protecting user privacy:
- In-Memory Processing: All images processed temporarily without permanent storage
- Data Minimization: Only essential features extracted and retained
- Encrypted Operations: Sensitive operations use encryption for data protection
- Multi-Algorithm Hashing: Combination of dhash, phash, and whash for robust duplicate detection
- Manipulation Detection: Error Level Analysis (ELA) to identify edited images
- Temporal Analysis: Monitoring of submission patterns to detect automated attacks
- No Pre-trained Models: Prevents potential bias from external datasets
- Institutional Control: Each organization trains on their specific ID cards
- Transparent Processing: Clear reporting of verification factors and confidence scores
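
As an illustration of hash-based duplicate detection, the sketch below implements a difference hash (dhash) in plain Python and compares two hashes by Hamming distance. It assumes the image has already been converted to grayscale and resized to a small grid (for example 8 rows of 9 pixels) by an imaging library. The function names and the `max_distance` value of 5 are assumptions for illustration, not the project's code.

```python
from typing import List


def dhash_bits(gray: List[List[int]]) -> int:
    """Difference hash: one bit per horizontally adjacent pixel pair.

    `gray` is a grid of grayscale values; every row must have the same length.
    """
    bits = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_duplicate(a: int, b: int, max_distance: int = 5) -> bool:
    """Treat two images as duplicates when their hashes differ in few bits."""
    return hamming(a, b) <= max_distance
```

Because dhash encodes only whether each pixel is brighter than its neighbor, a uniform brightness change leaves the hash unchanged, which is why resubmissions with minor edits still tend to collide.
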
Security Rationale:
- 🛡️ Privacy Protection: Prevents exposure of sensitive institutional data patterns
- 🔒 Misuse Prevention: Avoids the model being used to create counterfeit IDs
- 🎯 Customization Required: Forces training on your specific use case
- ⚖️ Legal Compliance: Respects institutional intellectual property
Create the following directory structure for your training data:
data/
├── genuine_id/        # 100+ images of legitimate IDs
│   ├── id_001.jpg
│   ├── id_002.jpg
│   └── ...
└── fraudulent_id/     # 100+ images of other cards/forged IDs
    ├── fake_001.jpg
    ├── fake_002.jpg
    └── ...
Dataset Guidelines:
- Image Quality: Include varied angles, lighting conditions, and backgrounds
- Real-World Variety: Incorporate both clear and slightly challenging images
- Privacy Considerations: Blur or remove personal information where needed
- Minimum Requirements: 100+ images per class recommended for robust training
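
Before training, it can help to confirm that each class folder meets the recommended minimum. The following is a small sketch that counts image files in the `genuine_id` and `fraudulent_id` folders described above. The helper names and the list of accepted file extensions are assumptions, not part of the project.

```python
from pathlib import Path
from typing import Dict, List

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}    # assumed set of accepted formats
CLASS_DIRS = ("genuine_id", "fraudulent_id")  # folder names from the layout above


def count_images(data_dir: str) -> Dict[str, int]:
    """Count image files in each class folder; a missing folder counts as 0."""
    counts = {}
    for name in CLASS_DIRS:
        class_dir = Path(data_dir) / name
        if not class_dir.is_dir():
            counts[name] = 0
            continue
        counts[name] = sum(
            1
            for p in class_dir.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


def undersized_classes(counts: Dict[str, int], minimum: int = 100) -> List[str]:
    """Return the class names that have fewer than `minimum` images."""
    return [name for name, n in counts.items() if n < minimum]
```

Calling `undersized_classes(count_images("./data"))` returns an empty list when both classes have at least 100 images.
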
Option A: Google Colab (Recommended for Beginners)
# Open and run notebooks/train_model.ipynb in Google Colab
# The notebook provides step-by-step instructions for:
# 1. Dataset upload to Google Drive
# 2. Environment setup and dependency installation
# 3. Model training with progress monitoring
# 4. Model export and download

Option B: Local Training Script
# Using the provided training script
./scripts/train.sh
# Or direct execution
python src/train.py \
  --data_dir ./data \
  --epochs 50 \
  --batch_size 8 \
  --output_dir ./models

The training process includes:
- Progress Tracking: Real-time loss and accuracy metrics
- Validation Testing: Periodic evaluation on a held-out validation set
- Early Stopping: Automatic training halt when validation performance plateaus
- Model Checkpoints: Best performing model automatically saved
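
The early stopping behavior can be sketched as a small helper that tracks the best validation loss and counts epochs without improvement. The `EarlyStopping` class below is an illustrative assumption about how the `patience` setting might be applied, not the code in `src/train.py`.

```python
class EarlyStopping:
    """Stop training after `patience` consecutive epochs without improvement."""

    def __init__(self, patience: int = 10, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.bad_epochs = 0
        self.improved = False  # True when the latest epoch set a new best

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.bad_epochs = 0
            self.improved = True
        else:
            self.bad_epochs += 1
            self.improved = False
        return self.bad_epochs >= self.patience
```

In a training loop, `improved` would be the natural point to save a checkpoint, and a `True` return value from `step` would end the loop.
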
# Place your trained model in the models directory
cp path/to/your/trained/model.pth models/best_ku_id_model.pth
# Launch the application with your custom model
streamlit run app/streamlit_app.py

| Issue | Cause | Solution |
|---|---|---|
| Model Loading Error | Missing or corrupted model file | Verify file exists: `ls -la models/best_ku_id_model.pth` |
| OCR Failures | Tesseract not installed | Verify installation: `tesseract --version` |
| Memory Errors | Large batch size or insufficient RAM | Reduce batch size in config or use CPU |
| Import Errors | Missing dependencies | Update packages: `pip install -r requirements.txt --upgrade` |
| Performance Issues | Hardware configuration | Enable GPU support or optimize settings |
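
The first two checks in the table can be automated with a short script. This is a sketch only: `check_environment` is a hypothetical helper, and the default model path is the one used in the commands above.

```python
import shutil
from pathlib import Path
from typing import List


def check_environment(model_path: str = "models/best_ku_id_model.pth") -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # Tesseract is a system dependency, so look for its executable on PATH.
    if shutil.which("tesseract") is None:
        problems.append("tesseract executable not found on PATH")
    if not Path(model_path).is_file():
        problems.append("model file missing: " + model_path)
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```
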
Hardware-Specific Tips:
- NVIDIA GPU Users: Ensure CUDA drivers are updated for maximum performance
- Apple Silicon: Automatic MPS detection provides accelerated training
- CPU-Only Systems: Reduce batch size and consider smaller model variants
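
For reference, device selection of this kind is commonly written as a short fallback chain in PyTorch. The sketch below is an assumption about how it could look rather than the project's own code; `pick_device` is a hypothetical name, and the function falls back to `"cpu"` if PyTorch is not installed.

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu", preferring the fastest available backend."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to accelerate.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```

The returned string can be passed to `torch.device(...)` when moving the model and tensors.
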
Application Optimization:
- Caching: Streamlit caching for expensive computations
- Batch Processing: Group operations for better resource utilization
- Memory Management: Automatic cleanup of temporary data
We welcome contributions from the community! Whether you're fixing bugs, adding new features, or improving documentation, your efforts are appreciated.
# Fork and clone the repository
git clone https://github.com/yourusername/ku-id-verifier.git
cd ku-id-verifier
# Create development environment
python3 -m venv venv-dev
source venv-dev/bin/activate
# Install development dependencies
pip install -r requirements.txt
pip install pytest black flake8 mypy pre-commit
# Set up pre-commit hooks
pre-commit install

- Fork the Repository: Create your personal fork of the main repository
- Create a Feature Branch: Use descriptive branch names (feature/amazing-feature)
- Follow Code Standards: Adhere to existing code style and documentation practices
- Test Thoroughly: Ensure all functionality works correctly
- Submit Pull Request: Provide clear description of changes and testing performed
MIT License - see LICENSE file for details.
- ✅ Commercial use and distribution
- ✅ Modification and adaptation
- ✅ Private use and internal deployment
- ✅ Patent use under license terms
- ❌ No liability or warranty provided
- ❌ No trademark grant included
- ❌ Compliance with license terms required
This project builds upon the work of many amazing open-source projects and research efforts:
- EfficientNet architecture by Google Research for efficient visual recognition
- EasyOCR by JaidedAI for robust text detection and extraction
- Tesseract OCR, originally developed at Hewlett-Packard and later sponsored by Google, for reliable character recognition
- Streamlit for the revolutionary web application framework
- PyTorch community for the comprehensive deep learning ecosystem
- Research Contributors across computer vision and document analysis fields