A production-grade NLP system that classifies product listings into 19 categories using a fine-tuned DistilBERT model, with real-time data drift detection, automated monitoring, and an interactive React analytics dashboard. Built the way ML systems actually run in production.
E-commerce platforms with millions of SKUs rely on accurate product categorization for search ranking, recommendations, and inventory management. Manual tagging doesn't scale. A misclassified product is effectively invisible to buyers searching the right category.
This system automates product classification at roughly 42 ms per request and, crucially, detects when incoming product data starts drifting from the training distribution before accuracy silently degrades in production.
Fine-tuned on 50,000 Amazon product samples across 19 categories.
| Metric | Score |
|---|---|
| Accuracy | 68.3% |
| F1 Score (Macro) | 0.641 |
| Precision (Macro) | 0.643 |
| Recall (Macro) | 0.683 |
Context: a macro F1 of 0.641 across 19 heavily imbalanced categories is the honest aggregate. The model excels in high-signal categories and struggles in semantically overlapping ones, a known challenge in multi-class product taxonomies. See the per-category breakdown below.
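To make the gap between accuracy and a macro-averaged score concrete, here is a minimal sketch that computes both by hand. The labels are made up for illustration and are not the project's data; the function names are likewise hypothetical.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1}, computed one-vs-rest for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted label gains a false positive
            fn[t] += 1  # true label gains a false negative
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy, imbalanced example (illustrative only): 8 "Books", 2 "Toys".
y_true = ["Books"] * 8 + ["Toys"] * 2
y_pred = ["Books"] * 8 + ["Books", "Toys"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

Because the macro average weights the small class as heavily as the large one, a single mistake on "Toys" pulls macro F1 well below accuracy in this toy case.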
Top-Performing Categories:
| Category | F1 Score |
|---|---|
| Digital Music | 97.2% |
| Amazon Fashion | 85.8% |
| Musical Instruments | 84.5% |
| Automotive | 84.1% |
Hardest Categories (semantically overlapping):
These categories share vocabulary (e.g., "Electronics" vs. "Computers", "Toys" vs. "Baby Products"), a known challenge in flat multi-class taxonomies that hierarchical classification would address.
                    ┌─────────┐
                    │  User   │
                    └────┬────┘
                         │ HTTP POST
                         ▼
                ┌──────────────────┐
                │ FastAPI Backend  │
                └───┬──────────┬───┘
          Inference │          │ Log
                    │          │
            ┌───────▼──┐ ┌─────▼─────────────────┐
            │DistilBERT│ │ SQLite Predictions DB │
            │Classifier│ └──────────┬────────────┘
            └────┬─────┘            │ Metrics
            Sync │                  ▼
                 │      ┌───────────────────────┐
                 │      │  Performance Tracker  │
                 ▼      └───────────┬───────────┘
    ┌─────────────────────────┐     │ Real-time Data
    │      Evidently AI       │     │
    │     Drift Detector      │     │
    └────────────┬────────────┘     │
          Report │                  │
                 ▼                  ▼
            ┌────────────────────────────┐
            │      React Dashboard       │
            └────────────────────────────┘
Flow:
- Product listing hits the FastAPI endpoint
- DistilBERT classifier runs inference (~42ms)
- Prediction + metadata logged to SQLite
- Evidently AI monitors incoming text distribution via PSI
- Performance Tracker computes rolling accuracy metrics
- React Dashboard visualizes everything in real time
- Async inference: single and batch prediction endpoints
- Pydantic v2 validation: strict input/output schema enforcement
- Automated logging: every request logged with latency, confidence score, and prediction
- Data drift detection: Evidently AI monitors text distribution shifts using Population Stability Index (PSI)
- Performance tracking: SQLite-backed persistent logging with rolling metric calculations
- A/B testing infrastructure: Chi-square framework for comparing model versions before promoting to production
- Real-time accuracy trends and confidence distribution charts
- Inference latency monitoring
- Interactive model playground for live predictions
- Drift status indicators and system health view
- Per-category performance breakdown
- Python 3.11+
- Node.js 20+
- Docker (optional)
git clone https://github.com/Emart29/ecommerce-product-classifier.git
cd ecommerce-product-classifier

python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt

mkdir -p data/monitoring/drift_reports

python -m src.api.main

API live at http://localhost:8000 · Swagger UI at /docs

cd frontend
npm install
npm run dev

Dashboard live at http://localhost:5173
docker-compose up --build

Request:
{
  "title": "Acoustic Guitar Starter Pack with Tuner and Bag"
}

Response:
{
  "category": "Musical Instruments",
  "confidence": 0.921,
  "latency_ms": 42.3
}

The batch prediction endpoint accepts an array of product objects. Full interactive docs at /docs.
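For callers outside the dashboard, a small standard-library client might look like the sketch below. The route `/predict` is an assumption (the actual path is listed in the Swagger UI at /docs), and the `Prediction` dataclass and helper names are hypothetical; only the JSON field names come from the request and response shown above.

```python
import json
from dataclasses import dataclass
from urllib import request

# Assumed route; confirm the real path in the Swagger UI at /docs.
API_URL = "http://localhost:8000/predict"


@dataclass
class Prediction:
    category: str
    confidence: float
    latency_ms: float


def build_request(title: str, url: str = API_URL) -> request.Request:
    """Build a JSON POST request carrying a single product title."""
    body = json.dumps({"title": title}).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_response(raw: bytes) -> Prediction:
    """Turn the API's JSON response body into a typed object."""
    payload = json.loads(raw)
    return Prediction(
        category=payload["category"],
        confidence=float(payload["confidence"]),
        latency_ms=float(payload["latency_ms"]),
    )


def classify(title: str, url: str = API_URL, timeout: float = 5.0) -> Prediction:
    """Send one title to the running API and return its prediction."""
    with request.urlopen(build_request(title, url), timeout=timeout) as resp:
        return parse_response(resp.read())
```

With the API running locally, `classify("Acoustic Guitar Starter Pack with Tuner and Bag")` would return a `Prediction` built from the server's response.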
# Full test suite
PYTHONPATH=. pytest tests/ -v
# Monitoring-specific tests
PYTHONPATH=. pytest tests/test_monitoring.py -v

ecommerce-product-classifier/
├── src/
│   └── api/                # FastAPI routes, schemas, inference
├── frontend/               # React analytics dashboard
├── models/                 # Fine-tuned DistilBERT weights & config
├── data/
│   └── monitoring/
│       └── drift_reports/  # Evidently AI HTML drift reports
├── scripts/                # Data generation & evaluation utilities
├── tests/                  # Pytest test suite
├── .github/workflows/      # CI/CD pipeline (lint → test → Docker build)
├── docker-compose.yml
├── Dockerfile
├── params.yaml             # Configurable model & training params
└── requirements.txt

| Layer | Technologies |
|---|---|
| ML / NLP | PyTorch, HuggingFace Transformers, DistilBERT, Scikit-learn |
| Monitoring | Evidently AI (PSI drift detection) |
| Backend | FastAPI, Uvicorn, SQLAlchemy, Pydantic v2 |
| Frontend | React (Vite), Tailwind CSS v4, Chart.js, Lucide Icons |
| DevOps | Docker, Docker Compose, Pytest, GitHub Actions |
Every push triggers an automated pipeline:
Push / PR → Lint (black + flake8) → Tests (pytest) → Docker Build
The pipeline enforces code quality and validates that the container builds correctly before any merge, the same pattern used in production ML systems.
Why DistilBERT over a larger model? DistilBERT is reported to be about 40% smaller and 60% faster than BERT-base while retaining roughly 97% of its language-understanding performance. For a classification API serving real-time requests, latency matters more than marginal accuracy gains.
Why Evidently AI for drift detection? Most ML systems degrade silently: accuracy drops weeks before anyone notices. PSI-based drift detection catches distribution shifts early, giving teams time to retrain before users are impacted.
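To show what a PSI score measures, here is a hand-rolled sketch of the formula, PSI = Σ (cᵢ − rᵢ) · ln(cᵢ / rᵢ) over bins, where rᵢ and cᵢ are the reference and current bin proportions. This is not Evidently's implementation. The equal-width binning, the epsilon smoothing, and the choice of title word count as the monitored feature are all assumptions made for illustration.

```python
import math


def psi(reference, current, n_bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples.

    Bin edges are equal-width over the reference range; values outside
    that range fall into the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant data

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1
        total = len(values)
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / total, eps) for c in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


# Hypothetical feature: word counts of product titles.
reference_lengths = [4 + (i % 12) for i in range(240)]   # 4 to 15 words
shifted_lengths = [n + 8 for n in reference_lengths]     # much longer titles

print(f"PSI, same data:    {psi(reference_lengths, reference_lengths):.3f}")
print(f"PSI, shifted data: {psi(reference_lengths, shifted_lengths):.3f}")
```

Common rules of thumb treat PSI below about 0.1 as stable and above about 0.25 as significant drift; the threshold this project uses is not stated here, so treat those numbers as general guidance rather than its configuration.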
Why SQLite for monitoring logs? For a single-instance deployment, SQLite is zero-infrastructure, fast enough for rolling metrics, and trivially replaceable with PostgreSQL when scaling horizontally.
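As a rough illustration of SQLite-backed logging with rolling metrics, the sketch below uses only Python's built-in `sqlite3` module. The table schema, column names, and function names are hypothetical and may differ from the project's actual tables (which use SQLAlchemy).

```python
import sqlite3

# Hypothetical schema for illustration; not the project's actual table.
SCHEMA = """
CREATE TABLE IF NOT EXISTS predictions (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    title       TEXT NOT NULL,
    predicted   TEXT NOT NULL,
    actual      TEXT,            -- ground truth, filled in later if known
    confidence  REAL NOT NULL,
    latency_ms  REAL NOT NULL
)
"""


def log_prediction(conn, title, predicted, confidence, latency_ms, actual=None):
    """Persist one prediction with its confidence and latency."""
    conn.execute(
        "INSERT INTO predictions (title, predicted, actual, confidence, latency_ms) "
        "VALUES (?, ?, ?, ?, ?)",
        (title, predicted, actual, confidence, latency_ms),
    )
    conn.commit()


def rolling_metrics(conn, window=100):
    """Accuracy over labeled rows and mean latency for the last `window` rows."""
    row = conn.execute(
        """
        SELECT AVG(CASE WHEN actual IS NULL THEN NULL
                        WHEN actual = predicted THEN 1.0 ELSE 0.0 END),
               AVG(latency_ms),
               COUNT(*)
        FROM (SELECT * FROM predictions ORDER BY id DESC LIMIT ?)
        """,
        (window,),
    ).fetchone()
    # AVG skips NULLs, so unlabeled rows do not affect accuracy.
    return {"accuracy": row[0], "mean_latency_ms": row[1], "n": row[2]}


conn = sqlite3.connect(":memory:")  # use a file path for persistent logs
conn.execute(SCHEMA)
log_prediction(conn, "Acoustic Guitar", "Musical Instruments", 0.92, 40.0,
               actual="Musical Instruments")
log_prediction(conn, "USB-C Hub", "Electronics", 0.55, 44.0, actual="Computers")
log_prediction(conn, "Wooden Blocks", "Toys", 0.81, 42.0)  # no label yet
print(rolling_metrics(conn, window=10))
```

Because the query is plain SQL over a single table, moving it to PostgreSQL later would mostly mean swapping the connection and the `AUTOINCREMENT` syntax.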
- Hierarchical classification to resolve overlapping categories
- Model versioning with automatic promotion/rollback
- Streaming predictions for large batch jobs
- PostgreSQL migration for multi-instance deployment
- Alerting (Slack/email) on drift threshold breach
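Automatic promotion would presumably build on the chi-square comparison listed among the backend features. As a sketch of how such a gate could work, the code below runs a Pearson chi-square test on correct/incorrect counts for two model versions. The function names, the 0.05 significance level, and the promotion rule are assumptions, not the project's actual framework.

```python
import math


def chi_square_2x2(correct_a, total_a, correct_b, total_b):
    """Pearson chi-square test (no continuity correction) on a 2x2 table.

    Rows are model versions, columns are correct / incorrect counts.
    Returns (statistic, p_value) for one degree of freedom.
    """
    observed = [
        [correct_a, total_a - correct_a],
        [correct_b, total_b - correct_b],
    ]
    row_totals = [sum(r) for r in observed]
    col_totals = [sum(c) for c in zip(*observed)]
    grand = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand
            if expected == 0:
                continue  # empty column: contributes nothing to the statistic
            stat += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def should_promote(correct_a, total_a, correct_b, total_b, alpha=0.05):
    """Promote B only if it is more accurate and the gap is significant."""
    _, p = chi_square_2x2(correct_a, total_a, correct_b, total_b)
    return correct_b / total_b > correct_a / total_a and p < alpha


# Illustrative counts only: current model A vs. candidate B on 1,000 items each.
stat, p = chi_square_2x2(683, 1000, 740, 1000)
print(f"chi2 = {stat:.3f}, p = {p:.4f}, "
      f"promote = {should_promote(683, 1000, 740, 1000)}")
```

The test assumes both models were evaluated on independent samples; if both versions score the same items, a paired test such as McNemar's would be the more appropriate choice.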
Contributions are welcome. Please read CONTRIBUTING.md for guidelines on branch naming, code style, commit conventions, and the PR process.
Distributed under the MIT License. See LICENSE for details.
Built by Emmanuel Nwanguma
LinkedIn · GitHub · Email