QS Sales Forecast is a machine learning microservice designed to provide accurate sales forecasting capabilities for the Quicknote SaaS platform. The service analyzes historical transaction data and predicts future product sales across different time horizons.
Built with FastAPI and scikit-learn, this microservice provides RESTful endpoints for sales prediction, model training, and system monitoring. It features automated model retraining, real-time predictions, and comprehensive error handling.
- Multi-tenant Support: Handle forecasts for multiple tenants and products
- Time-based Predictions: Support for `next_week`, `this_month`, and `next_month` forecasts
- Automated Retraining: Background polling for continuous model improvement
- Feature Engineering: Automatic generation of lag features, rolling averages, and date-based features
- Model Persistence: Automatic saving and loading of trained models
- FastAPI Framework: High-performance async API with automatic documentation
- Machine Learning: SGDRegressor with StandardScaler for scalable predictions
- CORS Support: Configurable cross-origin resource sharing
- Error Handling: Comprehensive error responses and logging
- Health Monitoring: Built-in status endpoint for system health checks
- Lag Features: Sales data from previous periods (1, 7 days)
- Rolling Averages: 7-day rolling mean calculations
- Date Features: Day of week, month, and day of month extraction
- Incremental Learning: Models update with new data without full retraining
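As an illustrative sketch (the actual implementation lives in `app/services/features.py`), the lag, rolling-average, and date features above can be derived with pandas; the `units_sold` column name follows the CSV schema used by the service:

```python
import pandas as pd

def build_features(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative version of the lag, rolling, and date feature steps."""
    df = df.sort_values("date").copy()
    df["date"] = pd.to_datetime(df["date"])

    # Lag features: sales 1 and 7 periods back
    df["lag_1"] = df["units_sold"].shift(1)
    df["lag_7"] = df["units_sold"].shift(7)

    # 7-day rolling mean, shifted so it only looks at past data
    df["rolling_7"] = df["units_sold"].shift(1).rolling(window=7).mean()

    # Date-based features
    df["day_of_week"] = df["date"].dt.dayofweek
    df["month"] = df["date"].dt.month
    df["day_of_month"] = df["date"].dt.day

    # Rows without enough history cannot be used for training
    return df.dropna()
```

Shifting before the rolling mean keeps the features free of target leakage: every value a row sees was observed strictly before that row's date.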
- Python 3.12+
- pip or poetry
- Git
1. Clone the repository

   ```bash
   git clone https://github.com/digitaltouchcode/qs_sales_forecast.git
   cd qs_sales_forecast
   ```

2. Set up a virtual environment

   ```bash
   python -m venv env
   source env/bin/activate  # On Windows: env\Scripts\activate
   ```

3. Install dependencies

   ```bash
   pip install -r requirements.txt
   ```

4. Run the server

   ```bash
   cd app
   uvicorn main:app --reload
   ```

The API will be available at http://localhost:8000.
1. Build and run with Docker Compose

   ```bash
   docker-compose up --build
   ```

2. Access the API

   - API: http://localhost:8000
   - Documentation: http://localhost:8000/docs
`GET /predict`

Parameters:

- `tenant_id` (string): Tenant identifier
- `product` (string): Product name
- `target` (string): Forecast period (`next_week`, `this_month`, or `next_month`)

Example:

```bash
curl "http://localhost:8000/predict?tenant_id=tenant_a&product=apples&target=next_week"
```

Response:

```json
{
  "tenant_id": "tenant_a",
  "product": "apples",
  "target": "next_week",
  "forecast": [
    {"date": "2024-01-22", "predicted_units": 25.5},
    {"date": "2024-01-23", "predicted_units": 27.2},
    ...
  ]
}
```

`POST /train`

Description: Force retraining of all models with current data.
Response:

```json
{
  "status": "ok",
  "message": "Training completed."
}
```

`GET /status`

Response:

```json
{
  "models": [
    {
      "tenant_id": "tenant_a",
      "product": "apples",
      "trained": true,
      "rows_seen": 1500
    }
  ]
}
```

Project structure:

```
qs_sales_forecast/
├── app/
│   ├── main.py              # FastAPI application entry point
│   ├── services/
│   │   ├── model_store.py   # Model management and predictions
│   │   ├── features.py      # Feature engineering
│   │   └── poller.py        # Background data polling
│   ├── data/
│   │   └── sales_data.csv   # Sample training data
│   ├── saved_models/        # Trained model storage
│   └── tests/               # Comprehensive test suite
├── docker-compose.yml       # Docker configuration
├── Dockerfile               # Container build file
├── requirements.txt         # Python dependencies
└── README.md                # This file
```
- Data Ingestion: CSV files are monitored for new sales data
- Feature Engineering: Automatic creation of time-based and lag features
- Model Training: Incremental learning with SGDRegressor
- Prediction: Real-time forecasting via REST API
- Model Persistence: Automatic saving of trained models
- Algorithm: SGDRegressor (Stochastic Gradient Descent)
- Features: 6 engineered features including lag variables and date components
- Scaling: StandardScaler for feature normalization
- Training: Incremental learning with partial_fit for efficiency
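The incremental pattern can be sketched as follows; the feature matrix shape and hyperparameters are illustrative, not the service's actual configuration. Both `StandardScaler` and `SGDRegressor` expose `partial_fit`, which is what allows updates without full retraining:

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
model = SGDRegressor(random_state=42)

def update(X_batch: np.ndarray, y_batch: np.ndarray) -> None:
    """Fold a new batch of rows into the scaler and model incrementally."""
    scaler.partial_fit(X_batch)  # running mean/variance update
    model.partial_fit(scaler.transform(X_batch), y_batch)

def forecast(X: np.ndarray) -> np.ndarray:
    return model.predict(scaler.transform(X))
```

Because `partial_fit` processes each batch exactly once, training cost stays proportional to the new data rather than the full history.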
```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=app --cov-report=html

# Run a specific test file
pytest app/tests/test_model_store.py
```

```bash
# Format code
black app/

# Sort imports
isort app/

# Lint code
flake8 app/
```

- Tests: 39 test cases
- Coverage Areas: API endpoints, model training, feature engineering, error handling
- No environment variables required for basic setup
- Models are stored in `app/saved_models/`
- Data files expected in `app/data/`
- Polling Interval: 30 seconds (configurable in `poller.py`)
- Minimum Training Data: 10 rows per tenant/product
- Lag Periods: 1 and 7 days
- Rolling Window: 7 days
1. Environment Setup

   ```bash
   export PYTHONPATH=/path/to/app
   cd app
   ```

2. Run with Gunicorn

   ```bash
   gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker
   ```

3. Docker Production

   ```bash
   docker build -t qs-sales-forecast .
   docker run -p 8000:8000 qs-sales-forecast
   ```
- Health checks available at `/status`
- Application logs include model training and prediction metrics
- Error logging for debugging and monitoring
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- Follow PEP 8 style guidelines
- Add tests for new features
- Update documentation for API changes
- API Response Time: <50ms for predictions
- Model Training: <1s for typical datasets
- Memory Usage: <100MB for loaded models
- Concurrent Requests: Supports 100+ concurrent predictions
- Horizontal scaling via containerization
- Model caching for improved performance
- Incremental learning reduces training overhead
- Efficient feature engineering with pandas optimizations
Models not loading?
- Check that `app/saved_models/` contains `.joblib` files
- Verify model filenames follow the `tenant__product.joblib` format
Predictions returning 404?
- Ensure models are trained for the specific tenant/product combination
- Check the `/status` endpoint for available models
Training errors?
- Verify CSV data has the required columns: `tenant_id`, `product`, `date`, `units_sold`
- Ensure sufficient data (10+ rows) per tenant/product
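To check which models are actually on disk (and that they deserialize), a small joblib inspection helper like this can be handy. The function name is ours, but the `tenant__product.joblib` naming matches the convention above; the exact object each file contains depends on what `model_store.py` persists:

```python
from pathlib import Path

import joblib

def list_saved_models(models_dir: str = "app/saved_models") -> list[tuple[str, str, str]]:
    """Load each saved model and report (tenant, product, object type)."""
    found = []
    for path in sorted(Path(models_dir).glob("*.joblib")):
        # Split on the double underscore separating tenant from product
        tenant, _, product = path.stem.partition("__")
        obj = joblib.load(path)  # persisted structure depends on model_store
        found.append((tenant, product, type(obj).__name__))
    return found
```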
Application logs include detailed information about:
- Model loading and training
- API requests and responses
- Error conditions and stack traces
Built for the Quicknote SaaS platform