QS Sales Forecast (work in progress)

Overview

QS Sales Forecast is a machine learning microservice designed to provide accurate sales forecasting capabilities for the Quicknote SaaS platform. The service analyzes historical transaction data and predicts future product sales across different time horizons.

Built with FastAPI and scikit-learn, this microservice provides RESTful endpoints for sales prediction, model training, and system monitoring. It features automated model retraining, real-time predictions, and comprehensive error handling.

Features

Core Functionality

Multi-tenant Support: Handle forecasts for multiple tenants and products
Time-based Predictions: Support for next_week, this_month, and next_month forecasts
Automated Retraining: Background polling for continuous model improvement
Feature Engineering: Automatic generation of lag features, rolling averages, and date-based features
Model Persistence: Automatic saving and loading of trained models

Technical Features

FastAPI Framework: High-performance async API with automatic documentation
Machine Learning: SGDRegressor with StandardScaler for scalable predictions
CORS Support: Configurable cross-origin resource sharing
Error Handling: Comprehensive error responses and logging
Health Monitoring: Built-in status endpoint for system health checks

Model Features

Lag Features: Units sold 1 day and 7 days earlier
Rolling Averages: 7-day rolling mean calculations
Date Features: Day of week, month, and day of month extraction
Incremental Learning: Models update with new data without full retraining
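The feature list above can be sketched with pandas. This is a minimal illustration, not the code in features.py: the function name `add_features` and the output column names are hypothetical, and computing the rolling mean over past days only (by shifting first) is an assumption.

```python
import pandas as pd


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add lag, rolling-mean, and date features to one tenant/product series.

    Expects one row per day with columns `date` and `units_sold`.
    Column names for the new features are hypothetical.
    """
    out = df.copy()
    out["date"] = pd.to_datetime(out["date"])
    out = out.sort_values("date").reset_index(drop=True)

    # Lag features: units sold 1 and 7 days earlier.
    out["lag_1"] = out["units_sold"].shift(1)
    out["lag_7"] = out["units_sold"].shift(7)

    # 7-day rolling mean over past days only (assumption: shift first so
    # the current day's value does not leak into its own feature).
    out["rolling_mean_7"] = out["units_sold"].shift(1).rolling(window=7).mean()

    # Date features.
    out["day_of_week"] = out["date"].dt.dayofweek  # Monday = 0
    out["month"] = out["date"].dt.month
    out["day_of_month"] = out["date"].dt.day

    # The first 7 rows have no complete history, so they are dropped.
    return out.dropna().reset_index(drop=True)
```

With 10 days of input, this sketch returns 3 usable rows, which is one reason a minimum amount of history per tenant/product is needed before training.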

Quick Start

Prerequisites

Python 3.12+
pip or poetry
Git

Installation

Clone the repository

```
git clone https://github.com/digitaltouchcode/qs_sales_forecast.git
cd qs_sales_forecast
```

Set up virtual environment

```
python -m venv env
source env/bin/activate  # On Windows: env\Scripts\activate
```

Install dependencies
```
pip install -r requirements.txt
```
Run the server
```
cd app
uvicorn main:app --reload
```

The API will be available at http://localhost:8000

Docker Setup

Build and run with Docker Compose
```
docker-compose up --build
```
Access the API
- API: http://localhost:8000
- Documentation: http://localhost:8000/docs

API Documentation

Base URL

http://localhost:8000

Endpoints

Predict Sales

GET /predict

Parameters:

tenant_id (string): Tenant identifier
product (string): Product name
target (string): Forecast period (next_week, this_month, next_month)

Example:

```
curl "http://localhost:8000/predict?tenant_id=tenant_a&product=apples&target=next_week"
```

Response:

```
{
  "tenant_id": "tenant_a",
  "product": "apples",
  "target": "next_week",
  "forecast": [
    {"date": "2024-01-22", "predicted_units": 25.5},
    {"date": "2024-01-23", "predicted_units": 27.2},
    ...
  ]
}
```
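The same request can be made from Python using only the standard library. This is a sketch of a possible client, not part of the service: `build_predict_url`, `fetch_forecast`, and `total_units` are hypothetical helper names, and the response shape is taken from the example above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"
VALID_TARGETS = {"next_week", "this_month", "next_month"}


def build_predict_url(tenant_id: str, product: str, target: str,
                      base_url: str = BASE_URL) -> str:
    """Build the GET /predict URL, rejecting unknown forecast periods early."""
    if target not in VALID_TARGETS:
        raise ValueError(f"target must be one of {sorted(VALID_TARGETS)}")
    query = urlencode({"tenant_id": tenant_id, "product": product, "target": target})
    return f"{base_url}/predict?{query}"


def fetch_forecast(tenant_id: str, product: str, target: str) -> dict:
    """Call the running service and return the decoded JSON payload."""
    with urlopen(build_predict_url(tenant_id, product, target), timeout=10) as resp:
        return json.load(resp)


def total_units(payload: dict) -> float:
    """Sum predicted units over every day in the forecast."""
    return sum(point["predicted_units"] for point in payload["forecast"])
```

`fetch_forecast` needs the server to be running locally; the other two helpers work offline.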

Train Models

POST /train

Description: Force retraining of all models with current data

Response:

```
{
  "status": "ok",
  "message": "Training completed."
}
```

System Status

GET /status

Response:

```
{
  "models": [
    {
      "tenant_id": "tenant_a",
      "product": "apples",
      "trained": true,
      "rows_seen": 1500
    }
  ]
}
```

Architecture

Project Structure

```
qs_sales_forecast/
├── app/
│   ├── main.py                 # FastAPI application entry point
│   ├── services/
│   │   ├── model_store.py      # Model management and predictions
│   │   ├── features.py         # Feature engineering
│   │   └── poller.py           # Background data polling
│   ├── data/
│   │   └── sales_data.csv      # Sample training data
│   ├── saved_models/           # Trained model storage
│   └── tests/                  # Comprehensive test suite
├── docker-compose.yml          # Docker configuration
├── Dockerfile                  # Container build file
├── requirements.txt            # Python dependencies
└── README.md                   # This file
```

Data Flow

Data Ingestion: CSV files are monitored for new sales data
Feature Engineering: Automatic creation of time-based and lag features
Model Training: Incremental learning with SGDRegressor
Prediction: Real-time forecasting via REST API
Model Persistence: Automatic saving of trained models
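One way the ingestion step could pick up only new data is to remember how many rows have already been processed and read past that offset on each poll. The sketch below is an assumption about the approach, not the logic in poller.py; `read_new_rows` is a hypothetical helper, and the `rows_seen` counter is borrowed from the /status response.

```python
import csv
from pathlib import Path


def read_new_rows(csv_path: Path, rows_seen: int) -> list[dict]:
    """Return the data rows that follow the first `rows_seen` rows.

    Assumes the CSV is append-only, so earlier rows never change.
    """
    with csv_path.open(newline="") as fh:
        rows = list(csv.DictReader(fh))
    return rows[rows_seen:]
```

A background task would call this on every polling interval, pass any new rows through feature engineering, and then advance `rows_seen` by the number of rows returned.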

Model Details

Algorithm: SGDRegressor (Stochastic Gradient Descent)
Features: 6 engineered features: two lags (1 and 7 days), a 7-day rolling mean, and three date components (day of week, month, day of month)
Scaling: StandardScaler for feature normalization
Training: Incremental learning with partial_fit for efficiency
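The combination described above can be sketched with scikit-learn as follows. This is an illustration under assumptions, not the code in model_store.py: the class name `IncrementalForecaster` is hypothetical, default SGDRegressor hyperparameters are used, and the scaler is updated incrementally alongside the model.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler


class IncrementalForecaster:
    """Scaler plus SGD regressor, both updated batch by batch."""

    def __init__(self) -> None:
        self.scaler = StandardScaler()
        self.model = SGDRegressor(random_state=0)
        self.rows_seen = 0

    def update(self, X: np.ndarray, y: np.ndarray) -> None:
        # Refresh the running mean/variance, then take SGD steps on the
        # newly scaled batch. No earlier data needs to be kept around.
        self.scaler.partial_fit(X)
        self.model.partial_fit(self.scaler.transform(X), y)
        self.rows_seen += len(X)

    def predict(self, X: np.ndarray) -> np.ndarray:
        return self.model.predict(self.scaler.transform(X))
```

Because both `StandardScaler` and `SGDRegressor` expose `partial_fit`, each new batch costs time proportional to its own size rather than to the full history. Scaling matters here because SGD is sensitive to features on very different ranges, such as raw unit counts next to a 0 to 6 day-of-week value.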

Development

Running Tests

```
# Run all tests
pytest

# Run with coverage
pytest --cov=app --cov-report=html

# Run specific test file
pytest app/tests/test_model_store.py
```

Code Quality

```
# Format code
black app/

# Sort imports
isort app/

# Lint code
flake8 app/
```

Test Coverage

Tests: 39 test cases
Coverage Areas: API endpoints, model training, feature engineering, error handling

Configuration

Environment Variables

No environment variables required for basic setup
Models are stored in app/saved_models/
Data files expected in app/data/

Model Parameters

Polling Interval: 30 seconds (configurable in poller.py)
Minimum Training Data: 10 rows per tenant/product
Lag Periods: 1 and 7 days
Rolling Window: 7 days

Deployment

Production Deployment

Environment Setup
```
export PYTHONPATH=/path/to/app
cd app
```

Run with Gunicorn

```
gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker
```

Docker Production

```
docker build -t qs-sales-forecast .
docker run -p 8000:8000 qs-sales-forecast
```

Monitoring

Health checks available at /status
Application logs include model training and prediction metrics
Error logging for debugging and monitoring

Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Development Guidelines

Follow PEP 8 style guidelines
Add tests for new features
Update documentation for API changes

Performance

Benchmarks

API Response Time: <50ms for predictions
Model Training: <1s for typical datasets
Memory Usage: <100MB for loaded models
Concurrent Requests: Supports 100+ concurrent predictions

Scalability

Horizontal scaling via containerization
Model caching for improved performance
Incremental learning reduces training overhead
Efficient feature engineering with pandas optimizations

Troubleshooting

Common Issues

Models not loading?

Check that app/saved_models/ contains .joblib files
Verify model filenames follow tenant__product.joblib format
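A quick way to check filenames against the tenant__product.joblib format is a small parser like the one below. `parse_model_filename` is a hypothetical helper written for this check, not a function from the service.

```python
from pathlib import Path


def parse_model_filename(path: Path) -> tuple[str, str]:
    """Split `tenant__product.joblib` into (tenant_id, product).

    Raises ValueError if the extension or the double-underscore
    separator is missing, or if either part is empty.
    """
    if path.suffix != ".joblib":
        raise ValueError(f"{path.name}: expected a .joblib file")
    tenant_id, sep, product = path.stem.partition("__")
    if not sep or not tenant_id or not product:
        raise ValueError(f"{path.name}: expected tenant__product.joblib")
    return tenant_id, product
```

Running it over `Path("app/saved_models").glob("*.joblib")` surfaces any file that does not follow the format. The split happens at the first double underscore, so a tenant ID that itself contains `__` would be parsed incorrectly.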

Predictions returning 404?

Ensure models are trained for the specific tenant/product combination
Check /status endpoint for available models

Training errors?

Verify CSV data has required columns: tenant_id, product, date, units_sold
Ensure sufficient data (10+ rows) per tenant/product
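Both checks can be run before training with a short standard-library script. This is a sketch: `check_training_csv` is a hypothetical helper, and the threshold of 10 rows is taken from the Model Parameters section.

```python
import csv
from collections import Counter
from pathlib import Path

REQUIRED_COLUMNS = {"tenant_id", "product", "date", "units_sold"}
MIN_ROWS = 10  # minimum rows per tenant/product


def check_training_csv(csv_path: Path) -> list[str]:
    """Return a list of human-readable problems; empty means the file looks OK."""
    with csv_path.open(newline="") as fh:
        reader = csv.DictReader(fh)
        missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
        if missing:
            return [f"missing columns: {sorted(missing)}"]
        # Count rows for each tenant/product pair.
        counts = Counter((row["tenant_id"], row["product"]) for row in reader)

    return [
        f"{tenant}/{product}: only {n} rows, need {MIN_ROWS}+"
        for (tenant, product), n in sorted(counts.items())
        if n < MIN_ROWS
    ]
```

For example, `check_training_csv(Path("app/data/sales_data.csv"))` returns an empty list when every tenant/product pair has enough rows and all required columns are present.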

Logs

Application logs include detailed information about:

Model loading and training
API requests and responses
Error conditions and stack traces

Built for the Quicknote SaaS platform
