DigitalTouchCode/qs_sales_forecast


QS Sales Forecast (work in progress)



Overview

QS Sales Forecast is a machine learning microservice that forecasts sales for the Quicknote SaaS platform. It analyzes historical transaction data and predicts future product sales for three horizons: next week, this month, and next month.

Built with FastAPI and scikit-learn, this microservice provides RESTful endpoints for sales prediction, model training, and system monitoring. It features automated model retraining, real-time predictions, and comprehensive error handling.


Features

Core Functionality

  • Multi-tenant Support: Handle forecasts for multiple tenants and products
  • Time-based Predictions: Support for next_week, this_month, and next_month forecasts
  • Automated Retraining: Background polling for continuous model improvement
  • Feature Engineering: Automatic generation of lag features, rolling averages, and date-based features
  • Model Persistence: Automatic saving and loading of trained models
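
The three forecast targets can be read as calendar date ranges. The sketch below is one plausible interpretation, not the service's confirmed behaviour. It assumes next_week means Monday through Sunday of the following week, this_month means today through the last day of the current month, and next_month means every day of the following month. The names forecast_dates and _first_of_next_month are hypothetical.

```python
from datetime import date, timedelta


def _first_of_next_month(d: date) -> date:
    # Roll over to January of the following year after December.
    if d.month == 12:
        return date(d.year + 1, 1, 1)
    return date(d.year, d.month + 1, 1)


def forecast_dates(target: str, today: date) -> list[date]:
    """Return the dates covered by a forecast target (assumed semantics)."""
    if target == "next_week":
        # Assumption: the window is Monday..Sunday of the following week.
        start = today + timedelta(days=7 - today.weekday())
        end = start + timedelta(days=6)
    elif target == "this_month":
        # Assumption: from today to the last day of the current month.
        start = today
        end = _first_of_next_month(today) - timedelta(days=1)
    elif target == "next_month":
        # Assumption: every day of the following calendar month.
        start = _first_of_next_month(today)
        end = _first_of_next_month(start) - timedelta(days=1)
    else:
        raise ValueError(f"unknown target: {target!r}")
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]
```

Under these assumptions, a request made on Wednesday 2024-01-17 for next_week would cover 2024-01-22 through 2024-01-28.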

Technical Features

  • FastAPI Framework: High-performance async API with automatic documentation
  • Machine Learning: SGDRegressor with StandardScaler for scalable predictions
  • CORS Support: Configurable cross-origin resource sharing
  • Error Handling: Comprehensive error responses and logging
  • Health Monitoring: Built-in status endpoint for system health checks

Model Features

  • Lag Features: Sales data from previous periods (1, 7 days)
  • Rolling Averages: 7-day rolling mean calculations
  • Date Features: Day of week, month, and day of month extraction
  • Incremental Learning: Models update with new data without full retraining
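
The lag, rolling, and date features listed above could be computed with pandas roughly as follows. This is a minimal sketch, not the contents of features.py: the function name build_features and the output column names (lag_1, lag_7, rolling_mean_7, dayofweek, month, day) are assumptions, and it assumes one row per day for a single tenant/product pair. Shifting by one day before taking the rolling mean is also an assumption, made so the current day's sales do not leak into its own features.

```python
import pandas as pd


def build_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add lag, rolling, and date features to one tenant/product series."""
    out = df.copy()
    out["date"] = pd.to_datetime(out["date"])
    out = out.sort_values("date").reset_index(drop=True)
    # Lag features: units sold 1 and 7 rows (days) earlier.
    out["lag_1"] = out["units_sold"].shift(1)
    out["lag_7"] = out["units_sold"].shift(7)
    # 7-day rolling mean over the previous days only (current day excluded).
    out["rolling_mean_7"] = out["units_sold"].shift(1).rolling(7).mean()
    # Calendar features.
    out["dayofweek"] = out["date"].dt.dayofweek
    out["month"] = out["date"].dt.month
    out["day"] = out["date"].dt.day
    # Rows without a full 7-day history have NaN features and are dropped.
    return out.dropna().reset_index(drop=True)
```

With this layout the first 7 rows of each series are discarded, which is one reason a minimum number of rows per tenant/product is needed before training.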

Quick Start

Prerequisites

  • Python 3.12+
  • pip or poetry
  • Git

Installation

  1. Clone the repository

    git clone https://github.com/digitaltouchcode/qs_sales_forecast.git
    cd qs_sales_forecast
  2. Set up virtual environment

    python -m venv env
    source env/bin/activate  # On Windows: env\Scripts\activate
  3. Install dependencies

    pip install -r requirements.txt
  4. Run the server

    cd app
    uvicorn main:app --reload

The API will be available at http://localhost:8000

Docker Setup

  1. Build and run with Docker Compose

    docker-compose up --build
  2. Access the API

    • API: http://localhost:8000
    • Documentation: http://localhost:8000/docs

API Documentation

Base URL

http://localhost:8000

Endpoints

Predict Sales

GET /predict

Parameters:

  • tenant_id (string): Tenant identifier
  • product (string): Product name
  • target (string): Forecast period (next_week, this_month, next_month)

Example:

curl "http://localhost:8000/predict?tenant_id=tenant_a&product=apples&target=next_week"

Response:

{
  "tenant_id": "tenant_a",
  "product": "apples",
  "target": "next_week",
  "forecast": [
    {"date": "2024-01-22", "predicted_units": 25.5},
    {"date": "2024-01-23", "predicted_units": 27.2},
    ...
  ]
}
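
The same request can be made from Python using only the standard library. The sketch below assumes the service is running at the base URL shown above and that the response has the shape shown in the example; the helper names predict_url, fetch_forecast, and total_units are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"


def predict_url(tenant_id: str, product: str, target: str) -> str:
    """Build the /predict URL with properly encoded query parameters."""
    query = urlencode({"tenant_id": tenant_id, "product": product, "target": target})
    return f"{BASE_URL}/predict?{query}"


def fetch_forecast(tenant_id: str, product: str, target: str) -> list[dict]:
    """Call the running service and return the 'forecast' list."""
    with urlopen(predict_url(tenant_id, product, target), timeout=10) as resp:
        return json.load(resp)["forecast"]


def total_units(forecast: list[dict]) -> float:
    """Sum predicted_units over all forecast days."""
    return sum(day["predicted_units"] for day in forecast)
```

For example, total_units(fetch_forecast("tenant_a", "apples", "next_week")) would give the predicted weekly total once the server is up. Using urlencode keeps product names with spaces or special characters valid in the query string.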

Train Models

POST /train

Description: Force retraining of all models with current data

Response:

{
  "status": "ok",
  "message": "Training completed."
}

System Status

GET /status

Response:

{
  "models": [
    {
      "tenant_id": "tenant_a",
      "product": "apples",
      "trained": true,
      "rows_seen": 1500
    }
  ]
}
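
A client can use this payload to check whether a model exists before requesting a forecast. The helper below is a hypothetical sketch that assumes the response shape shown above.

```python
def is_trained(status: dict, tenant_id: str, product: str) -> bool:
    """Return True if /status reports a trained model for the pair."""
    for model in status.get("models", []):
        if model["tenant_id"] == tenant_id and model["product"] == product:
            return bool(model.get("trained"))
    # No entry at all means no model has been created for this pair.
    return False
```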

Architecture

Project Structure

qs_sales_forecast/
├── app/
│   ├── main.py                 # FastAPI application entry point
│   ├── services/
│   │   ├── model_store.py      # Model management and predictions
│   │   ├── features.py         # Feature engineering
│   │   └── poller.py           # Background data polling
│   ├── data/
│   │   └── sales_data.csv      # Sample training data
│   ├── saved_models/           # Trained model storage
│   └── tests/                  # Comprehensive test suite
├── docker-compose.yml          # Docker configuration
├── Dockerfile                  # Container build file
├── requirements.txt            # Python dependencies
└── README.md                   # This file

Data Flow

  1. Data Ingestion: CSV files are monitored for new sales data
  2. Feature Engineering: Automatic creation of time-based and lag features
  3. Model Training: Incremental learning with SGDRegressor
  4. Prediction: Real-time forecasting via REST API
  5. Model Persistence: Automatic saving of trained models
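
Step 1 of the flow above can be approximated with a simple polling loop. This is a sketch of the general technique, not the code in poller.py: detecting changes through the file's modification time is an assumption, and the names poll_file, on_change, and max_checks are hypothetical (max_checks exists only so the loop can be stopped in examples and tests).

```python
import os
import time
from typing import Callable, Optional


def poll_file(path: str, on_change: Callable[[str], None],
              interval: float = 30.0, max_checks: Optional[int] = None) -> None:
    """Call on_change(path) whenever the file's modification time changes."""
    last_mtime = None
    checks = 0
    while max_checks is None or checks < max_checks:
        try:
            mtime = os.stat(path).st_mtime_ns
        except FileNotFoundError:
            mtime = None  # File not there yet; keep waiting.
        if mtime is not None and mtime != last_mtime:
            last_mtime = mtime
            on_change(path)
        checks += 1
        # Sleep only if another check will follow.
        if max_checks is None or checks < max_checks:
            time.sleep(interval)
```

In a FastAPI application a loop like this would typically run in a background thread or task so that it does not block request handling.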

Model Details

  • Algorithm: SGDRegressor (Stochastic Gradient Descent)
  • Features: 6 engineered features: two lag variables (1 and 7 days), a 7-day rolling mean, and three date components (day of week, month, day of month)
  • Scaling: StandardScaler for feature normalization
  • Training: Incremental learning with partial_fit for efficiency
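
The combination of StandardScaler and SGDRegressor with partial_fit can be sketched as below. This is an illustration of incremental learning with scikit-learn, not the code in model_store.py: the class name IncrementalModel is hypothetical, and the regressor uses library defaults apart from a fixed random_state.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler


class IncrementalModel:
    """Scaler plus SGD regressor that can be updated batch by batch."""

    def __init__(self) -> None:
        self.scaler = StandardScaler()
        self.model = SGDRegressor(random_state=0)
        self.rows_seen = 0

    def update(self, X: np.ndarray, y: np.ndarray) -> None:
        # Update the running mean/variance, then take one SGD pass on the batch.
        self.scaler.partial_fit(X)
        self.model.partial_fit(self.scaler.transform(X), y)
        self.rows_seen += len(X)

    def predict(self, X: np.ndarray) -> np.ndarray:
        # Scale with the statistics accumulated so far, then predict.
        return self.model.predict(self.scaler.transform(X))
```

One trade-off of this design is that the scaler's statistics shift as new batches arrive, so early batches were scaled slightly differently from later ones. That is usually acceptable when batches come from a similar distribution.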

Development

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=app --cov-report=html

# Run specific test file
pytest app/tests/test_model_store.py

Code Quality

# Format code
black app/

# Sort imports
isort app/

# Lint code
flake8 app/

Test Coverage

  • Tests: 39 test cases
  • Coverage Areas: API endpoints, model training, feature engineering, error handling

Configuration

Environment Variables

  • No environment variables required for basic setup
  • Models are stored in app/saved_models/
  • Data files expected in app/data/

Model Parameters

  • Polling Interval: 30 seconds (configurable in poller.py)
  • Minimum Training Data: 10 rows per tenant/product
  • Lag Periods: 1 and 7 days
  • Rolling Window: 7 days
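
The minimum-data rule can be applied before training by counting rows per tenant/product pair. The sketch below assumes rows have already been read into dictionaries (for example with csv.DictReader); the names MIN_TRAINING_ROWS and trainable_pairs are hypothetical.

```python
from collections import Counter

MIN_TRAINING_ROWS = 10  # Mirrors the "Minimum Training Data" value above.


def trainable_pairs(rows: list[dict]) -> list[tuple[str, str]]:
    """Return (tenant_id, product) pairs with enough rows to train on."""
    counts = Counter((r["tenant_id"], r["product"]) for r in rows)
    return sorted(pair for pair, n in counts.items() if n >= MIN_TRAINING_ROWS)
```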

Deployment

Production Deployment

  1. Environment Setup

    export PYTHONPATH=/path/to/app
    cd app
  2. Run with Gunicorn

    gunicorn main:app -w 4 -k uvicorn.workers.UvicornWorker
  3. Docker Production

    docker build -t qs-sales-forecast .
    docker run -p 8000:8000 qs-sales-forecast

Monitoring

  • Health checks available at /status
  • Application logs include model training and prediction metrics
  • Error logging for debugging and monitoring

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Guidelines

  • Follow PEP 8 style guidelines
  • Add tests for new features
  • Update documentation for API changes

Performance

Benchmarks

  • API Response Time: <50ms for predictions
  • Model Training: <1s for typical datasets
  • Memory Usage: <100MB for loaded models
  • Concurrent Requests: Supports 100+ concurrent predictions

Scalability

  • Horizontal scaling via containerization
  • Model caching for improved performance
  • Incremental learning reduces training overhead
  • Efficient feature engineering with pandas optimizations

Troubleshooting

Common Issues

Models not loading?

  • Check that app/saved_models/ contains .joblib files
  • Verify model filenames follow tenant__product.joblib format
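
A quick way to verify the filename format is to split each name on the double underscore. The helper below is a hypothetical sketch based only on the tenant__product.joblib pattern stated above.

```python
from pathlib import Path


def parse_model_filename(path: str) -> tuple[str, str]:
    """Split 'tenant__product.joblib' into (tenant_id, product)."""
    p = Path(path)
    if p.suffix != ".joblib":
        raise ValueError(f"not a .joblib file: {p.name}")
    # Split on the first double underscore only.
    tenant_id, sep, product = p.stem.partition("__")
    if not sep or not tenant_id or not product:
        raise ValueError(f"expected tenant__product.joblib, got: {p.name}")
    return tenant_id, product
```

Note that a tenant identifier that itself contains a double underscore would be ambiguous under this scheme.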

Predictions returning 404?

  • Ensure models are trained for the specific tenant/product combination
  • Check /status endpoint for available models

Training errors?

  • Verify CSV data has required columns: tenant_id, product, date, units_sold
  • Ensure sufficient data (10+ rows) per tenant/product
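
The column check can be automated with the standard library. This is a minimal sketch; the names REQUIRED_COLUMNS and missing_columns are hypothetical, and only the header row is inspected.

```python
import csv
import io

REQUIRED_COLUMNS = {"tenant_id", "product", "date", "units_sold"}


def missing_columns(csv_text: str) -> set[str]:
    """Return required columns absent from the CSV header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # fieldnames is None for empty input, so fall back to an empty list.
    return REQUIRED_COLUMNS - set(reader.fieldnames or [])
```

An empty result means the header is complete; anything else names the columns to add before retraining.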

Logs

Application logs include detailed information about:

  • Model loading and training
  • API requests and responses
  • Error conditions and stack traces

Built for the Quicknote SaaS platform
