Mignon22/industrial-pc-sql-feature-ml

INDUSTRIAL-PC-SQL-FEATURE-ML

Order-to-delivery demand analysis project using SQL-based feature engineering and baseline modeling.

Project Overview

This repository contains an end-to-end workflow:

  1. Generate synthetic, structured data and load it into a SQLite database
  2. Build features via SQL aggregations
  3. Train a baseline model and validate it with evaluation metrics
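
To make steps 1 and 2 concrete, here is a minimal sketch using an in-memory SQLite database. The `orders` table, its columns, and the feature names (`n_orders`, `total_quantity`, `avg_lead_days`) are assumptions for illustration only; the actual schema produced by the scripts in src/ may differ.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative,
# not taken from the repository's generator scripts.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "order_date TEXT, delivery_date TEXT, quantity INTEGER)"
)
rows = [
    (1, 101, "2024-01-02", "2024-01-09", 5),
    (2, 101, "2024-02-10", "2024-02-13", 2),
    (3, 102, "2024-01-15", "2024-01-25", 1),
]
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", rows)

# Per-customer aggregates; julianday() turns ISO dates into day numbers,
# so the difference is the order-to-delivery lead time in days.
feature_sql = """
SELECT
    customer_id,
    COUNT(*) AS n_orders,
    SUM(quantity) AS total_quantity,
    AVG(julianday(delivery_date) - julianday(order_date)) AS avg_lead_days
FROM orders
GROUP BY customer_id
ORDER BY customer_id
"""
features = conn.execute(feature_sql).fetchall()
conn.close()

for row in features:
    print(row)
```

Keeping the aggregation in SQL means the same query can be run against the on-disk database by changing only the connection path.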

Repository Structure

  • notebooks/
    • 01_eda_order_to_delivery.ipynb — EDA & problem framing
    • 02_feature_engineering.ipynb — feature design & validation
    • 03_model_validation.ipynb — baseline model & evaluation
  • src/
    • generate_customers.py — generate customer-related tables
    • generate_products_orders.py — generate products/orders tables
    • step1_sql_features.py — SQL feature engineering pipeline
    • check_db.py — database sanity checks
  • data/
    • raw/ — SQLite DB (excluded from git)
    • processed/ — feature artifacts (excluded from git)
  • outputs/ — figures/results (optional)

Note: Data files (*.db, *.parquet) are excluded from version control via .gitignore.
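
As an illustration of the kind of database sanity check listed above, the sketch below counts rows in every user table and flags empty ones. The helper `table_row_counts` and the demo tables are hypothetical; this is not the code in src/check_db.py.

```python
import sqlite3


def table_row_counts(conn):
    """Return {table_name: row_count} for every user table in the database."""
    names = [
        r[0]
        for r in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    # Names are read from sqlite_master, then quoted as identifiers.
    return {
        n: conn.execute(f'SELECT COUNT(*) FROM "{n}"').fetchone()[0]
        for n in names
    }


# Demo on an in-memory database with two illustrative tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT)")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "EU"), (2, "US")])
conn.execute("INSERT INTO orders VALUES (1, 1)")

counts = table_row_counts(conn)
empty_tables = [name for name, n in counts.items() if n == 0]
conn.close()

print(counts)
print("Empty tables:", empty_tables)
```

To run it against a generated database, replace ":memory:" with the path to the SQLite file under data/raw/ and drop the demo setup.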

How to Run (Quick Start)

1) Create environment & install dependencies

pip install -r requirements.txt

2) Generate database (raw data)

python src/generate_customers.py
python src/generate_products_orders.py

3) Build features

python src/step1_sql_features.py

4) Open notebooks

jupyter notebook

Outputs

Figures: outputs/figures/ (if enabled in notebooks)

Tech Stack

  • Python (pandas, numpy)
  • SQLite
  • Jupyter Notebook
  • scikit-learn (baseline modeling)
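
For readers unfamiliar with baseline modeling, the following sketch shows the idea using only the standard library: predict the training-set mean for every test row and measure the mean absolute error. The target (lead time in days), the sample values, and the 75/25 time-ordered split are assumptions for illustration; the notebooks may use a different target and metric. In scikit-learn, `DummyRegressor(strategy="mean")` plays the same role.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


# Hypothetical lead times in days, already sorted by order date.
lead_days = [7, 3, 10, 5, 6, 9, 4, 8]

# Time-ordered split: the first 75% is used for training, the rest for testing.
split = int(len(lead_days) * 0.75)
train, test = lead_days[:split], lead_days[split:]

# Baseline: always predict the mean lead time seen in training.
baseline_prediction = mean(train)
baseline_mae = mean_absolute_error(test, [baseline_prediction] * len(test))

print(f"Baseline prediction: {baseline_prediction:.2f} days")
print(f"Baseline MAE: {baseline_mae:.2f} days")
```

Any trained model should beat this number on the same split; if it does not, the features are adding little signal.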

License

For portfolio / demonstration use.

About

Order-to-delivery risk analysis using SQL-based feature engineering and baseline modeling.