# ML Playground

An end‑to‑end ML workflow on the Titanic dataset, demonstrating data loading, EDA, model training, evaluation, serving, containerization, and CI.
## Features

- **Exploratory Data Analysis**: interactive Jupyter notebook with data profiling and visualizations.
- **Modular Codebase**: Python modules under `src/` for data, features, models, training, evaluation, and the API.
- **Hyperparameter Tuning**: grid search over model parameters (configurable via `config/train_config.yaml`).
- **Model Serving**: FastAPI endpoint at `/predict` for live inference.
- **Containerization**: Dockerfile to build and run the app in a container.
- **CI/CD**: GitHub Actions workflow for linting (flake8) and testing (pytest) on every push.
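To make the hyperparameter-tuning feature above concrete, here is a minimal, stdlib-only sketch of an exhaustive grid search. The parameter names and the toy scoring function are purely illustrative; the project's real grid lives in `config/train_config.yaml` and is scored by the actual training code.

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Exhaustively score every parameter combination in param_grid.

    param_grid: dict mapping parameter name -> list of candidate values.
    score_fn:   callable taking a dict of parameters and returning a
                score (higher is better).
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical grid resembling what a train config might hold:
grid = {"n_estimators": [50, 100], "max_depth": [3, 5]}
# Toy scoring function standing in for cross-validated accuracy:
best, score = grid_search(
    grid, lambda p: p["n_estimators"] / 100 - p["max_depth"] * 0.01
)
```

In the real pipeline, `score_fn` would train a model with the given parameters and return a validation metric.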
## Getting Started

1. Clone the repo:

   ```bash
   git clone https://github.com/<your-username>/ml-playground.git
   cd ml-playground
   ```

2. Create and activate a virtual environment:

   ```bash
   python3 -m venv venv
   source venv/bin/activate
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```
## Usage

- **EDA**: open the Jupyter notebook:

  ```bash
  jupyter notebook notebooks/exploratory_data_analysis.ipynb
  ```

- **Training**:

  ```bash
  python src/train.py --config config/train_config.yaml
  ```

- **Evaluation**:

  ```bash
  python src/evaluate.py --model-path models/best_model.pkl
  ```
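Under the hood, evaluation boils down to loading the pickled model and comparing its predictions against held-out labels. A stdlib-only sketch of that step (the `accuracy` and `evaluate` helpers are illustrative, not the project's actual code, and assume the model object exposes a `.predict` method):

```python
import pickle

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def evaluate(model_path, X_test, y_test):
    """Load a pickled model and report its accuracy on a test set."""
    with open(model_path, "rb") as f:
        model = pickle.load(f)
    return accuracy(y_test, model.predict(X_test))

# The metric itself on toy labels (three of four predictions correct):
acc = accuracy([0, 1, 1, 0], [0, 1, 0, 0])
```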
## Model Serving

Start the FastAPI server:

```bash
uvicorn src.api:app --reload
```

Then send a `POST /predict` request with a JSON payload of features.
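A request can be sent from Python with nothing but the standard library. The feature names below are typical Titanic columns used for illustration; check the API's actual schema for the exact field names it expects.

```python
import json
from urllib import request

# Hypothetical feature payload; the real field names may differ.
payload = {
    "Pclass": 3,
    "Sex": "male",
    "Age": 29.0,
    "SibSp": 0,
    "Parch": 0,
    "Fare": 7.25,
}

req = request.Request(
    "http://localhost:8000/predict",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment once the server is running:
# with request.urlopen(req) as resp:
#     print(json.load(resp))
```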
## Docker

Build and run:

```bash
docker build -t ml-playground .
docker run -p 8000:8000 ml-playground
```

## CI/CD

The GitHub Actions workflow (`.github/workflows/ci.yaml`) automatically runs flake8 and pytest on every push.
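For reference, a workflow of that shape might look roughly like the following. This is a hypothetical sketch, not the repo's actual `ci.yaml`; the Python version and step details are assumptions.

```yaml
name: CI
on: [push]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: flake8 src
      - run: pytest
```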
## Contributing

Feel free to open issues or submit PRs to improve the pipeline, add new models, or enhance deployment.
## License

This project is MIT‑licensed. See `LICENSE` for details.