SprintGuard: Predictive Sprint Planning & Risk Mitigation Platform
SprintGuard uses machine learning to predict risk levels of user stories, helping Agile teams avoid estimation failure and scope creep.
Prerequisites:

- Python 3.9 or higher
- pip (Python package manager)
Installation:

- Clone or navigate to the project directory:

```bash
cd /home/jovyan/SprintGuard
```

- Create and activate virtual environment:

```bash
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

- Install dependencies incrementally:

```bash
# Core web application (required)
pip install -r requirements.txt

# Data augmentation pipeline (required for first-time setup)
pip install -r requirements-augmentation.txt

# ML model training and inference (required for risk prediction)
pip install -r requirements-ml.txt
python -m spacy download en_core_web_sm

# Development tools (optional)
pip install -r requirements-dev.txt
```

Before running the application, you need to augment the NeoDataset (~20K user stories from HuggingFace) with risk labels:
```bash
# This downloads NeoDataset and applies weak supervision pipeline
# Takes ~15-30 minutes
python scripts/augment_neodataset.py
```

This creates:

- data/neodataset_augmented.csv - Full augmented dataset
- data/neodataset_augmented_high_confidence.csv - High-confidence subset
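If you want to sanity-check the augmentation output before training, loading the two generated CSVs with pandas is usually enough. The sketch below assumes pandas is available in your environment and makes no assumptions about column names, since the exact schema is produced by the augmentation pipeline:

```python
# Quick sanity check of the augmentation output (assumes pandas is installed).
# Column names are not hard-coded because the schema is defined by the
# augmentation pipeline -- inspect df.columns to see what was actually written.
import pandas as pd

full = pd.read_csv("data/neodataset_augmented.csv")
high_conf = pd.read_csv("data/neodataset_augmented_high_confidence.csv")

print(f"Full dataset: {len(full)} rows, columns: {list(full.columns)}")
print(f"High-confidence subset: {len(high_conf)} rows")

# If a risk label column exists, its class balance is worth a quick look.
label_col = next((c for c in full.columns if "risk" in c.lower()), None)
if label_col is not None:
    print(full[label_col].value_counts())
```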
After augmentation, train the DistilBERT-XGBoost risk model:
```bash
./scripts/train_ml_model.sh
```

This script will:
- Check for and create augmented dataset if needed
- Download spaCy model if missing
- Train the model with proper PYTHONPATH
- Run validation tests
Model artifacts are saved to the models/ directory.
Run the application:

```bash
python app.py
```

Open your browser: http://localhost:5001
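Once the app is running, a quick request to the read-only endpoints listed in the API section is an easy way to confirm the install. The sketch below uses only the standard library and assumes the endpoints return JSON; adjust if the responses look different:

```python
# Smoke test against a locally running SprintGuard instance (standard library only).
# Endpoints are the GET routes listed in the API section below.
import json
import urllib.request

BASE_URL = "http://localhost:5001"

for path in ("/api/info", "/api/health-check"):
    with urllib.request.urlopen(BASE_URL + path) as resp:
        payload = json.loads(resp.read().decode("utf-8"))
        print(path, "->", json.dumps(payload, indent=2)[:300])  # truncate long output
```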
Key features:

- Health Check - Assesses the quality and quantity of your historical data to set realistic expectations about prediction accuracy.
- Risk Assessment - Analyzes new user stories and assigns risk levels (Low/Medium/High) based on ML models trained on real-world data (see the example after this list).
- Scope Simulation - Models the timeline impact of adding new work to a sprint, making scope creep costs tangible.
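As a rough illustration of the risk assessment flow, the sketch below posts a new story to the /api/assess-risk endpoint. It assumes the requests library is installed and guesses at the payload field names (title, description); the authoritative request schema is defined by the code in src/analyzers/:

```python
# Illustrative call to the risk assessment endpoint.
# Assumes the `requests` library (pip install requests) and guesses at the payload
# field names -- check src/analyzers/ for the schema the endpoint actually expects.
import requests

story = {
    "title": "Migrate billing reports to the new export service",
    "description": (
        "As a finance user, I want monthly billing reports exported through the "
        "new service so that downstream reconciliation keeps working."
    ),
}

resp = requests.post("http://localhost:5001/api/assess-risk", json=story, timeout=30)
resp.raise_for_status()
print(resp.json())  # expected to include a Low/Medium/High risk level
```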
Tech stack:

- Backend: Python 3.9+ with Flask 3.0
- Data Source: Augmented NeoDataset (~20K real user stories)
- ML Pipeline: Snorkel (weak supervision) + Cleanlab (noise filtering)
- Risk Model: DistilBERT-XGBoost with SHAP explainability
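For orientation, here is a minimal, generic sketch of a DistilBERT-XGBoost pipeline with SHAP explanations: stories are embedded with a pretrained DistilBERT encoder (the distilbert-base-uncased checkpoint and mean pooling are assumptions, not the project's configuration), the embeddings train an XGBoost classifier, and a TreeExplainer attributes predictions to embedding features. The project's real implementation lives in src/ml/ and is driven by scripts/train_ml_model.sh:

```python
# Generic sketch of a DistilBERT -> XGBoost -> SHAP pipeline. This is NOT the
# project's training code (see src/ml/ for that); it only shows how the pieces fit.
import numpy as np
import shap
import torch
import xgboost as xgb
from transformers import DistilBertModel, DistilBertTokenizerFast

tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
encoder = DistilBertModel.from_pretrained("distilbert-base-uncased")
encoder.eval()

def embed(texts: list[str]) -> np.ndarray:
    """Mean-pool DistilBERT token embeddings into one 768-dim vector per story."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state          # (batch, tokens, 768)
    mask = batch["attention_mask"].unsqueeze(-1).float()     # zero out padding
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
    return pooled.numpy()

# Tiny illustrative training set (real labels come from the augmented NeoDataset).
stories = [
    "As a user, I want to reset my password via email",
    "Add pagination to the admin audit log",
    "Rewrite the payments module to support multi-currency",
]
labels = [0, 1, 2]  # 0=Low, 1=Medium, 2=High

X = embed(stories)
clf = xgb.XGBClassifier(n_estimators=50, max_depth=4, eval_metric="mlogloss")
clf.fit(X, labels)

# SHAP attributes each prediction to the embedding features that drove it.
explainer = shap.TreeExplainer(clf)
shap_values = explainer.shap_values(X)
print("SHAP values shape:", np.shape(shap_values))
print("Prediction:", clf.predict(embed(["Integrate a brand-new third-party API"])))
```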
Comprehensive documentation is available in the docs/ directory:
- SETUP.md - Detailed installation and configuration guide
- AUGMENTATION_STATUS.md - NeoDataset augmentation pipeline details
- ML_MODEL_GUIDE.md - ML model training and usage
- ML_ARCHITECTURE.md - Technical architecture of ML components
- IMPLEMENTATION_SUMMARY.md - Full implementation overview
- research/ - Research notes on ML techniques
API endpoints:

- GET /api/health-check - Data quality assessment
- POST /api/assess-risk - Story risk prediction
- POST /api/simulate-scope - Timeline impact simulation (see the example below)
- GET /api/stories - Historical stories retrieval
- GET /api/info - System information
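A rough example of exercising the remaining endpoints is shown below. It assumes the requests library, and the JSON payload keys for /api/simulate-scope are guesses; check src/analyzers/ for the schemas the endpoints actually accept:

```python
# Illustrative calls to the scope simulation and story history endpoints.
# The JSON field names are assumptions -- see src/analyzers/ for the real schemas.
import requests

BASE_URL = "http://localhost:5001"

# Simulate adding new work mid-sprint (payload shape is illustrative only).
simulation = requests.post(
    f"{BASE_URL}/api/simulate-scope",
    json={"new_stories": [{"title": "Add SSO login", "story_points": 8}]},
    timeout=30,
)
print(simulation.json())

# Retrieve the historical stories the analyzers work from.
print(requests.get(f"{BASE_URL}/api/stories", timeout=30).json())
```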
Running tests:

```bash
pip install -r requirements-dev.txt
pytest
pytest --cov=src --cov-report=html  # With coverage
```

Project structure:

```
SprintGuard/
├── app.py # Flask application
├── config.py # Configuration
├── requirements*.txt # Dependencies (core, augmentation, ML, dev)
├── src/
│ ├── analyzers/ # Risk assessment, health check, scope simulation
│ ├── ml/ # ML pipeline (augmentation, training, inference)
│ ├── models/ # Data models (Story)
│ └── utils/ # Utilities
├── scripts/
│ ├── augment_neodataset.py # Main augmentation script
│ ├── train_ml_model.sh # Model training script
│ └── explore_neodataset.py # Data exploration tool
├── tests/ # Unit tests
├── docs/ # Documentation
└── data/ # Data files (generated)
```
Planned features:

- Dynamic Resource Forecaster - Skill-based bottleneck detection
- Jira Cloud Integration - Real-time API connection
- Team Calibration Tool - Improve estimation consistency
- Advanced ML Models - Deep learning for pattern recognition
- Custom Dashboards - Exportable reports for stakeholders
Troubleshooting:

Augmented dataset missing: re-run the augmentation script:

```bash
python scripts/augment_neodataset.py
```

Port 5001 already in use: edit config.py and change PORT = 5001 to another value.

Import or PYTHONPATH errors when training: use the training script, which handles PYTHONPATH automatically:

```bash
./scripts/train_ml_model.sh
```

Or, if running Python directly, set PYTHONPATH first:

```bash
export PYTHONPATH="$(pwd):$PYTHONPATH"
python src/ml/train_risk_model.py
```

spaCy model missing: download it manually:

```bash
python -m spacy download en_core_web_sm
```

This Proof of Concept is provided as-is for educational purposes.
Built with ❤️ to help Agile teams break the cycle of estimation failure and scope creep.