⚠️ Medical Disclaimer: This system is an AI-based screening prototype and does not replace professional medical diagnosis or clinical judgment. Always consult a licensed healthcare provider.
- What Is This?
- How It Works
- Features
- Architecture Overview
- Project Structure
- Prerequisites
- Quick Start (Recommended)
- API Reference
- Model Training (Optional)
- Troubleshooting
- Tech Stack
- Contributing
- Medical Disclaimer
PreTermAI is a full-stack AI screening system that predicts the risk of preterm birth (delivery before 37 weeks) based on maternal clinical data.
Preterm birth affects ~10% of pregnancies worldwide and is a leading cause of neonatal morbidity. Early risk identification allows clinicians to take preventive action and improve outcomes.
This system takes clinical inputs (e.g., gestational age, cervical length, prior history), runs them through a trained XGBoost machine learning model, and returns:
- 🔴 / 🟡 / 🟢 Risk classification — High / Moderate / Low
- 📊 Confidence score and risk probability
- 🤖 AI-generated health summary with personalized recommendations
- 📈 Feature contribution chart showing what factors drove the prediction
- 📄 Downloadable PDF report for clinical use
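The three-band classification above can be pictured as a thresholding step on the predicted probability. The sketch below is illustrative only: the function name `classify_risk` and the cut-offs `0.33` and `0.66` are assumptions made for this example, not values taken from `risk_engine.py`.

```python
def classify_risk(probability: float,
                  moderate_at: float = 0.33,
                  high_at: float = 0.66) -> str:
    """Map a predicted preterm-birth probability to a risk band.

    The default cut-offs are hypothetical; the project's real thresholds
    live in risk_engine.py and may differ.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= high_at:
        return "High"
    if probability >= moderate_at:
        return "Moderate"
    return "Low"


print(classify_risk(0.84))  # High
```

Keeping the thresholds as parameters makes it easy to tune the bands without touching the model itself.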
User fills clinical form
↓
Next.js Frontend (localhost:3000)
↓
Next.js API Routes (/api/predict, /api/report-pdf)
↓
Flask Backend (localhost:5000)
↓
XGBoost ML Model + SHAP Explainer
↓
Risk Score + AI Summary + Feature Chart
↓
Response rendered in browser / PDF export
The ML pipeline uses:
- A pre-trained XGBoost classifier (`models/xgboost_model.json`)
- A scikit-learn scaler (`models/scaler.pkl`) to normalize inputs
- SHAP values via `explain.py` for feature importance visualization
- An AI summary engine (`ai_summary.py`) to generate human-readable clinical notes
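The order of operations in this pipeline (arrange the inputs, scale them, then score them) can be sketched without the real artifacts. Everything named here (`run_inference`, `feature_order`, and the `scale` / `predict_proba` callables) is hypothetical and stands in for whatever `app.py` actually does with the scaler and the XGBoost model.

```python
from typing import Callable, Mapping, Sequence

Vector = list[float]


def run_inference(
    inputs: Mapping[str, float],
    feature_order: Sequence[str],
    scale: Callable[[Vector], Vector],
    predict_proba: Callable[[Vector], float],
) -> float:
    """Arrange inputs in the model's feature order, scale them, then score them."""
    missing = [name for name in feature_order if name not in inputs]
    if missing:
        raise ValueError(f"missing features: {missing}")
    # The model expects a fixed column order, so build the row explicitly.
    row = [float(inputs[name]) for name in feature_order]
    return predict_proba(scale(row))


# Stand-ins for the real scaler and model, for demonstration only.
demo_probability = run_inference(
    {"gestational_age": 28, "cervical_length": 25},
    feature_order=["gestational_age", "cervical_length"],
    scale=lambda row: [value / 100 for value in row],
    predict_proba=lambda row: min(1.0, sum(row)),
)
print(demo_probability)
```

Passing the scaler and model in as callables keeps the sketch independent of any particular library; in the real backend these would presumably wrap the loaded `scaler.pkl` and `xgboost_model.json`.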
| Feature | Description |
|---|---|
| 🎯 Risk Prediction | Classifies risk as Low, Moderate, or High with a probability score |
| 📊 Confidence Score | Shows how confident the model is in its prediction |
| 🤖 AI Health Summary | Generates natural-language recommendations based on inputs |
| 📈 Feature Contribution Chart | Visual breakdown of which clinical factors influenced the result |
| 🗓️ Risk Timeline | Shows how risk evolves across gestational weeks |
| 📄 PDF Report Download | Exportable clinical report for documentation |
| 🔍 Model Transparency Panel | Displays model metadata, version, and accuracy metrics |
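As a rough illustration of what feeds the feature contribution chart, the snippet below ranks contributions by magnitude. It assumes contributions arrive as a mapping of feature name to a signed number, with positive values pushing risk up; that shape and the helper name `top_contributions` are assumptions, not something this README specifies.

```python
def top_contributions(contributions: dict[str, float],
                      limit: int = 5) -> list[tuple[str, float]]:
    """Return the `limit` features with the largest absolute contribution."""
    ranked = sorted(contributions.items(),
                    key=lambda item: abs(item[1]),
                    reverse=True)
    return ranked[:limit]


# Made-up numbers, purely to show the ordering.
example = {"cervical_length": -0.40, "gestational_age": 0.10, "prior_preterm_birth": 0.25}
for name, value in top_contributions(example, limit=2):
    print(f"{name}: {value:+.2f}")
```

Sorting by absolute value keeps strongly protective factors (negative contributions) visible alongside strongly risk-raising ones.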
preterm-birth-detection/
│
├── 🐍 Flask Backend (Python)           ← REST API + ML inference
│   ├── app.py                          ← Main Flask server
│   ├── model.py                        ← Model training script
│   ├── risk_engine.py                  ← Risk classification logic
│   ├── explain.py                      ← SHAP feature explainability
│   ├── ai_summary.py                   ← AI-generated health summaries
│   └── requirements.txt                ← Python dependencies
│
├── ⚛️ Next.js Frontend (TypeScript)    ← User interface
│   └── frontend/preterm-ai/
│       ├── pages/ or app/              ← Next.js routes
│       ├── components/                 ← UI components
│       └── api/                        ← API proxy routes to Flask
│
├── 🧠 ML Artifacts
│   └── models/
│       ├── best_model.pkl              ← Serialized best model
│       ├── scaler.pkl                  ← Feature scaler
│       └── xgboost_model.json          ← XGBoost model weights
│
├── 📊 Metadata & Metrics
│   ├── model_metadata.json             ← Model version, features, config
│   └── model_metrics.json              ← Accuracy, AUC, precision, recall
│
└── 📝 Documentation
    ├── README.md
    └── REPORT.md                       ← Detailed project report
Before running this project, make sure you have:
| Tool | Version | Check Command |
|---|---|---|
| Python | 3.9 or higher | python --version |
| pip | latest | pip --version |
| Node.js | 18 or higher | node --version |
| npm | 9 or higher | npm --version |
| Git | any | git --version |
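The checks in the table can also be scripted. The sketch below only confirms that the Python version is high enough and that the other tools are on `PATH`; it does not verify the Node.js or npm version numbers. The helper names are made up for this example.

```python
import shutil
import sys

MIN_PYTHON = (3, 9)
REQUIRED_TOOLS = ("pip", "node", "npm", "git")


def python_ok(version=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """True if the interpreter's (major, minor) version meets the minimum."""
    return tuple(version[:2]) >= minimum


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which) -> list:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


print("Python OK:", python_ok())
print("Missing tools:", missing_tools())
```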
💡 Tip: Using a Python virtual environment (`venv`) is strongly recommended to avoid dependency conflicts.
git clone https://github.com/codexrahulKIIT/preterm-birth-detection.git
cd preterm-birth-detection
On macOS / Linux:
python3 -m venv .venv
source .venv/bin/activate
On Windows (Command Prompt):
python -m venv .venv
.venv\Scripts\activate
On Windows (PowerShell):
python -m venv .venv
.venv\Scripts\Activate.ps1
Install the Python dependencies:
pip install -r requirements.txt
Start the Flask backend:
python app.py
You should see:
* Running on http://127.0.0.1:5000
* Debug mode: off
✅ Keep this terminal open — the backend must stay running.
Open a new terminal window and run:
cd frontend/preterm-ai
npm install
npm run dev
You should see:
▲ Next.js 14.x.x
- Local: http://localhost:3000
- Ready in Xs
✅ Keep this terminal open too.
Navigate to http://localhost:3000 in your browser.
Fill in the maternal clinical data form and click Predict Risk to see results.
The Flask backend exposes the following REST endpoints at http://127.0.0.1:5000:
Health check to verify the backend is running.
Response:
{
"status": "ok",
"message": "PreTermAI backend is running"
}
Returns model metadata including version, features used, and training configuration.
Response:
{
"model_version": "...",
"features": [...],
"trained_at": "...",
...
}
Main prediction endpoint. Accepts clinical input features and returns a risk assessment.
Request Body (JSON):
{
"gestational_age": 28,
"cervical_length": 25,
"prior_preterm_birth": 1,
"uterine_contractions": 3,
...
}
Response:
{
"risk_level": "High",
"risk_score": 0.84,
"confidence": 0.91,
"feature_contributions": { ... },
"ai_summary": "Based on the clinical data..."
}
Returns a structured JSON report suitable for rendering a summary view.
Generates and returns a downloadable PDF clinical report.
Content-Type: application/pdf
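A client for the prediction and PDF endpoints might look roughly like the following. This is a sketch under assumptions: the endpoint URL is passed in by the caller because the paths are not spelled out in this section, the helper names are invented for the example, and the response keys are the ones shown in the sample prediction response.

```python
import json
import urllib.request

# Keys shown in the sample prediction response.
EXPECTED_KEYS = {"risk_level", "risk_score", "confidence",
                 "feature_contributions", "ai_summary"}


def build_predict_request(url: str, features: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying the clinical inputs."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_prediction(raw: bytes) -> dict:
    """Decode a prediction response and check that the documented keys exist."""
    payload = json.loads(raw.decode("utf-8"))
    missing = EXPECTED_KEYS - set(payload)
    if missing:
        raise ValueError(f"response is missing keys: {sorted(missing)}")
    return payload


def looks_like_pdf(content_type: str, raw: bytes) -> bool:
    """Check the Content-Type header and the %PDF magic bytes of a download."""
    media_type = content_type.split(";")[0].strip().lower()
    return media_type == "application/pdf" and raw.startswith(b"%PDF")


def predict(url: str, features: dict, timeout: float = 10.0) -> dict:
    """Send the request and return the validated prediction payload."""
    request = build_predict_request(url, features)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_prediction(response.read())
```

With both servers running, a call such as `predict("http://localhost:3000/api/predict", {...})` would go through the Next.js proxy route mentioned earlier; whether that route accepts exactly the request body shown above is an assumption.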
If you have a dataset CSV and want to retrain the model from scratch:
# Activate your virtual environment first
source .venv/bin/activate # macOS/Linux
# OR
.venv\Scripts\activate # Windows
# Run the training script
python model.py "/path/to/your/dataset.csv"
This will:
- Load and preprocess the dataset
- Train an XGBoost classifier
- Evaluate and save the best model
- Update `models/best_model.pkl`, `models/scaler.pkl`, `models/xgboost_model.json`
- Regenerate `model_metadata.json` and `model_metrics.json`
⚠️ Note: If you see a scikit-learn version warning when loading the pre-trained model, the `.pkl` artifact was saved with a different scikit-learn version. Retraining in your current environment will fix this.
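After a training run, one quick sanity check is to confirm that the files the training script is said to update are actually present. The paths below come from the list above; the helper name `missing_artifacts` and the idea of running it from the repository root are assumptions for this sketch.

```python
from pathlib import Path

# Files the training step is described as writing or regenerating.
EXPECTED_ARTIFACTS = (
    "models/best_model.pkl",
    "models/scaler.pkl",
    "models/xgboost_model.json",
    "model_metadata.json",
    "model_metrics.json",
)


def missing_artifacts(project_root: str = ".") -> list[str]:
    """Return the expected artifact paths that do not exist under project_root."""
    root = Path(project_root)
    return [rel for rel in EXPECTED_ARTIFACTS if not (root / rel).is_file()]


print("Missing artifacts:", missing_artifacts("."))
```

An empty list means every expected file exists; it says nothing about whether the files are fresh or valid.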
❌ Flask backend not starting
- Make sure your virtual environment is activated before running `python app.py`
- Ensure all dependencies are installed: `pip install -r requirements.txt`
- Check that port `5000` is not occupied by another process
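If it is unclear whether port 5000 is taken, a small cross-platform probe can help before reaching for OS-specific tools. This sketch simply attempts a TCP connection; the function name `port_in_use` is made up for the example.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


print("Port 5000 in use:", port_in_use(5000))
```

A `True` result before starting Flask means another process already holds the port; a `True` result afterwards is the expected sign that the backend is listening.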
❌ Frontend can't connect to backend
- Confirm Flask is running at `http://127.0.0.1:5000`
- Check the browser console for CORS errors; the Flask app should allow `localhost:3000`
- Make sure you ran `npm install` before `npm run dev`
❌ sklearn / model version warning
- This happens when the `.pkl` file was created with a different version of scikit-learn
- Fix: Retrain the model with `python model.py <path_to_csv>` in your current environment
❌ npm install errors on Windows
- Try running the terminal as Administrator
- Ensure Node.js 18+ is installed: `node --version`
- Delete `node_modules` and `package-lock.json`, then retry `npm install`
❌ Port already in use
On macOS/Linux:
lsof -i :5000 # find process using port 5000
kill -9 <PID>
On Windows:
netstat -ano | findstr :5000
taskkill /PID <PID> /F
| Layer | Technology |
|---|---|
| Frontend UI | Next.js 14, TypeScript, React, Tailwind CSS |
| Backend API | Python, Flask |
| Machine Learning | XGBoost, scikit-learn |
| Explainability | SHAP (SHapley Additive exPlanations) |
| Report Generation | PDF export via backend |
| Model Persistence | Pickle (.pkl), XGBoost JSON |
| API Communication | REST / JSON |
Contributions, issues and feature requests are welcome!
- Fork the repository
- Create a new branch: `git checkout -b feature/your-feature-name`
- Make your changes and commit: `git commit -m "Add your feature"`
- Push to your branch: `git push origin feature/your-feature-name`
- Open a Pull Request
Please ensure your code follows existing conventions and that the backend/frontend both start cleanly before submitting.
PreTermAI is a research and screening prototype only.
It is not a certified medical device and should not be used as the sole basis for any clinical decision.
All predictions must be interpreted by a qualified healthcare professional.
The authors accept no liability for medical decisions made based on this tool.
Made with ❤️ by codexrahulKIIT
⭐ Star this repo if you found it useful!