DealSense AI: Production ML Deal Risk Intelligence

$1.4M/quarter potential revenue impact – ML-powered deal risk scoring for B2B SaaS CROs

Hybrid engine (rules + Random Forest) scores deal loss risk (0-100)
Slack alerts + "Exec sponsor TODAY" actions
Production-ready: Airflow ETL + joblib models + API endpoints
Custom metrics uncover sales cycle + rep performance gaps
ROI (projected): saving 20 high-risk deals per quarter protects about $1.4M in revenue

The Business Problem

Sales cycles up 214%, win rates volatile. Sales leaders need:

Early warning on at-risk pipeline
Explainable risk factors (rep performance, stage delays)
Actionable interventions ("Exec sponsor call TODAY")

DealSense AI delivers daily intelligence that turns data into revenue protection.

System Architecture

Architecture diagram: CRM_to_Risk_Scoring_Engine.png

How to Run and Use This Project

Follow these steps to set up, run, and use DealSense AI:

Clone the repository

```
git clone https://github.com/username/dealsense-ai
cd dealsense-ai
```

Install dependencies
```
pip install -r requirements.txt
```

Run the EDA notebook (for data exploration)

```
jupyter notebook 02_eda_insights_skygeni.ipynb
```

Run the main risk scoring notebook

```
jupyter notebook notebooks/03_decision_engine_deal_risk.ipynb
```

Check outputs
- Risk reports and feature importance will be generated in the results/ folder.
- Example: open results/feature_importance.png for feature importance visualization.

Production code
- The main ML engine is in src/deal_risk_scoring.py and can be imported or run as a script for integration into other systems.

Documentation
- See the docs/ folder for business framing, system design, and project reflection.

Project Structure

.
├── docs/
│   ├── 01_problem_framing.md         # Business context and challenge
│   ├── 04_system_design.md           # Architecture 
│   └── 05_reflection.md              # Project learnings and insights
├── notebooks/
│   └── 03_decision_engine_deal_risk.ipynb  # ML risk scoring 
├── results/
│   └── feature_importance.png        # Visualized feature importance
├── src/
│   └── deal_risk_scoring.py          # Main ML scoring logic 
├── 02_eda_insights_skygeni.ipynb     # EDA notebook: data analysis
├── CRM_to_Risk_Scoring_Engine.png    # System architecture diagram
├── LICENSE                           # Project license
├── README.md                         # Project overview 
├── requirements.txt                  # Python dependencies
└── skygeni_sales_data.csv            # Raw sales data (5K deals)

Note: Documentation files use their actual project file names (e.g., 01_problem_framing.md, 04_system_design.md, 05_reflection.md) to match the naming convention in the docs folder.

Technical Deep Dive

ML Pipeline

Raw CRM Data → Feature Engineering → Hybrid Scoring → Exec Outputs

Feature Engineering (15+ features):

rep_historical_winrate, industry_winrate, deal_age_percentile
is_long_cycle, is_large_deal, lead_source_quality
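
A minimal sketch of how a few of these features could be derived with pandas. The column names (`rep_id`, `industry`, `deal_amount`, `sales_cycle_days`, `won`), the toy rows, and the 75th-percentile cutoffs are assumptions for illustration. They are not the schema of skygeni_sales_data.csv or the logic in src/deal_risk_scoring.py.

```python
import pandas as pd


def add_features(deals: pd.DataFrame) -> pd.DataFrame:
    """Add illustrative risk features to a deals table (hypothetical schema)."""
    out = deals.copy()
    # Win rate per rep and per industry: mean of the 0/1 `won` flag.
    out["rep_historical_winrate"] = out.groupby("rep_id")["won"].transform("mean")
    out["industry_winrate"] = out.groupby("industry")["won"].transform("mean")
    # Percentile rank of each deal's age within the whole table.
    out["deal_age_percentile"] = out["sales_cycle_days"].rank(pct=True)
    # Binary flags; the 75th-percentile cutoffs are assumed.
    long_cutoff = out["sales_cycle_days"].quantile(0.75)
    large_cutoff = out["deal_amount"].quantile(0.75)
    out["is_long_cycle"] = (out["sales_cycle_days"] > long_cutoff).astype(int)
    out["is_large_deal"] = (out["deal_amount"] > large_cutoff).astype(int)
    return out


# Toy data, not taken from the project's dataset.
deals = pd.DataFrame({
    "rep_id": ["r1", "r1", "r2", "r2"],
    "industry": ["fintech", "health", "fintech", "health"],
    "deal_amount": [10_000, 45_000, 20_000, 30_000],
    "sales_cycle_days": [30, 95, 40, 60],
    "won": [1, 0, 1, 1],
})
features = add_features(deals)
print(features[["rep_historical_winrate", "deal_age_percentile", "is_long_cycle"]])
```

When scoring open deals, win rates like these would normally be computed only from deals that closed earlier, so that a deal's own outcome does not leak into its features.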

Hybrid Scoring Engine:

Rule-based (40 pts): Cycle length, rep performance, stage delays
ML-based (60 pts): Random Forest (ROC-AUC validated)
Combined: 0-100 risk score + top 3 factors
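
One way the 40/60 split could be combined is sketched below. The individual rule weights, the thresholds, and the names `Deal`, `RULES`, and `hybrid_risk_score` are hypothetical. Only the 40-point rule budget, the 60-point ML budget, and the "top 3 factors" output come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Deal:
    days_in_stage: int
    sales_cycle_days: int
    rep_winrate: float


# Assumed rule weights (they sum to the 40-point rule budget) and thresholds.
RULES = [
    ("Long sales cycle", 10, lambda d: d.sales_cycle_days > 90),
    ("Rep win rate below 20%", 15, lambda d: d.rep_winrate < 0.20),
    ("Stalled in current stage", 15, lambda d: d.days_in_stage > 60),
]


def hybrid_risk_score(deal: Deal, loss_probability: float) -> tuple[int, list[str]]:
    """Combine rule points (max 40) with a model's loss probability (max 60)."""
    fired = [(name, pts) for name, pts, test in RULES if test(deal)]
    rule_points = min(40, sum(pts for _, pts in fired))
    # Clamp the probability to [0, 1] before scaling to the 60-point budget.
    ml_points = 60 * min(max(loss_probability, 0.0), 1.0)
    factors = fired + [("Model loss probability", ml_points)]
    # Keep the three largest contributors by points.
    ranked = sorted(factors, key=lambda f: f[1], reverse=True)[:3]
    return round(rule_points + ml_points), [name for name, _ in ranked]


score, top_factors = hybrid_risk_score(
    Deal(days_in_stage=95, sales_cycle_days=120, rep_winrate=0.15),
    loss_probability=0.8,
)
print(score, top_factors)
```

Here `loss_probability` stands in for the Random Forest's predicted probability of loss; in this sketch it is passed in directly so the combination logic can be read on its own.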

Production Features:

joblib model serialization
Airflow/cron scheduling ready
Slack/email alert integration
Fallback logic (rules if ML fails)
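
The serialization and fallback items can be illustrated with a short sketch, assuming scikit-learn and joblib are installed. The synthetic data, the file name `deal_risk_model.joblib`, the fixed fallback value, and the helper names are all placeholders rather than the project's actual code.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 3 features, label 1 = deal lost.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3))
y = (X[:, 0] + 0.5 * rng.normal(size=400) > 0).astype(int)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)
# Validate on held-out data with ROC-AUC.
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

# Persist the fitted model with joblib.
model_path = os.path.join(tempfile.mkdtemp(), "deal_risk_model.joblib")
joblib.dump(model, model_path)


def rule_only_probability(features: np.ndarray) -> float:
    """Hypothetical rules-only fallback: a fixed, neutral estimate."""
    return 0.5


def loss_probability(features: np.ndarray, path: str) -> tuple[float, str]:
    """Score with the saved model; fall back to rules if loading or scoring fails."""
    try:
        loaded = joblib.load(path)
        return float(loaded.predict_proba(features.reshape(1, -1))[0, 1]), "ml"
    except Exception:
        return rule_only_probability(features), "rules"


prob, source = loss_probability(X_test[0], model_path)
fallback_prob, fallback_source = loss_probability(X_test[0], "missing_model.joblib")
print(f"AUC={auc:.2f}", source, fallback_source)
```

Returning the source alongside the probability lets downstream reports flag scores that came from the rules-only path.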

Sample Output

CRITICAL: D12345 ($45K ACV)
Risk Score: 87/100 | Level: CRITICAL
Top Factors:
- Rep bottom 20% win rate (25 pts)
- 95 days in Demo (30 pts)
→ ACTION: Exec sponsor TODAY
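
A sketch of how an alert like this could be rendered and wrapped for a Slack incoming webhook, which accepts a JSON body with a `text` field. The level cutoffs and the function names `risk_level` and `format_alert` are assumptions; the README does not state where CRITICAL begins. Nothing is sent over the network here.

```python
import json


def risk_level(score: int) -> str:
    """Map a 0-100 score to a level; the cutoffs are assumed, not from the project."""
    if score >= 80:
        return "CRITICAL"
    if score >= 60:
        return "HIGH"
    if score >= 40:
        return "MEDIUM"
    return "LOW"


def format_alert(deal_id: str, acv: int, score: int,
                 factors: list[tuple[str, int]], action: str) -> str:
    """Render a plain-text alert in the layout of the sample output."""
    level = risk_level(score)
    lines = [
        f"{level}: {deal_id} (${acv // 1000}K ACV)",
        f"Risk Score: {score}/100 | Level: {level}",
        "Top Factors:",
        *[f"- {name} ({pts} pts)" for name, pts in factors],
        f"→ ACTION: {action}",
    ]
    return "\n".join(lines)


message = format_alert(
    "D12345", 45_000, 87,
    [("Rep bottom 20% win rate", 25), ("95 days in Demo", 30)],
    "Exec sponsor TODAY",
)
# JSON body for a Slack incoming webhook; ensure_ascii=False keeps the arrow readable.
payload = json.dumps({"text": message}, ensure_ascii=False)
print(message)
```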

Business Results

214% sales cycle increase diagnosed
Custom metrics: PVS, REI created
$1.4M/quarter revenue protection potential
Daily exec reports + Slack alerts ready
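
The $1.4M figure rests on the summary's assumption of 20 saved deals per quarter. The arithmetic below only makes the implied average deal value explicit; the variable names are illustrative, and the annualized number is a straight multiplication rather than a result reported by the project.

```python
QUARTERLY_REVENUE_PROTECTED = 1_400_000  # from the README summary
DEALS_SAVED_PER_QUARTER = 20             # from the README summary

# Average value each saved deal must have for the two figures to agree.
implied_avg_deal_value = QUARTERLY_REVENUE_PROTECTED / DEALS_SAVED_PER_QUARTER
# Simple annualization, assuming the same result every quarter.
annualized = QUARTERLY_REVENUE_PROTECTED * 4

print(f"Implied average deal value: ${implied_avg_deal_value:,.0f}")
print(f"Annualized potential: ${annualized:,.0f}")
```

The implied $70K average is higher than the $45K ACV deal shown in the sample output, so the estimate is sensitive to the mix of deals that are actually saved.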

Production Architecture

Salesforce API → Airflow ETL (6AM) → ML Scoring → Slack/Email → Tableau

↓

CRO Dashboard (real-time)

Full system design: docs/04_system_design.md

Why This Matters

Most sales tech reports data. DealSense AI prescribes actions:

"This rep needs coaching"

"Escalate this deal to VP Sales"

"Partner leads failing → audit channel"

File Descriptions

Problem Framing
- docs/01_problem_framing.md
- Defines the business context, objectives, and challenges for deal risk scoring.

EDA & Insights
- 02_eda_insights_skygeni.ipynb
- Exploratory data analysis notebook: uncovers key patterns, trends, and actionable insights from the sales data.

Decision Engine & Risk Scoring
- notebooks/03_decision_engine_deal_risk.ipynb, src/deal_risk_scoring.py
- ML pipeline and production code for scoring deal risk, including model training, evaluation, and demo analysis.

System Design
- docs/04_system_design.md
- Technical documentation of the system architecture, data flow, and integration points.

Reflection
- docs/05_reflection.md
- Project learnings, challenges faced, and key takeaways from the development and deployment process.

Tech Stack

ML: scikit-learn, Random Forest, joblib
Visualization: Plotly, Matplotlib
Data: pandas, numpy
Production: Airflow-ready, API endpoints

Final Note

Thank you for exploring DealSense AI! This project is designed to empower sales teams with actionable intelligence, robust analytics, and production-ready machine learning. Whether you’re a data scientist, engineer, or business leader, we hope this solution inspires you to drive smarter decisions and unlock new revenue opportunities.

Innovate boldly, automate wisely, and let data lead your growth!
