Project Overview
This repository documents the full analytical journey, from the initial V1.0 feasibility study to the validated V3.0 model deployment strategy. The core objective remains achieving portfolio viability, but the strategic path has evolved from marginal pricing adjustments to implementing a definitive, high-confidence strategic cut-off.
It includes all core project artifacts, model code, and documentation across iterations.
Chain_Forward_Risk_Assessment/
├── 1_Presentation/
│   ├── Achieving Profitability, Risk Model V3.0 Findings & Strategic Cut-Off Recommendation.pptx
│   ├── Chain Forward Profitability Analysis, Strategic Path to Viability.pbix
│   └── Chain Forward Risk Assessment & Profitability Study.pptx
│
├── 2_Code_and_Data/
│   ├── data/
│   │   ├── Loan_Snapshot_Dataset.xlsx
│   │   └── simulated_msme_data.csv
│   │
│   ├── outputs/
│   │   ├── chart_feature_importance.csv
│   │   ├── chart_feature_importance_v3.csv
│   │   ├── chart_segment_performance.csv
│   │   ├── chart_segment_performance_v3.csv
│   │   ├── combined_features_v3.csv
│   │   ├── combined_risk_dataset_final_v4.csv
│   │   ├── sample_combined_v4.csv
│   │   ├── scenario_analysis_results.csv
│   │   ├── scenario_analysis_results_v3.csv
│   │   ├── segmentation_summary.csv
│   │   ├── segmentation_summary_v3.csv
│   │   └── models/
│   │       ├── risk_model_v2.pkl
│   │       └── scaler_v2.pkl
│   │
│   └── src/
│       ├── chain_forward_risk_model.py
│       ├── data_combination_pipeline.py
│       ├── data_combination_pipeline_v2.py
│       └── risk_modeling_pipeline_v2.0.py
│
├── 3_Documentation/
│   ├── Chain Forward Profitability Analysis, Strategic Path to Viability.pdf
│   ├── Model Governance and Strategic Roadmap.pdf
│   └── Risk Management & Monitoring Framework (V3.0 - Post-Deployment).pdf
│
├── 4_Monitoring_Dashboard/
│   └── governance_dashboard.html
│
├── Readme.md
└── run_full_pipeline.ps1

The V3.0 model provides a high-confidence solution:
A Hard Cut-Off (Filtering) of the most loss-concentrated segment (Segment 0: CRITICAL RISK) is required to reduce the portfolio-wide default rate below the 3.75% break-even point and achieve immediate positive NPV.
| Metric | Result | Implication |
|---|---|---|
| Net Present Value (NPV) | −$6,745,425 (3-year horizon) | Product is not financially viable in its current form |
| Break-Even Default Rate | 3.75% | Portfolio must stay below this risk level |
| Current Expected Default Rate | 6.00% | Product operates 2.25 percentage points above sustainability |
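The gap in the table above is simple arithmetic, and it can also be read as a relative reduction target. The sketch below is only an illustration of that calculation; the function names are hypothetical and do not come from the repository's scripts.

```python
def gap_to_break_even(expected_dr: float, break_even_dr: float) -> float:
    """Gap in percentage points; a positive value means the portfolio is above break-even."""
    return round((expected_dr - break_even_dr) * 100, 2)


def required_relative_reduction(expected_dr: float, break_even_dr: float) -> float:
    """Share of current defaults that must be removed to reach the break-even rate."""
    return 1 - break_even_dr / expected_dr


# Figures taken from the table: 6.00% expected vs. 3.75% break-even.
gap_pp = gap_to_break_even(0.06, 0.0375)
reduction = required_relative_reduction(0.06, 0.0375)
print(f"Gap: {gap_pp} pp; defaults must fall by {reduction:.1%}")
```

In other words, closing a 2.25 percentage point gap from a 6.00% base means removing a little over a third of expected defaults.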
The initial modeling focused heavily on non-traditional, behavioral variables and found that the model's rejection power was weak (AUC < 0.6). This led to a strategy centered on marginal pricing adjustments.
V1.0 followed a structured four-pillar approach:
- Portfolio Viability Modeling
- Behavioral Risk Analytics
- K-Means Segmentation
- Logistic Regression for Default Prediction
| Metric | V1.0 Result | Strategic Recommendation |
|---|---|---|
| Dominant Risk Driver | Cashflow Volatility Ratio (+0.3347) | Implement risk-based pricing premiums |
| Model Limit | ROC AUC = 0.5962 (Weak) | Pricing adjustments needed before stricter modeling |
| V1.0 Segmentation | Segment 1 (6.66% DR, 48.5% share) | Aggressive pricing required |
Loss concentration: Nearly half the borrowers belonged to a high-risk segment above the break-even threshold.
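The ROC AUC figures quoted in this README can be read as a pairwise ranking probability: the chance that the model scores a randomly chosen defaulter above a randomly chosen non-defaulter, so 0.5 is no better than chance. The sketch below is a minimal pure-Python illustration of that definition. The function name `roc_auc` and the toy labels and scores are made up for the example; they are not project data or the repository's code, which presumably relies on Scikit-learn.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison; tied scores count as half a win."""
    defaulters = [s for y, s in zip(labels, scores) if y == 1]
    non_defaulters = [s for y, s in zip(labels, scores) if y == 0]
    if not defaulters or not non_defaulters:
        raise ValueError("need at least one defaulter and one non-defaulter")

    wins = 0.0
    for d in defaulters:
        for n in non_defaulters:
            if d > n:
                wins += 1.0
            elif d == n:
                wins += 0.5
    return wins / (len(defaulters) * len(non_defaulters))


# Toy data (illustrative only): 1 = default, 0 = repaid.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
print(roc_auc(toy_labels, toy_scores))
```

The pairwise loop is quadratic in the number of loans, which is fine for illustration but not for a full portfolio.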
Refined feature engineering and model tuning yielded a high-confidence predictive model (AUC 0.8820), validating traditional credit signals and enabling a decisive cut-off strategy.
| Metric | V3.0 Result | Strategic Implication |
|---|---|---|
| Model Performance | ROC AUC = 0.8820 (Excellent) | High-confidence segmentation and rejection rules |
| Top Risk Drivers | Max Days in Arrears (+1.94), Max Utilization (+1.23) | Traditional credit signals dominate |
| Loss Concentration | Segment 0: 86.1% DR, 40% volume | Segment 0 must be eliminated |
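If the driver weights in the table above are logistic regression coefficients on standardized features (an assumption; the README does not state their scale), each one can be converted to an odds ratio by exponentiating it. The helper below is a hypothetical illustration of that conversion, not code from the repository.

```python
import math


def odds_ratio(coefficient: float) -> float:
    """Multiplicative change in default odds per one-unit increase in the feature.

    Assumes the coefficient is a log-odds weight from a logistic regression.
    """
    return math.exp(coefficient)


# Coefficients as quoted in the table.
drivers = {"Max Days in Arrears": 1.94, "Max Utilization": 1.23}
for name, coef in drivers.items():
    print(f"{name}: odds ratio of about {odds_ratio(coef):.2f}")
```

Under that assumption, a one-standard-deviation increase in Max Days in Arrears multiplies the odds of default roughly sevenfold, and the same increase in Max Utilization multiplies them by about 3.4.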
| Segment | Risk Classification | Default Rate | Portfolio Share |
|---|---|---|---|
| Segment 0 | CRITICAL RISK | 86.1% | 40% |
| Segment 2 | High Risk | 61.6% | 20% |
| Segment 1 | Base Risk | 35.8% | 40% |
The V3.0 strategy is a 3-part plan to immediately reduce portfolio risk and then optimize pricing and retention.
Objective: Achieve positive NPV by forcing the portfolio DR below 3.75%.
| Action | Target Segment | Strategic Action |
|---|---|---|
| Filter (Hard Cut-Off) | Segment 0 | Auto-decline based on arrears/utilization profile |
| Price | Segment 2 | Apply aggressive risk-based pricing |
| Retain | Segment 1 | Preferential pricing, core segment |
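The Filter / Price / Retain rules in the table above amount to a lookup from segment to action. The sketch below shows one way this could be expressed; the action labels and the `route_application` function are hypothetical names chosen for illustration, and the real rejection rule would also need the arrears and utilization criteria that define Segment 0.

```python
# Hypothetical action labels; segment numbering follows the README tables.
SEGMENT_ACTIONS = {
    0: "auto_decline",          # Filter: hard cut-off for the critical-risk segment
    2: "risk_based_pricing",    # Price: aggressive risk-based pricing
    1: "preferential_pricing",  # Retain: core segment
}


def route_application(segment: int) -> str:
    """Return the strategic action for an applicant's assigned segment."""
    try:
        return SEGMENT_ACTIONS[segment]
    except KeyError:
        # Fail loudly rather than silently approving an unknown segment.
        raise ValueError(f"unknown segment: {segment!r}") from None


print(route_application(0))
```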
| Scenario | Hypothesis | Post-Filter DR | 3-Year NPV |
|---|---|---|---|
| Optimistic | Filter Segment 0 (40% volume) | 4.5% | +$750,000 |
| Base Case | No filtering | 6.0% | −$6,745,425 |
- Deploy the V3.0 model and Segment 0 rejection rule into the lending platform
- Set up Early Warning Indicators
- Future exploration: XGBoost / LightGBM
The repository includes a PowerShell script:
run_full_pipeline.ps1
Runs the entire workflow end-to-end.
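For environments without PowerShell, the same four-script order could be sketched in Python. This is an illustration only: the contents of `run_full_pipeline.ps1` are not shown in this README, so the working directory, the absence of script arguments, and the helper names `build_commands` and `run_pipeline` are all assumptions. File names are taken from the repository tree.

```python
import subprocess
import sys
from pathlib import Path

# Location of the scripts, per the repository tree.
SRC_DIR = Path("2_Code_and_Data") / "src"

# Execution order of the four pipeline steps; file names as listed in the tree.
PIPELINE_STEPS = [
    "data_combination_pipeline.py",
    "data_combination_pipeline_v2.py",
    "chain_forward_risk_model.py",
    "risk_modeling_pipeline_v2.0.py",
]


def build_commands(repo_root=".", python=sys.executable):
    """Build one command per step, in order (assumes scripts take no arguments)."""
    return [[python, str(Path(repo_root) / SRC_DIR / name)] for name in PIPELINE_STEPS]


def run_pipeline(repo_root="."):
    """Run each step in sequence and stop at the first failure."""
    for command in build_commands(repo_root):
        print("Running", command[1])
        subprocess.run(command, check=True)
```

Calling `run_pipeline()` from the repository root would run the steps one after another; `check=True` makes a failing step raise instead of letting later steps run on stale outputs.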
Run it from PowerShell in the folder that contains the script:
.\run_full_pipeline.ps1
The script executes four steps in order:
run_full_pipeline.ps1
│
├── STEP 1: data_combination_pipeline.py
│   • Load data
│   • Clean & engineer features
│   • Export initial combined dataset
│
├── STEP 2: data_combination_pipeline_v2.py
│   • Refine feature engineering
│   • Apply V3.0 target definition
│   • Export final combined dataset for modeling
│
├── STEP 3: chain_forward_risk_model.py
│   • K-means segmentation
│   • Logistic regression modeling
│   • Profitability & NPV analysis
│   • Scenario stress-testing
│   • Output generation
│
└── STEP 4: risk_modeling_pipeline_v2.0.py
    • Model validation and fine-tuning
    • Scenario and stress-test replication
    • Final output generation for monitoring

File: 4_Monitoring_Dashboard/governance_dashboard.html
- Serve the folder from a terminal:
cd 4_Monitoring_Dashboard
python -m http.server 8000
- Open your browser and navigate to:
http://localhost:8000/governance_dashboard.html
⚠️ Using forward slashes (/) in the URL is essential to avoid 404 errors from the server.
- Navigate through tabs to review:
  - Portfolio segmentation and risk exposure
  - Feature importance and model explainability
  - Scenario stress-testing outcomes
- Use this dashboard for executive reporting and governance purposes.
This allows non-technical stakeholders to monitor portfolio health and model performance without needing to run scripts.
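When the dashboard path is copied from Windows Explorer or PowerShell it will contain backslashes, which is where the forward-slash warning above tends to bite. The helper below is a small sketch of how such a path could be normalized into a URL for the local server; `dashboard_url` is a hypothetical name, and the host and port defaults simply mirror the `python -m http.server 8000` command.

```python
from pathlib import PureWindowsPath
from urllib.parse import quote


def dashboard_url(relative_path: str, host: str = "localhost", port: int = 8000) -> str:
    """Build a URL for a file served by http.server, forcing forward slashes."""
    # PureWindowsPath accepts both "\" and "/" separators on any operating system.
    posix_path = PureWindowsPath(relative_path).as_posix().lstrip("/")
    # quote() percent-encodes spaces and similar characters but keeps "/" intact.
    return f"http://{host}:{port}/{quote(posix_path)}"


print(dashboard_url("governance_dashboard.html"))
```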
| Folder | Description |
|---|---|
| 1_Presentation/ | Final PowerPoint presentation and Power BI visuals |
| 2_Code_and_Data/src/ | Core Python scripts for modeling and scenario analysis |
| 2_Code_and_Data/outputs/ | CSV outputs for segmentation, profitability, and feature weights |
| 3_Documentation/ | Supporting notes, assumptions, frameworks |
| Tool | Purpose |
|---|---|
| Python | Core analytics & modeling |
| Pandas / NumPy | Data engineering & financial calculations |
| Scikit-learn | Clustering + Logistic Regression modeling |
| Power BI | Executive-ready dashboard and visualization |