Customer churn is one of the biggest challenges for subscription-based businesses. This project analyzes Telco customer data to identify factors that influence churn and predict customers who are most likely to leave.
Using machine learning models (Logistic Regression, Random Forest, XGBoost), I built a predictive pipeline and used SHAP values to explain its predictions and derive actionable retention strategies.
Losing customers is costly: acquiring a new customer can be up to 5x more expensive than retaining an existing one.
This project answers two key questions:
- Which customers are most likely to churn?
- What are the key factors driving churn?
├── image/
│ ├── Slide1.JPG
│ ├── Slide2.JPG
│ ├── Slide3.JPG
│ ├── Slide4.JPG
│ ├── Slide5.JPG
│ ├── Slide6.JPG
│ ├── Slide7.JPG
│ ├── Slide8.JPG
│ ├── Slide9.JPG
│ ├── Slide10.JPG
│ ├── Slide11.JPG
│ └── Slide12.JPG
├── Customer Churn Analysis.ipynb
└── README.md
- Dataset: Telco Customer Churn – Kaggle
- Rows: 7,043
- Columns: 21
- Features: Demographics, account info, service usage, and charges.
- Data Cleaning – Handled missing values, corrected data types, encoded categorical variables.
- Exploratory Data Analysis (EDA) – Visualized churn patterns by tenure, contract type, and monthly charges.
- Modeling – Built and compared Logistic Regression, Random Forest, and XGBoost models.
- Model Interpretation – Used SHAP values to explain model predictions.
- Recommendations – Developed data-driven retention strategies.
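
The cleaning and modeling steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it runs on a small synthetic table, and the column names (`tenure`, `MonthlyCharges`, `TotalCharges`, `Contract`, `Churn`), the blank-string handling, and the choice of Logistic Regression as the example model are assumptions made for the sketch.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the Telco data (hypothetical columns and values).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "tenure": rng.integers(1, 72, n),
    "MonthlyCharges": rng.uniform(20, 120, n).round(2),
    "Contract": rng.choice(["Month-to-month", "One year", "Two year"], n),
})
df.loc[:4, "tenure"] = 0  # a few brand-new customers

# Simulate a numeric column stored as text, with blanks for new customers.
df["TotalCharges"] = (df["tenure"] * df["MonthlyCharges"]).round(2).astype(str)
df.loc[df["tenure"] == 0, "TotalCharges"] = " "

# Synthetic label: month-to-month contracts and short tenure raise churn odds.
p = 0.15 + 0.35 * (df["Contract"] == "Month-to-month") + 0.2 * (df["tenure"] < 6)
df["Churn"] = np.where(rng.random(n) < p, "Yes", "No")

# Data cleaning: correct the data type and handle the resulting missing values.
df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce").fillna(0.0)
y = (df["Churn"] == "Yes").astype(int)
X = df.drop(columns="Churn")

# Encoding and modeling in one pipeline, so the test split is never used for fitting.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["tenure", "MonthlyCharges", "TotalCharges"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Contract"]),
])
model = Pipeline([
    ("prep", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)
model.fit(X_train, y_train)
churn_proba = model.predict_proba(X_test)[:, 1]  # estimated churn probability per customer
```

Swapping the `clf` step for another estimator would let the same preprocessing be reused when comparing models.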
| Model | Accuracy | Precision | Recall | F1-score |
|---|---|---|---|---|
| Logistic Regression | 78.68% | 0.619 | 0.513 | 0.561 |
| Random Forest | 78.53% | 0.632 | 0.460 | 0.533 |
| XGBoost | 73.70% | 0.504 | 0.676 | 0.578 |
Best Model:
- For the highest accuracy with a balanced precision/recall trade-off → Logistic Regression
- For high recall (catching more churners) and the highest F1-score → XGBoost
- For highest precision → Random Forest
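
As a quick sanity check on the table, the F1-scores can be recomputed from the reported precision and recall using F1 = 2PR / (P + R), and the top model per metric can be read off programmatically. The numbers below are copied from the table; the helper names are only illustrative.

```python
# Metrics copied from the results table above.
reported = {
    "Logistic Regression": {"accuracy": 0.7868, "precision": 0.619, "recall": 0.513, "f1": 0.561},
    "Random Forest": {"accuracy": 0.7853, "precision": 0.632, "recall": 0.460, "f1": 0.533},
    "XGBoost": {"accuracy": 0.7370, "precision": 0.504, "recall": 0.676, "f1": 0.578},
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Recompute F1 from the rounded precision and recall values.
recomputed = {
    name: f1_score(m["precision"], m["recall"]) for name, m in reported.items()
}

# Find the best model for each metric.
best = {
    metric: max(reported, key=lambda name: reported[name][metric])
    for metric in ("accuracy", "precision", "recall", "f1")
}

for name, value in recomputed.items():
    print(f"{name}: reported F1 {reported[name]['f1']:.3f}, recomputed {value:.3f}")
print(best)
```

The recomputed values differ from the reported ones by less than 0.001, which is consistent with precision and recall having been rounded to three decimals.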
- Contract type is the strongest churn driver — month-to-month customers are far more likely to churn.
- Fiber optic internet service customers churn more than DSL users.
- Tenure is negatively correlated with churn — long-term customers are more loyal.
- Online security & tech support lower churn probability.
- Electronic check payment method is linked to higher churn.
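
Insights like these can be checked directly with a group-wise churn rate before any modeling. The sketch below uses a handful of made-up rows and assumes columns named `Contract` and `Churn` with `Yes`/`No` labels; it is not the project's data.

```python
import pandas as pd

# Toy rows for illustration only (not taken from the real dataset).
df = pd.DataFrame({
    "Contract": [
        "Month-to-month", "Month-to-month", "Month-to-month", "Month-to-month",
        "One year", "One year", "One year", "One year",
        "Two year", "Two year", "Two year", "Two year",
    ],
    "Churn": [
        "Yes", "Yes", "No", "Yes",
        "No", "Yes", "No", "No",
        "No", "No", "No", "No",
    ],
})

# Share of churned customers within each contract type, highest first.
churn_rate = (
    df.assign(churned=df["Churn"].eq("Yes"))
    .groupby("Contract")["churned"]
    .mean()
    .sort_values(ascending=False)
)
print(churn_rate)
```

The same pattern applies to the other drivers by grouping on the corresponding column instead, for example internet service or payment method.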
- Promote long-term contracts with incentives.
- Target new customers early (first 6 months) with retention campaigns.
- Offer discounts or bundles to high-bill customers.
- Encourage switching to automatic payment instead of electronic checks.
- Bundle services like online security and tech support to increase customer stickiness.
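
One way to put the early-retention recommendation into practice is a simple rule that combines tenure, contract type, and a model's churn probability. The sketch below is hypothetical: the `Customer` fields, the 0.5 probability cutoff, and the sample customers are assumptions, and only the six-month window comes from the recommendation above.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    tenure_months: int
    contract: str
    churn_probability: float  # e.g. the output of a fitted model


def needs_early_outreach(customer, max_tenure=6, min_probability=0.5):
    """Flag new month-to-month customers whose predicted churn risk is high."""
    return (
        customer.tenure_months <= max_tenure
        and customer.contract == "Month-to-month"
        and customer.churn_probability >= min_probability
    )


# Made-up customers for illustration.
customers = [
    Customer("A-001", 2, "Month-to-month", 0.71),
    Customer("A-002", 40, "Two year", 0.08),
    Customer("A-003", 5, "Month-to-month", 0.35),
    Customer("A-004", 3, "One year", 0.62),
]

flagged = [c.customer_id for c in customers if needs_early_outreach(c)]
print(flagged)
```

The cutoff would normally be tuned against campaign cost, since a lower threshold catches more churners at the price of more unnecessary offers.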
- Notebook: Customer Churn Analysis
- Slides (JPG): Customer Churn Presentation
- SHAP Summary Plot:
- Python: pandas, numpy, matplotlib, seaborn, scikit-learn, xgboost, shap
- Jupyter Notebook
- Google Slides (presentation)
- GitHub (version control & portfolio hosting)