Customer Churn Prediction Model

A machine learning project for predicting customer churn in banking using multiple classification algorithms with exploratory data analysis and model comparison.

Project Overview

This project trains and compares five classification algorithms to predict whether a bank customer will churn. The workflow covers exploratory data analysis, data preprocessing, model training, and evaluation using accuracy, precision, recall, and F1-score.

Dataset: 10,000 bank customers with 14 columns (3 identifier columns, 10 predictive features, and the target)
Target: Customer churn (Exited: 1 = churned, 0 = retained)

Dataset Features

CreditScore: Customer's credit score
Geography: Country (France, Germany, Spain)
Gender: Customer gender
Age: Customer's age
Tenure: Years as customer
Balance: Account balance
NumOfProducts: Number of products owned
HasCrCard: Has credit card (1=Yes, 0=No)
IsActiveMember: Is active member (1=Yes, 0=No)
EstimatedSalary: Annual estimated salary
Exited: Target variable (1=Churned, 0=Retained)

Data Processing

Removed non-predictive columns (RowNumber, CustomerId, Surname)
Encoded categorical variables (Geography, Gender) using LabelEncoder
Normalized numerical features using MinMaxScaler to [0, 1]
Split data: 80% training (8,000), 20% testing (2,000)
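
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the tiny DataFrame is made-up data that only mirrors some of the column names listed earlier, and random_state=42 is an arbitrary choice. Two further assumptions are labeled in the comments: all remaining feature columns are scaled, and the scaler is fitted on the training split only (the notebook may instead scale before splitting).

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, MinMaxScaler

# Made-up sample rows (NOT the real dataset); only a subset of columns is shown.
df = pd.DataFrame({
    "RowNumber": range(1, 11),
    "CustomerId": range(1001, 1011),
    "Surname": ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"],
    "CreditScore": [619, 608, 502, 699, 850, 645, 822, 376, 501, 684],
    "Geography": ["France", "Spain", "France", "France", "Spain",
                  "Spain", "France", "Germany", "France", "France"],
    "Gender": ["Female", "Female", "Female", "Female", "Female",
               "Male", "Male", "Female", "Male", "Male"],
    "Age": [42, 41, 42, 39, 43, 44, 50, 29, 44, 27],
    "Exited": [1, 0, 1, 0, 0, 1, 0, 1, 0, 0],
})

# 1. Remove the non-predictive identifier columns.
df = df.drop(columns=["RowNumber", "CustomerId", "Surname"])

# 2. Encode the categorical columns as integers with LabelEncoder.
for col in ["Geography", "Gender"]:
    df[col] = LabelEncoder().fit_transform(df[col])

# 3. Separate features from the target and split 80% / 20%.
X = df.drop(columns=["Exited"])
y = df["Exited"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 4. Scale features to [0, 1].
# Assumption: every feature column is scaled, and the scaler is fitted on the
# training split only so that test-set statistics do not leak into training.
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.shape, X_test_scaled.shape)
```

When the scaler is fitted on the training split only, test values can fall slightly outside [0, 1]; that is expected with this design.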

Models and Performance

Model	Accuracy	Precision	Recall	F1-Score
K-Nearest Neighbors	82.40%	59.53%	32.57%	0.421
Decision Tree	77.95%	44.67%	51.15%	0.477
Random Forest	86.65%	76.25%	46.56%	0.578
Support Vector Machine	85.30%	82.78%	31.81%	0.460
Naive Bayes	82.85%	68.12%	23.92%	0.354

Best performing model: Random Forest with 86.65% accuracy and 0.578 F1-Score.
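
A comparison like the one in the table can be produced with a loop over scikit-learn estimators. The sketch below is an assumption-laden illustration, not the notebook's code: it uses synthetic data from make_classification instead of the bank dataset, default hyperparameters, GaussianNB as the Naive Bayes variant, and an arbitrary class ratio and random seed. It will not reproduce the numbers reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in data (hypothetical; not the bank dataset).
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5,
    weights=[0.8, 0.2], random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scale to [0, 1]; this matters most for distance-based models (KNN, SVM).
scaler = MinMaxScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# The five model families named in the table, with default settings (assumed).
models = {
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(random_state=42),
    "Support Vector Machine": SVC(),
    "Naive Bayes": GaussianNB(),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    # Precision, recall, and F1 are computed for the positive class (label 1).
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
        "f1": f1_score(y_test, pred, zero_division=0),
    }

for name, m in results.items():
    print(f"{name:<24} acc={m['accuracy']:.3f} prec={m['precision']:.3f} "
          f"rec={m['recall']:.3f} f1={m['f1']:.3f}")
```

Reporting precision, recall, and F1 alongside accuracy is useful here because, on imbalanced data, a model can reach high accuracy while missing most of the minority class.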

Technologies Used

Python 3
pandas, numpy
scikit-learn
seaborn, matplotlib

Running the Project

Open the Jupyter notebook:

jupyter notebook identify_customers_churn.ipynb

Run all cells to execute the full pipeline, from data loading through model evaluation.

Results

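Random Forest achieved the best overall performance, with the highest accuracy (86.65%) and the highest F1-Score (0.578). Its precision (76.25%) is well above its recall (46.56%), so it still misses more than half of actual churners. Decision Tree has the highest recall (51.15%) but the lowest precision (44.67%), and Support Vector Machine has the highest precision (82.78%) but low recall (31.81%).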
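
The F1-Score column in the Models and Performance table is the harmonic mean of precision and recall, so it can be checked directly from the reported figures. The short sketch below does that; the function name f1_from_precision_recall is a hypothetical helper introduced here for illustration, and the inputs are the rounded percentages from the table converted to fractions.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above.
reported = {
    "K-Nearest Neighbors": (0.5953, 0.3257, 0.421),
    "Decision Tree": (0.4467, 0.5115, 0.477),
    "Random Forest": (0.7625, 0.4656, 0.578),
    "Support Vector Machine": (0.8278, 0.3181, 0.460),
    "Naive Bayes": (0.6812, 0.2392, 0.354),
}

computed = {}
for name, (p, r, f1_reported) in reported.items():
    computed[name] = f1_from_precision_recall(p, r)
    print(f"{name:<24} computed F1={computed[name]:.3f}  reported={f1_reported:.3f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, models with high precision but low recall (such as Support Vector Machine) end up with a modest F1-Score.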

Project Structure

Identify-Customer-Churn/
├── identify_customers_churn.ipynb
├── README.md
└── data/
    └── house-prices-advanced-regression-techniques/
