jagrat2004/Neural-Retention-AI

ANN Classification & Regression Project

This project demonstrates the use of Artificial Neural Networks (ANNs) for both classification and regression tasks using TensorFlow/Keras. The primary focus is on customer churn prediction with an interactive Streamlit web application.

📋 Project Overview

This is a comprehensive deep learning project that includes:

  • Customer Churn Classification: Predict whether a customer will leave a bank
  • Salary Regression: Predict estimated salary based on customer features
  • Hyperparameter Tuning: Optimize ANN architecture for best performance
  • Interactive Web Application: Streamlit-based UI for real-time predictions

πŸ“ Project Structure

Section-53-annclassification/annclassification/
├── app.py                              # Streamlit web application for churn prediction
├── Churn_Modelling.csv                 # Main dataset
├── requirements.txt                    # Python dependencies
├── model.h5                            # Trained ANN model for churn classification
├── regression_model.h5                 # Trained ANN model for salary regression
├── label_encoder_gender.pkl            # Saved encoder for gender feature
├── onehot_encoder_geo.pkl              # Saved encoder for geography feature
├── scaler.pkl                          # Saved StandardScaler for feature normalization
├── experiments.ipynb                   # Data exploration and preprocessing notebook
├── hyperparametertuningann.ipynb       # Hyperparameter tuning and model optimization
├── prediction.ipynb                    # Churn prediction examples
├── salaryregression.ipynb              # Salary prediction model development
├── logs/                               # TensorBoard logs for churn model training
│   └── fit/20260203-094807/
│       ├── train/
│       └── validation/
└── regressionlogs/                     # TensorBoard logs for regression model training
    └── fit/20260204-221255/
        ├── train/
        └── validation/

📊 Dataset

File: Churn_Modelling.csv

The dataset contains customer banking information with the following features:

  • CreditScore: Customer's credit score
  • Geography: Customer's country (France, Germany, Spain)
  • Gender: Customer's gender (Male/Female)
  • Age: Customer's age in years
  • Tenure: Years of customer relationship with the bank
  • Balance: Account balance
  • NumOfProducts: Number of products the customer uses
  • HasCrCard: Whether customer has a credit card (0/1)
  • IsActiveMember: Whether customer is active (0/1)
  • EstimatedSalary: Estimated annual salary
  • Exited: Target variable - whether customer churned (0/1)

🚀 Quick Start

Installation

  1. Install required dependencies:
pip install -r requirements.txt

Running the Web Application

Launch the Streamlit app for real-time churn predictions:

streamlit run app.py

The app will open in your browser at http://localhost:8501 with an interactive interface to predict customer churn.

📚 Notebooks Overview

1. experiments.ipynb

Initial data exploration and preprocessing notebook.

  • Loads the Churn_Modelling.csv dataset
  • Data cleaning (removes non-feature columns: RowNumber, CustomerId, Surname)
  • Label encoding for categorical gender variable
  • One-hot encoding for geography feature
  • Basic data analysis and visualization

2. hyperparametertuningann.ipynb

Focuses on finding optimal ANN architecture and hyperparameters.

Key Steps:

  • Data loading and preprocessing (same as experiments.ipynb)
  • Train-test split (80-20 with random_state=42)
  • Feature scaling using StandardScaler
  • ANN model creation with configurable hidden layers
  • Grid search/Random search for optimal hyperparameters
  • Model training with early stopping
  • Model evaluation and performance metrics
  • Saves trained model as model.h5
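The split and scaling steps above can be sketched in plain Python, so the example runs without the project's dependencies. This is an illustration only: the notebook presumably uses scikit-learn's train_test_split and StandardScaler, the helper names and the sample Age values are hypothetical, and the shuffle here will not reproduce scikit-learn's split for random_state=42. Fitting the scaling statistics on the training rows only is an assumption that matches the order of the steps listed above.

```python
import random
import statistics


def split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle with a fixed seed, then split into (train, test) lists."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_fraction)
    test = [rows[i] for i in indices[:n_test]]
    train = [rows[i] for i in indices[n_test:]]
    return train, test


def fit_standardizer(train_values):
    """Compute mean and population standard deviation from training data only."""
    mean = statistics.fmean(train_values)
    std = statistics.pstdev(train_values)
    if std == 0:
        std = 1.0  # avoid division by zero for constant columns
    return mean, std


def standardize(values, mean, std):
    """Apply previously fitted statistics to any set of values."""
    return [(v - mean) / std for v in values]


ages = [22, 35, 41, 58, 29, 47, 63, 31, 38, 52]  # hypothetical Age column
train_ages, test_ages = split_rows(ages)
mean, std = fit_standardizer(train_ages)
train_scaled = standardize(train_ages, mean, std)
test_scaled = standardize(test_ages, mean, std)  # reuses the training statistics
```

Reusing the training mean and standard deviation for the test rows keeps information about the test set out of the preprocessing step.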

Architecture Guidelines:

  • Start with 1-2 hidden layers
  • Hidden neurons: between input and output layer size
  • Use the EarlyStopping callback to limit overfitting
  • Save the encoders and scaler for deployment
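The idea behind early stopping can be sketched as a small patience counter over validation losses. This is a simplified illustration, not Keras's EarlyStopping implementation; the function name, the patience values, and the loss history are hypothetical, since the README does not state which metric is monitored or which patience the notebook uses.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=0.0):
    """Return (stop_epoch, best_epoch) as 0-based indices.

    Training stops once the validation loss has failed to improve by more
    than min_delta for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Ran through every epoch without exhausting the patience budget.
    return len(val_losses) - 1, best_epoch


# Hypothetical validation-loss history.
history = [0.60, 0.50, 0.45, 0.46, 0.47, 0.44, 0.48, 0.49, 0.50]
stop_epoch, best_epoch = early_stopping_epoch(history, patience=3)
```

With this history and a patience of 3, the loop stops at the last epoch and reports epoch 5 as the best one; a patience of 2 would stop earlier and miss that later improvement, which is why the patience value is worth tuning.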

3. prediction.ipynb

Demonstrates how to use the trained churn classification model.

Process:

  • Loads pre-trained model and preprocessors (model.h5, scaler, encoders)
  • Creates sample input data with customer features
  • Encodes categorical variables (Gender, Geography)
  • Scales features using saved StandardScaler
  • Makes churn predictions with probability scores
  • Interprets predictions (probability > 0.5 = likely to churn; otherwise unlikely)
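The interpretation step can be written as a small helper. This is a sketch: the function name and the message strings are hypothetical, and only the 0.5 threshold comes from the description above.

```python
def interpret_churn(probability, threshold=0.5):
    """Map a predicted churn probability to a human-readable message."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability > threshold:
        return "The customer is likely to churn."
    return "The customer is not likely to churn."


message = interpret_churn(0.73)
```

A probability of exactly 0.5 falls into the "not likely" branch here, because the comparison is strict.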

4. salaryregression.ipynb

Develops an ANN model for salary prediction (regression task).

Process:

  • Loads and preprocesses Churn_Modelling.csv data
  • Drops irrelevant columns and encodes categorical features
  • Splits data into training and testing sets
  • Scales features using StandardScaler
  • Creates ANN regression model architecture
  • Trains model with appropriate loss function (MSE for regression)
  • Evaluates performance using regression metrics
  • Saves trained regression model as regression_model.h5

🤖 Streamlit Web Application (app.py)

The interactive web application provides a user-friendly interface for churn prediction.

Features:

  • Input Controls:
    • Geography dropdown (from training data)
    • Gender dropdown (Male/Female)
    • Age slider (18-92 years)
    • Credit Score input field
    • Balance input field
    • Estimated Salary input field
    • Tenure slider (0-10 years)
    • Number of Products slider (1-4)
    • Credit Card status (Yes/No)
    • Active Member status (Yes/No)

Prediction Output:

  • Displays churn probability (0-1 scale)
  • Provides interpretation: "likely to churn" if probability > 0.5
  • Clear, intuitive user experience

📦 Dependencies

| Package | Version | Purpose |
|---|---|---|
| TensorFlow | 2.15.0 | Deep learning framework |
| Keras | Latest | Neural network API (included in TensorFlow) |
| Pandas | Latest | Data manipulation and analysis |
| NumPy | Latest | Numerical computations |
| scikit-learn | Latest | Preprocessing and model evaluation |
| Streamlit | Latest | Web application framework |
| TensorBoard | Latest | Model training visualization |
| scikit-learn-keras | Latest | Scikit-learn wrapper for Keras |
| Matplotlib | Latest | Data visualization |

🔧 Model Files

| File | Purpose | Format |
|---|---|---|
| model.h5 | Trained ANN classifier for churn prediction | Binary format |
| regression_model.h5 | Trained ANN regressor for salary prediction | Binary format |
| label_encoder_gender.pkl | LabelEncoder for gender (Male/Female → 0/1) | Pickle format |
| onehot_encoder_geo.pkl | OneHotEncoder for geography (3 countries → 3 features) | Pickle format |
| scaler.pkl | StandardScaler for feature normalization | Pickle format |

📈 Model Training & Evaluation

Churn Classification Model

  • Input Features: 12 after preprocessing (9 numeric or label-encoded columns plus 3 one-hot Geography columns)
  • Output: Binary (churned or not)
  • Activation: ReLU for hidden layers, Sigmoid for output
  • Loss Function: Binary Crossentropy
  • Optimizer: Adam
  • Metrics: Accuracy, Precision, Recall, AUC
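To make the output activation and loss concrete, here is a plain-Python sketch of the sigmoid function and the binary crossentropy averaged over a batch. It illustrates the formulas only; in the project these are provided by Keras, and the clipping constant `eps` is an assumption added to avoid taking log(0).

```python
import math


def sigmoid(z):
    """Numerically stable logistic function mapping any real z into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep the logs finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


labels = [1, 0, 1, 0]                    # hypothetical Exited values
confident_right = [0.9, 0.1, 0.8, 0.2]   # hypothetical predictions
confident_wrong = [0.1, 0.9, 0.2, 0.8]
loss_good = binary_crossentropy(labels, confident_right)
loss_bad = binary_crossentropy(labels, confident_wrong)
```

The loss is small when predicted probabilities agree with the labels and grows quickly for confident mistakes, which is what drives the sigmoid output toward calibrated churn probabilities.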

Salary Regression Model

  • Input Features: 11 (after preprocessing)
  • Output: Continuous value (salary)
  • Activation: ReLU for hidden layers, Linear for output
  • Loss Function: Mean Squared Error (MSE)
  • Optimizer: Adam
  • Metrics: MAE, RMSE, R-squared
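The three regression metrics listed above can be computed by hand as follows. This is a minimal sketch with hypothetical salary values; the notebook may instead rely on Keras or scikit-learn for these numbers.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, and R-squared for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R-squared is undefined when the true values have no variance.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mae": mae, "rmse": rmse, "r2": r2}


actual = [50000.0, 60000.0, 70000.0, 80000.0]      # hypothetical salaries
predicted = [52000.0, 58000.0, 73000.0, 79000.0]   # hypothetical model output
metrics = regression_metrics(actual, predicted)
```

MAE and RMSE are in the same units as the salary, while R-squared is unitless and compares the model against always predicting the mean.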

📊 TensorBoard Logs

Two sets of training logs are available for visualization:

Churn Model (logs/fit/20260203-094807/):

  • Training and validation metrics from model optimization

Regression Model (regressionlogs/fit/20260204-221255/):

  • Training and validation metrics for salary prediction

View logs using:

tensorboard --logdir=logs/fit
# or
tensorboard --logdir=regressionlogs/fit

πŸ” Data Preprocessing Pipeline

  1. Data Cleaning: Remove non-feature columns (RowNumber, CustomerId, Surname)
  2. Categorical Encoding:
    • Gender: LabelEncoder (Male=1, Female=0)
    • Geography: OneHotEncoder (creates 3 binary features for 3 countries)
  3. Train-Test Split: 80% training, 20% testing with a fixed random seed (random_state=42)
  4. Feature Scaling: StandardScaler (mean=0, std=1) applied to all numerical features after the split
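The encoding steps can be illustrated without scikit-learn by building the feature vector for one customer by hand. This is a sketch under assumptions: the column order (one-hot Geography columns appended last) and the helper name are hypothetical, while the Male=1/Female=0 mapping and the three countries come from the lists above.

```python
GEOGRAPHIES = ["France", "Germany", "Spain"]   # one indicator column per country
GENDER_CODES = {"Female": 0, "Male": 1}        # same mapping as the LabelEncoder above

# Assumed column order before the one-hot Geography columns are appended.
FEATURE_ORDER = [
    "CreditScore", "Gender", "Age", "Tenure", "Balance",
    "NumOfProducts", "HasCrCard", "IsActiveMember", "EstimatedSalary",
]


def encode_customer(record):
    """Turn one raw customer record into a flat list of numeric features."""
    if record["Geography"] not in GEOGRAPHIES:
        raise ValueError(f"unknown geography: {record['Geography']!r}")
    row = dict(record)
    row["Gender"] = GENDER_CODES[record["Gender"]]
    features = [float(row[name]) for name in FEATURE_ORDER]
    features += [1.0 if record["Geography"] == g else 0.0 for g in GEOGRAPHIES]
    return features


customer = {  # hypothetical input values
    "CreditScore": 600, "Geography": "Germany", "Gender": "Female",
    "Age": 40, "Tenure": 3, "Balance": 60000.0, "NumOfProducts": 2,
    "HasCrCard": 1, "IsActiveMember": 1, "EstimatedSalary": 50000.0,
}
vector = encode_customer(customer)
```

With the ten feature columns from the Dataset section, the encoded vector has 12 entries: nine numeric or label-encoded values followed by three Geography indicators. Scaling would then be applied to this vector before prediction.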

💾 Saving and Loading Models

Save a model:

model.save('model_name.h5')

Load a model:

import tensorflow as tf
model = tf.keras.models.load_model('model_name.h5')

Save encoders and scalers:

import pickle

with open('encoder_name.pkl', 'wb') as file:
    pickle.dump(encoder, file)

Load encoders and scalers:

with open('encoder_name.pkl', 'rb') as file:
    encoder = pickle.load(file)
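A quick round-trip check confirms that a pickled object comes back unchanged. In this sketch a plain dictionary stands in for a fitted encoder or scaler, and a temporary directory is used so no project files are touched; both are assumptions made for illustration.

```python
import os
import pickle
import tempfile

# Stand-in for a fitted encoder or scaler (hypothetical content).
encoder = {"classes": ["Female", "Male"]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "encoder_name.pkl")
    with open(path, "wb") as file:
        pickle.dump(encoder, file)
    with open(path, "rb") as file:
        restored = pickle.load(file)
```

Because unpickling can execute arbitrary code, only load .pkl files that come from a trusted source.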

🎯 Use Cases

  1. Customer Retention: Identify at-risk customers and implement retention strategies
  2. Resource Planning: Allocate customer service resources based on churn risk
  3. Salary Negotiation: Estimate salary ranges for compensation planning
  4. Risk Assessment: Evaluate customer lifetime value and engagement

πŸ“ Notes

  • All models use consistent preprocessing (same encoders and scaler)
  • The train-test split uses a fixed seed (random_state=42), so the same split is reproduced on every run
  • Early stopping helps limit overfitting during training
  • Predictions require properly encoded and scaled inputs
  • The Streamlit app handles all preprocessing automatically

🔗 Further Improvements

  • Implement cross-validation for more robust evaluation
  • Experiment with different architectures (deeper networks, dropout layers)
  • Add feature importance analysis
  • Implement SHAP values for model interpretability
  • Add batch prediction capability to the web app
  • Create visualization dashboards in Streamlit

📄 License

This project is a part of a deep learning .


Last Updated: February 5, 2026
Project Type: Deep Learning Classification & Regression
Framework: TensorFlow/Keras
Interface: Streamlit Web App
