ariesiitr/Credit-Card-Risk-Prediction

Credit Card Behaviour Score Prediction Using Classification and Risk-Based Techniques

Overview

This project aims to strengthen a bank's credit risk management by predicting whether a credit card customer will default on their payment in the following month. Using anonymized historical behavioral data from roughly 30,000 customers, we build a binary classification model that flags likely defaulters early, enabling proactive risk mitigation.

Objective

  • Build a classification model to predict next_month_default (1 = Default, 0 = No Default).
  • Perform exploratory and financial analysis to identify behavioral patterns associated with credit default.
  • Handle class imbalance using techniques like SMOTE, class weights, or undersampling.
  • Engineer meaningful features (e.g., utilization ratio, delinquency streaks).
  • Evaluate models using risk-sensitive metrics such as F1, F2, and ROC-AUC.
  • Generate production-ready predictions on an unlabeled validation set.
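
The class-weight option listed in the objectives can be sketched as follows. This is a minimal illustration rather than the project's actual code: the label array `y` is made up, and the "balanced" heuristic from scikit-learn is assumed as the weighting scheme.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical labels: 8 non-defaulters (0) and 2 defaulters (1).
y = np.array([0] * 8 + [1] * 2)

# "balanced" weights are n_samples / (n_classes * class_count),
# so the rarer class receives the larger weight.
classes = np.array([0, 1])
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y)
class_weight = {int(c): float(w) for c, w in zip(classes, weights)}
print(class_weight)  # {0: 0.625, 1: 2.5}
```

A dictionary like this can be passed as the `class_weight` argument of scikit-learn estimators such as `LogisticRegression`. Resampling approaches such as SMOTE change the training data instead of the loss weighting.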

Dataset Description

  • Training Data: ~25,000 records with features like:
    • LIMIT_BAL: Credit limit
    • sex, age, education, marriage: Demographic info
    • PAY_0 to PAY_6: Repayment status history
    • BILL_AMT1 to BILL_AMT6: Monthly billed amounts
    • PAY_AMT1 to PAY_AMT6: Monthly payments
    • next_month_default: Target variable
  • Validation Data: ~5,000 records with the same features but no labels.

Key Features Engineered

  • AVG_Bill_amt: Average of all billed amounts over 6 months
  • PAY_TO_BILL_ratio: Total payment divided by total billed amount
  • Delinquency Streak: Consecutive months of overdue payments
  • Utilization Ratio: Ratio of total bill amount to credit limit
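
The features above could be computed along these lines. This is a sketch under stated assumptions, not the notebook's implementation: the helper names (`longest_streak`, `add_engineered_features`) are hypothetical, a repayment status greater than 0 is assumed to mean "overdue", the streak is taken as the longest run of overdue months, and a zero total bill is mapped to a ratio of 0.

```python
import numpy as np
import pandas as pd

def longest_streak(flags):
    """Length of the longest run of consecutive True values."""
    best = run = 0
    for flag in flags:
        run = run + 1 if flag else 0
        best = max(best, run)
    return best

def add_engineered_features(df, pay_cols, bill_cols, pay_amt_cols):
    """Return a copy of df with the four engineered columns added."""
    out = df.copy()
    total_bill = out[bill_cols].sum(axis=1)
    total_paid = out[pay_amt_cols].sum(axis=1)
    out["AVG_Bill_amt"] = out[bill_cols].mean(axis=1)
    # Guard against division by zero when nothing was billed.
    out["PAY_TO_BILL_ratio"] = (total_paid / total_bill.replace(0, np.nan)).fillna(0.0)
    out["utilization_ratio"] = total_bill / out["LIMIT_BAL"]
    # Assumption: status > 0 means the payment was overdue that month.
    overdue = (out[pay_cols] > 0).to_numpy()
    out["delinquency_streak"] = [longest_streak(row) for row in overdue]
    return out

# Tiny made-up example with three months instead of six.
demo = pd.DataFrame({
    "LIMIT_BAL": [1000, 500],
    "PAY_A": [1, 0], "PAY_B": [2, -1], "PAY_C": [0, 0],
    "BILL_AMT1": [100, 0], "BILL_AMT2": [200, 0], "BILL_AMT3": [300, 0],
    "PAY_AMT1": [50, 0], "PAY_AMT2": [50, 0], "PAY_AMT3": [50, 0],
})
features = add_engineered_features(
    demo,
    pay_cols=["PAY_A", "PAY_B", "PAY_C"],
    bill_cols=["BILL_AMT1", "BILL_AMT2", "BILL_AMT3"],
    pay_amt_cols=["PAY_AMT1", "PAY_AMT2", "PAY_AMT3"],
)
```

The column lists are passed in explicitly so the same helper works whatever the exact repayment-status column names are in the real data.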

Modeling Techniques

We compared multiple models to identify the best-performing one:

  • Logistic Regression
  • Decision Tree Classifier
  • XGBoost
  • LightGBM
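
A comparison of this kind might be set up as in the sketch below. It is illustrative only: the data is synthetic (an assumed 80/20 class split), the hyperparameters are arbitrary, and only the two scikit-learn models are included to keep the example dependency-light. XGBoost and LightGBM classifiers could be added to the same dictionary.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real training data (assumed class split).
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.8, 0.2], random_state=0
)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000, class_weight="balanced"),
    "decision_tree": DecisionTreeClassifier(
        max_depth=4, class_weight="balanced", random_state=0
    ),
}

# Stratified folds keep the default rate similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
results = {
    name: cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
    for name, model in models.items()
}
for name, auc in results.items():
    print(f"{name}: mean ROC-AUC = {auc:.3f}")
```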

Evaluation Strategy

Because a missed defaulter (false negative) is typically costlier to a lender than an unnecessary alert (false positive), evaluation emphasized recall-sensitive metrics:

  • Primary Metrics: F1 Score, F2 Score (which weights recall more heavily than precision), and ROC-AUC
  • Threshold Tuning: The decision threshold was adjusted to balance business implications of false positives (unnecessary alerts) vs. false negatives (missed defaulters).
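
Threshold tuning against the F2 score can be sketched as a simple grid search. The function name `best_f2_threshold`, the threshold grid, and the toy labels and probabilities are assumptions made for illustration. The project's actual tuning procedure is not specified here.

```python
import numpy as np
from sklearn.metrics import fbeta_score

def best_f2_threshold(y_true, y_prob, thresholds=None):
    """Return (threshold, F2) for the grid threshold with the highest F2."""
    y_prob = np.asarray(y_prob)
    if thresholds is None:
        thresholds = np.linspace(0.05, 0.95, 19)  # 0.05, 0.10, ..., 0.95
    scores = [
        fbeta_score(y_true, (y_prob >= t).astype(int), beta=2, zero_division=0)
        for t in thresholds
    ]
    i = int(np.argmax(scores))  # first maximum wins on ties
    return float(thresholds[i]), float(scores[i])

# Toy example: two defaulters among five customers.
y_true = np.array([0, 0, 0, 1, 1])
y_prob = np.array([0.12, 0.22, 0.41, 0.37, 0.83])
threshold, f2 = best_f2_threshold(y_true, y_prob)
print(f"threshold={threshold:.2f}, F2={f2:.3f}")  # threshold=0.25, F2=0.909
```

With `beta=2`, recall counts more than precision, so the selected threshold tends to sit below 0.5. In practice the threshold should be chosen on held-out data rather than on the data used to fit the model.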

Business Implications

Accurate prediction of customer defaults helps in:

  • Reducing credit losses
  • Designing early-warning systems
  • Optimizing credit exposure
  • Improving risk-based pricing and customer segmentation

What We Achieved

  • Jupyter Notebook with:
    • Data processing
    • Exploratory data analysis (EDA)
    • Financial insights
    • Feature engineering
    • Model training and evaluation
    • Handling class imbalance
    • Final predictions

Tools and Libraries Used

  • Python
  • pandas, numpy
  • matplotlib, seaborn
  • scikit-learn, imbalanced-learn
  • xgboost, lightgbm
