khushi2704rj-sephora/Consumer-segmentation-using-python

🛍️ Customer Segmentation & Value-Based Profiling

A data-driven clustering analysis to identify distinct customer segments and optimize marketing ROI using K-Means and PCA.


🎯 Summary · 📊 Insights · 🏗️ Methodology · 📂 Structure · 🚀 Quick Start · 🤝 Contributing


🎯 Executive Summary

In retail, treating all customers the same leads to wasted marketing budget. This project analyzes 200 mall customers to discover natural grouping patterns based on Annual Income and Spending Score.

Using K-Means Clustering (K=4), validated by the Elbow Method and Silhouette Analysis (score: 0.55), we identified four actionable segments.

Business Impact: The "Premium" segment comprises 21% of customers but generates 48% of total revenue. Segment-specific strategies are worth an estimated £70K in annual uplift across all four segments, of which the Premium VIP strategy accounts for £25K.


📊 Key Insights

pie title Customer Segments vs. Revenue Impact
    "Premium (High Income, High Spend)" : 48
    "Budget (Low Income, Low Spend)" : 12
    "Occasional (Low Income, High Spend)" : 15
    "Mid-Range (Middle Income, Middle Spend)" : 25
| Segment | Risk Profile | Strategy | Revenue Uplift |
|---|---|---|---|
| Premium | Low Risk | VIP Concierge: Private events, early access | +£25K |
| Occasional | Medium Risk | Upsell: "Spend £X get Y" offers to increase frequency | +£18K |
| Mid-Range | Low Risk | Loyalty Program: Points systems to retain stability | +£15K |
| Budget | High Risk | Value Bundles: Clearance sales and BOGO offers | +£12K |
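As a quick sanity check, the per-segment uplift figures in the table can be tallied against the £70K headline estimate. The sketch below only re-adds the numbers quoted in this README; the variable names are illustrative.

```python
# Per-segment uplift estimates in thousands of GBP, copied from the table above.
uplift_gbp_k = {
    "Premium": 25,
    "Occasional": 18,
    "Mid-Range": 15,
    "Budget": 12,
}

# Total estimated annual uplift across all four segments.
total_uplift_gbp_k = sum(uplift_gbp_k.values())

# Each segment's share of the total uplift, as a fraction of 1.
uplift_share = {
    segment: value / total_uplift_gbp_k for segment, value in uplift_gbp_k.items()
}

print(f"Total uplift: £{total_uplift_gbp_k}K")  # Total uplift: £70K
```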

🏗️ Methodology

flowchart LR
    A["📦 Raw Data\n200 Customers"] --> B["🧹 Preprocessing\nStandardScaler"]
    B --> C["📊 Elbow Method\nOptimal K"]
    C --> D["🎯 K-Means\nClustering\nK=4"]
    D --> E["🔍 Silhouette\nValidation\nScore=0.55"]
    E --> F["📉 PCA\n2D Visualization"]
    F --> G["💰 Business\nRecommendations"]

1. Data Preprocessing

  • Dataset: Mall Customers (200 records)
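  • Features: Age, Gender, Annual Income (k$), Spending Score (1-100); segments are formed on Annual Income and Spending Score.
  • Transformation: StandardScaler to standardize features (zero mean, unit variance) so that no single feature dominates the Euclidean distances used by K-Means.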
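The scaling step can be sketched as follows. This is a minimal illustration on a few made-up rows, not the notebook's actual code; the values are assumptions chosen only to show the effect of `StandardScaler`.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy rows (assumption, not the real dataset):
# column 0 = annual income in k$, column 1 = spending score (1-100).
X = np.array([
    [15.0, 39.0],
    [40.0, 60.0],
    [78.0, 17.0],
    [120.0, 79.0],
])

# StandardScaler applies (x - column mean) / column std to each column.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# After scaling, each column has mean ~0 and standard deviation ~1,
# so both features contribute on a comparable scale to Euclidean distance.
col_means = X_scaled.mean(axis=0)
col_stds = X_scaled.std(axis=0)
print(col_means.round(6), col_stds.round(6))
```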

2. Clustering Algorithm

  • Algorithm: K-Means Clustering
  • K-Selection:
    • Elbow Method: Identified inflection point at K=4.
    • Silhouette Score: Peaked at 0.55 for K=4 (on a scale of -1 to 1), indicating reasonably well-separated clusters.
  • Dimensionality Reduction: PCA (Principal Component Analysis) used for 2D visualization of clusters.
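The K-selection and visualization steps above might look roughly like the sketch below. It runs on synthetic stand-in data from `make_blobs` (an assumption, with three features so the 2D projection is not trivial), so the scores it produces will not match the 0.55 reported for the real dataset. The range of K values and the random seeds are also illustrative choices.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 200 points around four well-separated
# centers in three features. This is NOT the Mall Customers dataset.
centers = np.array(
    [[-5, -5, -5], [-5, 5, 5], [5, -5, 5], [5, 5, -5]], dtype=float
)
X_raw, _ = make_blobs(
    n_samples=200, centers=centers, cluster_std=0.6, random_state=0
)
X = StandardScaler().fit_transform(X_raw)

# Fit K-Means for a range of K and record both selection criteria.
ks = list(range(2, 9))
inertias, silhouettes = [], []
for k in ks:
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(model.inertia_)  # within-cluster sum of squares (elbow curve)
    silhouettes.append(silhouette_score(X, model.labels_))

# Choose the K with the highest mean silhouette score.
best_k = ks[int(np.argmax(silhouettes))]

# Refit with the chosen K, then project to 2D with PCA for plotting.
final_model = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(X)
coords_2d = PCA(n_components=2).fit_transform(X)

print("chosen K:", best_k)
```

Plotting `inertias` against `ks` gives the elbow curve, and a scatter plot of `coords_2d` colored by `final_model.labels_` gives the 2D cluster view.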

3. Cluster Profiles

| Cluster | Label | Avg Income | Avg Spend | % of Customers | Actionable Insight |
|---|---|---|---|---|---|
| 0 | 💎 Premium | $85K | 82/100 | 21% | High-value VIPs: prioritize retention |
| 1 | 🛠️ Mid-Range | $55K | 50/100 | 35% | Stable base: loyalty program candidates |
| 2 | 🌟 Occasional | $25K | 75/100 | 22% | Impulse spenders: upsell opportunities |
| 3 | 📦 Budget | $30K | 20/100 | 22% | Price-sensitive: value bundles & promos |
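A profile table of this shape can be derived from cluster labels with a pandas group-by. The sketch below uses a handful of made-up rows and a hypothetical `cluster` column, so its numbers are illustrative only and do not reproduce the figures above.

```python
import pandas as pd

# Toy rows (assumption, not the real dataset). The "cluster" column is a
# hypothetical name for the K-Means label assigned to each customer.
df = pd.DataFrame({
    "Annual Income (k$)": [90, 70, 60, 50, 20, 30, 28, 32],
    "Spending Score (1-100)": [80, 90, 55, 45, 70, 80, 15, 25],
    "cluster": [0, 0, 1, 1, 2, 2, 3, 3],
})

# Average income and spending score per cluster, plus the cluster size.
profile = df.groupby("cluster").agg(
    avg_income=("Annual Income (k$)", "mean"),
    avg_spend=("Spending Score (1-100)", "mean"),
    n_customers=("Annual Income (k$)", "size"),
)

# Share of all customers that falls into each cluster, in percent.
profile["pct_customers"] = (
    100 * profile["n_customers"] / profile["n_customers"].sum()
)

print(profile)
```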

📂 Repository Structure

Consumer-segmentation-using-python/
│
├── data/
│   └── Mall_Customers.csv        ← Raw dataset (200 records)
│
├── notebooks/
│   └── customer_segmentation.ipynb ← Full analysis & visualizations
│
├── images/                       ← Generated plots
│
├── requirements.txt              ← Python dependencies
├── LICENSE                       ← MIT License
└── README.md                     ← Project documentation

🚀 Quick Start

Prerequisites

  • Python 3.8+

Installation

git clone https://github.com/khushi2704rj-sephora/Consumer-segmentation-using-python.git
cd Consumer-segmentation-using-python

# Install dependencies
pip install -r requirements.txt

Usage

Open the Jupyter Notebook to explore the analysis:

jupyter notebook notebooks/customer_segmentation.ipynb

🤝 Contributing

Contributions are welcome! Please check the Contribution Guidelines and Code of Conduct.


👩‍💻 Author

Khushi Kothari

MSc Business Analytics · Customer Analytics & Segmentation
