This project develops a machine-learning–driven churn prediction solution for grocery e-retailers. It analyses customer behaviour, experience indicators, complaints, payment preferences, and tenure patterns to identify early signals of churn and generate data-backed recommendations for improving customer retention.
The goal is to help e-retail organisations understand why customers leave, which customers are at highest risk, and what interventions can increase loyalty and lifetime value.
The project uses the E-commerce Customer Churn Analysis and Prediction dataset published by Ankit Verma (2021), Product at Ufaber Edutech Pvt Ltd, available on Kaggle: https://www.kaggle.com/datasets/ankitverma2010/ecommerce-customer-churn-analysis-and-prediction
The dataset contains key behavioural and demographic attributes including satisfaction scores, complaint history, payment methods, and customer tenure — all core predictors of churn in digital retail.
Analysis of the dataset revealed the following behavioural and experience-driven insights:
Satisfaction Levels: Approximately 30% of customers rated their satisfaction at 3/5, a middling score that points to lukewarm sentiment and a potential churn warning signal.
Payment Preference: 2,314 customers (41% of the dataset) preferred debit card payments, highlighting a dominant transactional behaviour that retailers can target for personalised incentives.
Customer Tenure: New customers were far more likely to churn than long-standing customers. Longer tenure correlated strongly with lower churn probability.
Complaint Behaviour: Customers who filed complaints demonstrated significantly higher churn rates, reinforcing the importance of quick and effective customer service resolution.
These findings show that churn risk can be anticipated by monitoring satisfaction decline, complaint frequency, and early-tenure behaviour.
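The segment-level comparisons behind these findings can be sketched with pandas. This is a minimal illustration on made-up rows, not the project's actual analysis: the column names `Tenure`, `Complain`, and `Churn` are assumed to match the Kaggle dataset, and the tenure band edges are arbitrary choices for demonstration.

```python
import pandas as pd

# Toy rows standing in for the real dataset; values are invented for illustration.
# Column names are assumptions about the Kaggle file, not confirmed by this project.
df = pd.DataFrame({
    "Tenure":   [0, 1, 2, 3, 8, 12, 15, 20, 24, 30],   # months with the retailer
    "Complain": [1, 0, 1, 0, 1, 0, 0, 1, 0, 0],        # 1 = filed a complaint
    "Churn":    [1, 1, 1, 0, 1, 0, 0, 0, 0, 0],        # 1 = churned
})

# Band tenure so early-tenure customers can be compared with long-standing ones.
# The band edges (6 and 12 months) are hypothetical.
df["TenureBand"] = pd.cut(
    df["Tenure"],
    bins=[-1, 6, 12, float("inf")],
    labels=["0-6", "7-12", "13+"],
)

# The mean of a 0/1 churn flag is the churn rate within each group.
churn_by_complaint = df.groupby("Complain")["Churn"].mean()
churn_by_tenure = df.groupby("TenureBand", observed=True)["Churn"].mean()

print(churn_by_complaint)
print(churn_by_tenure)
```

On the real data, the same two group-bys would show whether complainants and early-tenure customers churn at higher rates than the rest of the customer base.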
The analytical workflow combines:
Python — for data preprocessing, feature engineering, model development, and evaluation
SQL — for structured querying and behavioural trend analysis
Power BI — for visual analytics, churn drivers, and segment-level insights
This reflects a standard multi-tool pipeline used by industry data science teams.
Model comparison chart
The best-performing churn prediction model is the Random Forest (200 trees), as the comparison with the other models in the figure above illustrates.
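A comparison of this kind could be set up along the following lines with scikit-learn. This is a sketch under stated assumptions, not the project's actual training code: it uses synthetic data from `make_classification` rather than the Kaggle dataset, logistic regression is an assumed baseline, and ROC AUC is an assumed metric. Only the 200-tree Random Forest setting comes from the text above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for churn data (NOT the Kaggle dataset).
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5,
    weights=[0.83, 0.17], random_state=42,
)

# Stratify so the train and test splits keep the same churn proportion.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Candidate models; the baseline choice is an assumption for illustration.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest (200 trees)": RandomForestClassifier(
        n_estimators=200, random_state=42
    ),
}

# Fit each model and score it on held-out data using predicted churn probability.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    churn_prob = model.predict_proba(X_test)[:, 1]
    scores[name] = roc_auc_score(y_test, churn_prob)

# Report models from highest to lowest ROC AUC.
for name, auc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: ROC AUC = {auc:.3f}")
```

Scores from this synthetic run say nothing about the project's reported results; the point is only the shape of a like-for-like comparison on a shared held-out split.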
The proposed retention framework draws on the following personalisation strategies:
Personalised voice-assisted shopping
Customised social-commerce experiences
Emotion-aware product recommendations
Personalised gifting suggestions
Immersive AR/VR shopping experiences
Loyalty and reward-driven retention programmes
These strategies align with modern retail personalisation standards and help retailers convert churn insights into meaningful action.
This work demonstrates how churn prediction can be integrated into customer-centric operations to significantly improve retention outcomes in grocery e-retailing.
Key recommendations include:
Retailers should replace generic retention strategies with data-driven, personalised interventions tailored to customer needs and behavioural patterns.
Organisations should monitor and react to churn indicators such as declining satisfaction, complaints, and reduced engagement.
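As an illustration of what monitoring these indicators might look like in practice, the sketch below flags customers using simple rules. Everything in it is hypothetical: the `CustomerSnapshot` fields, the `churn_flags` helper, and the 30-day inactivity threshold are assumptions chosen for the example, and a production system would more likely combine such rules with the model's predicted churn probability.

```python
from dataclasses import dataclass


@dataclass
class CustomerSnapshot:
    """Hypothetical per-customer summary; field names are illustrative only."""
    customer_id: str
    satisfaction_prev: int      # earlier satisfaction score (1-5)
    satisfaction_now: int       # latest satisfaction score (1-5)
    complaints_last_90d: int    # complaints filed in the last 90 days
    days_since_last_order: int


def churn_flags(c: CustomerSnapshot, max_idle_days: int = 30) -> list[str]:
    """Return the churn indicators a customer currently triggers."""
    flags = []
    if c.satisfaction_now < c.satisfaction_prev:
        flags.append("declining_satisfaction")
    if c.complaints_last_90d > 0:
        flags.append("recent_complaint")
    if c.days_since_last_order > max_idle_days:  # assumed engagement threshold
        flags.append("reduced_engagement")
    return flags


customers = [
    CustomerSnapshot("A", 4, 3, 1, 10),
    CustomerSnapshot("B", 4, 5, 0, 5),
    CustomerSnapshot("C", 3, 3, 0, 45),
]

# Map each customer to the indicators they trigger.
results = {c.customer_id: churn_flags(c) for c in customers}
for customer_id, flags in results.items():
    print(customer_id, flags or "no flags")
```

Customers who trigger one or more flags could then be routed to the personalised interventions described above.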
The proposed retention framework provides a practical, scalable way for e-retailers to embed churn prediction into CRM systems, customer engagement platforms, and loyalty initiatives.
By operationalising this model, grocery e-retailers can proactively address churn drivers, improve customer lifetime value, and strengthen competitive advantage in the digital marketplace.