This project analyzes a global sales dataset (Classic Cars, Planes, Ships). It demonstrates a full data pipeline, from cleaning and normalization to advanced SQL querying and predictive business insights.
Dataset: Sample Sales Data (Kaggle) - 2,823 transactions covering the period 2003-2005.
If the interactive charts do not load, you can view the static preview above.
| Tool | Application |
|---|---|
| Python | Data cleaning (Pandas), Visualization (Plotly, Matplotlib, Seaborn) |
| SQL (SQLite) | Relational database modeling, CTEs, Window Functions, Complex Joins |
| Analytics | RFM Segmentation, Cohort Analysis, Sales Forecasting |
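
To illustrate the kind of query the SQL row refers to, here is a minimal sketch that combines a CTE with a window function in SQLite. It runs against a toy in-memory table; the `orders` table and its columns (`customer`, `order_month`, `sales`) are assumptions made for this example and are not the actual schema of `sales_database.db`. Window functions require SQLite 3.25 or newer.

```python
import sqlite3

# Toy in-memory database. Table and column names are hypothetical,
# chosen only for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, customer TEXT, order_month TEXT, sales REAL);
INSERT INTO orders VALUES
  (1, 'A', '2003-01', 100.0),
  (2, 'B', '2003-01', 50.0),
  (3, 'A', '2003-02', 200.0),
  (4, 'B', '2003-03', 25.0);
""")

query = """
WITH monthly AS (                      -- CTE: revenue per month
    SELECT order_month, SUM(sales) AS revenue
    FROM orders
    GROUP BY order_month
)
SELECT order_month,
       revenue,
       SUM(revenue) OVER (ORDER BY order_month) AS running_revenue  -- window function
FROM monthly
ORDER BY order_month;
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)

conn.close()
```

The CTE keeps the monthly aggregation separate from the running total, so each step can be read and checked on its own.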
- Relational Modeling: Normalized raw data into a structured SQLite database (`Products`, `Customers`, `Orders`).
- Data Integrity: Cleaned missing values and standardized categorical fields for accurate reporting.
- Market Dominance: Identified Classic Cars as the primary revenue driver, particularly in the USA, Spain, and France.
- Seasonality: Detected significant sales peaks in Q4 annually, driven by year-end promotions.
- Top Performers: Isolated "Euro Shopping Channel" as the lead customer ($912K+ Revenue).
- Customer Behavior: Performed RFM (Recency, Frequency, Monetary) segmentation to categorize loyal vs. at-risk customers.
- Retention: Created a Cohort Analysis Heatmap to track customer lifecycles and repeat purchase rates.
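
The RFM segmentation mentioned above can be sketched roughly as follows with Pandas. This is a minimal illustration on toy data, not the notebook's code: the column names (`customer`, `order_date`, `sales`) and the thresholds used for the "loyal" label (last order within 90 days, at least 2 orders) are assumptions made for this example.

```python
import pandas as pd

# Toy transactions. Column names are hypothetical, for illustration only.
tx = pd.DataFrame({
    "customer": ["A", "A", "B", "C", "C", "C"],
    "order_date": pd.to_datetime([
        "2005-01-10", "2005-05-01", "2003-03-15",
        "2004-11-20", "2005-02-02", "2005-05-20",
    ]),
    "sales": [1000.0, 1500.0, 300.0, 800.0, 700.0, 900.0],
})

# Reference date for recency: one day after the last order in the data.
snapshot = tx["order_date"].max() + pd.Timedelta(days=1)

# One row per customer: days since last order, order count, total spend.
rfm = tx.groupby("customer").agg(
    recency=("order_date", lambda d: (snapshot - d.max()).days),
    frequency=("order_date", "count"),
    monetary=("sales", "sum"),
)

# Assumed rule of thumb: recent and repeat buyers are "loyal", the rest "at-risk".
is_loyal = (rfm["recency"] <= 90) & (rfm["frequency"] >= 2)
rfm["segment"] = is_loyal.map({True: "loyal", False: "at-risk"})

print(rfm)
```

On a real dataset, quantile-based scores (for example with `pd.qcut`) are a common alternative to fixed thresholds, since they adapt to the distribution of each metric.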
- 📝 `Sales_Dashboard.ipynb` → Full Python code, data cleaning, and visualizations.
- 📊 `sales_data_cleaned.csv` → The processed and cleaned dataset used for analysis.
- 🗄️ `sales_database.db` → Normalized relational database file.
- 🖼️ `monthly_sales_trend.png` → Sales performance over time.
- 🖼️ `cohort_analysis_heatmap.png` → Customer retention visualization.
- 🖼️ `dashboard.png` → Screenshot of the final interactive dashboard.
- Clone this repository to your local machine.
- Open `Sales_Dashboard.ipynb` in Google Colab or Jupyter Notebook.
- Ensure the `sales_data_cleaned.csv` file is in the same directory.
- Run all cells sequentially to generate the interactive visualizations.