NilayS08/Retail-Data-Analytics


🛒 Retail Sales Data Analytics & Business Insights System

An end-to-end data analytics project that cleans, analyzes, and visualizes 50,000+ retail transaction records, featuring SQL-based KPI extraction and an interactive Streamlit dashboard.


📌 Project Overview

This project simulates a real-world retail analytics pipeline, taking raw superstore sales data from ingestion all the way to an executive-ready dashboard. It covers data cleaning, exploratory data analysis, SQL querying, and interactive visualization.


🗂️ Project Structure

Retail-Data-Analytics/
│
├── data/                    # Raw and cleaned CSV + SQLite database
├── outputs/
│   ├── charts/              # Saved EDA charts
│   └── executive_summary.txt
├── src/
│   ├── data_cleaning.py     # Data ingestion, cleaning, feature engineering
│   ├── eda.py               # Exploratory data analysis + chart generation
│   ├── sql_analysis.py      # SQLite setup + KPI queries
│   └── insights.py          # Automated executive summary generation
├── dashboard/
│   └── app.py               # Streamlit interactive dashboard
├── main.py                  # Run full pipeline in one command
└── requirements.txt

⚙️ Tech Stack

| Tool | Purpose |
|------|---------|
| Python / Pandas | Data cleaning & EDA |
| SQLite / SQLAlchemy | SQL-based KPI extraction |
| Matplotlib / Seaborn | Static chart generation |
| Plotly | Interactive dashboard charts |
| Streamlit | Web-based dashboard UI |

🚀 Getting Started

1. Clone the repository

git clone https://github.com/NilayS08/Retail-Data-Analytics.git
cd Retail-Data-Analytics

2. Install dependencies

pip install -r requirements.txt

3. Add the dataset

Download the Superstore Sales dataset from Kaggle and place the CSV at:

data/Superstore.csv

4. Run the full pipeline

python main.py

5. Launch the dashboard

streamlit run dashboard/app.py

📊 Key Features

  • Data Cleaning — Fixed date formats, removed duplicates, engineered features like profit_margin and ship_days
  • EDA — Monthly revenue trends, category/regional breakdowns, discount vs. profit scatter analysis, top customer rankings
  • SQL Queries — KPIs extracted using GROUP BY, subqueries, COUNT(DISTINCT), and conditional aggregations
  • Streamlit Dashboard — Fully interactive with sidebar filters by year, region, and category
  • Executive Summary — Auto-generated plain-English business insights saved to file
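The cleaning and feature-engineering step can be illustrated with a minimal pandas sketch. The column names (order_date, ship_date, sales, profit) and the function name clean_and_engineer are assumptions made for illustration; the actual headers in the Kaggle CSV and the logic in src/data_cleaning.py may differ.

```python
import pandas as pd


def clean_and_engineer(df: pd.DataFrame) -> pd.DataFrame:
    """Parse dates, drop duplicate rows, and add ship_days and profit_margin."""
    out = df.drop_duplicates().copy()

    # Assumed column names; unparseable dates become NaT instead of raising.
    out["order_date"] = pd.to_datetime(out["order_date"], errors="coerce")
    out["ship_date"] = pd.to_datetime(out["ship_date"], errors="coerce")

    # Whole days between placing the order and shipping it.
    out["ship_days"] = (out["ship_date"] - out["order_date"]).dt.days

    # Profit as a fraction of sales; zero-sales rows yield NaN rather than inf.
    out["profit_margin"] = out["profit"] / out["sales"].where(out["sales"] != 0)
    return out


# Tiny made-up sample: the second row is an exact duplicate of the first.
sample = pd.DataFrame({
    "order_date": ["2017-01-03", "2017-01-03", "2017-02-10"],
    "ship_date": ["2017-01-07", "2017-01-07", "2017-02-12"],
    "sales": [200.0, 200.0, 50.0],
    "profit": [50.0, 50.0, -5.0],
})
cleaned = clean_and_engineer(sample)
print(cleaned[["ship_days", "profit_margin"]])
```

Dividing by a masked sales column keeps zero-revenue rows from producing infinite margins, which would otherwise distort averages in later analysis.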

📷 Dashboard Preview

[Image: Dashboard preview]
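The filtering behind the dashboard's year, region, and category controls can be sketched as a plain pandas function. The function name apply_filters, the column names, and the sample rows are hypothetical; this is not taken from dashboard/app.py.

```python
import pandas as pd


def apply_filters(df, years=None, regions=None, categories=None):
    """Keep rows matching every non-empty selection; an empty selection means no filter."""
    mask = pd.Series(True, index=df.index)
    if years:
        mask &= df["order_date"].dt.year.isin(years)
    if regions:
        mask &= df["region"].isin(regions)
    if categories:
        mask &= df["category"].isin(categories)
    return df[mask]


# Made-up rows using assumed column names.
orders = pd.DataFrame({
    "order_date": pd.to_datetime(["2016-05-01", "2017-03-15", "2017-08-20"]),
    "region": ["West", "East", "West"],
    "category": ["Furniture", "Technology", "Technology"],
    "sales": [100.0, 250.0, 75.0],
})
filtered = apply_filters(orders, years=[2017], regions=["West"])
print(filtered["sales"].sum())
```

In a Streamlit app the three selections would typically come from widgets such as st.sidebar.multiselect, with the filtered frame passed on to the Plotly charts. Whether the project wires it exactly this way is an assumption.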


📁 Dataset

  • Source: Superstore Sales — Kaggle
  • Size: 9,000+ transaction records
  • Features: Order dates, customer segments, product categories, regional data, sales, profit, and discount
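With fields like these, a KPI query combining GROUP BY, COUNT(DISTINCT), and a conditional aggregation might look like the sketch below. It uses Python's built-in sqlite3 module with an in-memory database rather than SQLAlchemy to stay self-contained; the table name orders, the column names, and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        order_id TEXT, customer_id TEXT, region TEXT,
        sales REAL, profit REAL, discount REAL
    )
""")
# Made-up rows; the real table would be loaded from the cleaned CSV.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("O1", "C1", "West", 100.0, 20.0, 0.0),
        ("O2", "C1", "West", 200.0, -10.0, 0.3),
        ("O3", "C2", "East", 150.0, 30.0, 0.1),
    ],
)

# Per-region KPIs: unique customers, total revenue, and count of loss-making orders.
kpi_sql = """
    SELECT region,
           COUNT(DISTINCT customer_id) AS customers,
           SUM(sales) AS revenue,
           SUM(CASE WHEN profit < 0 THEN 1 ELSE 0 END) AS loss_orders
    FROM orders
    GROUP BY region
    ORDER BY revenue DESC
"""
rows = conn.execute(kpi_sql).fetchall()
for row in rows:
    print(row)
conn.close()
```

The CASE expression inside SUM counts only rows where profit is negative, which is one common way to express a conditional aggregation in SQLite.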

👤 Author

Nilay Srivastava
GitHub: NilayS08
