🛒 Retail Sales Data Analytics & Business Insights System

An end-to-end data analytics project that cleans, analyzes, and visualizes 9,000+ retail transaction records, with SQL-based KPI extraction and an interactive Streamlit dashboard.

📌 Project Overview

This project simulates a real-world retail analytics pipeline, taking raw Superstore sales data from ingestion through to an executive-ready dashboard. It covers data cleaning, exploratory data analysis, SQL querying, and interactive visualization.

🗂️ Project Structure

Retail-Data-Analytics/
│
├── data/                    # Raw and cleaned CSV + SQLite database
├── outputs/
│   ├── charts/              # Saved EDA charts
│   └── executive_summary.txt
├── src/
│   ├── data_cleaning.py     # Data ingestion, cleaning, feature engineering
│   ├── eda.py               # Exploratory data analysis + chart generation
│   ├── sql_analysis.py      # SQLite setup + KPI queries
│   └── insights.py          # Automated executive summary generation
├── dashboard/
│   └── app.py               # Streamlit interactive dashboard
├── main.py                  # Run full pipeline in one command
└── requirements.txt

⚙️ Tech Stack

Tool	Purpose
Python / Pandas	Data cleaning & EDA
SQLite / SQLAlchemy	SQL-based KPI extraction
Matplotlib / Seaborn	Static chart generation
Plotly	Interactive dashboard charts
Streamlit	Web-based dashboard UI

🚀 Getting Started

1. Clone the repository

git clone https://github.com/your-username/Retail-Data-Analytics.git
cd Retail-Data-Analytics

2. Install dependencies

pip install -r requirements.txt

3. Add the dataset

Download the Superstore Sales dataset from Kaggle and place the CSV at:

data/Superstore.csv

4. Run the full pipeline

python main.py

5. Launch the dashboard

streamlit run dashboard/app.py

📊 Key Features

Data Cleaning — Fixed date formats, removed duplicates, engineered features like profit_margin and ship_days
EDA — Monthly revenue trends, category/regional breakdowns, discount vs. profit scatter analysis, top customer rankings
SQL Queries — KPIs extracted using GROUP BY, subqueries, COUNT(DISTINCT), and conditional aggregations
Streamlit Dashboard — Fully interactive with sidebar filters by year, region, and category
Executive Summary — Auto-generated plain-English business insights saved to file
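
As a rough illustration of the SQL Queries feature, the sketch below combines GROUP BY, a subquery, COUNT(DISTINCT), and a conditional aggregation in one KPI query. It is not the code in src/sql_analysis.py: the `sales` table, its column names, and the sample rows are assumptions made for this example, and it uses Python's built-in sqlite3 module rather than SQLAlchemy so that it runs without extra dependencies.

```python
import sqlite3

# In-memory database with a hypothetical `sales` table (schema is an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE sales (
        order_id    TEXT,
        customer_id TEXT,
        region      TEXT,
        category    TEXT,
        sales       REAL,
        profit      REAL,
        discount    REAL
    )
    """
)

# Made-up rows for illustration only; they are not taken from the real dataset.
rows = [
    ("O-1", "C-1", "West", "Furniture", 200.0, 40.0, 0.0),
    ("O-2", "C-1", "West", "Technology", 300.0, -30.0, 0.3),
    ("O-3", "C-2", "West", "Office Supplies", 100.0, 20.0, 0.0),
    ("O-4", "C-3", "East", "Technology", 400.0, 100.0, 0.1),
]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

KPI_QUERY = """
SELECT
    region,
    ROUND(SUM(sales), 2)                               AS revenue,
    COUNT(DISTINCT customer_id)                        AS customers,
    -- conditional aggregation: count line items sold at a loss
    SUM(CASE WHEN profit < 0 THEN 1 ELSE 0 END)        AS loss_making_lines,
    -- subquery: each region's share of total revenue
    ROUND(100.0 * SUM(sales) /
          (SELECT SUM(sales) FROM sales), 1)           AS revenue_share_pct
FROM sales
GROUP BY region
ORDER BY revenue DESC
"""

kpis = conn.execute(KPI_QUERY).fetchall()
for row in kpis:
    print(row)

conn.close()
```

Each result row holds the region, its revenue, the number of distinct customers, the count of loss-making line items, and the region's percentage share of total revenue.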

📷 Dashboard Preview

📁 Dataset

Source: Superstore Sales — Kaggle
Size: 9,000+ transaction records
Features: Order dates, customer segments, product categories, regional data, sales, profit, and discount
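
The engineered features named under Key Features, profit_margin and ship_days, could be derived from these columns roughly as follows. This is a minimal pandas sketch, not the contents of src/data_cleaning.py: the column names ("Order Date", "Ship Date", "Sales", "Profit"), the MM/DD/YYYY date format, the formulas for both features, and the `clean_and_engineer` helper are all assumptions for illustration.

```python
import pandas as pd


def clean_and_engineer(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: drop duplicates, parse dates, add derived columns."""
    out = df.drop_duplicates().copy()

    # Parse date strings; the MM/DD/YYYY format is an assumption about the CSV.
    for col in ("Order Date", "Ship Date"):
        out[col] = pd.to_datetime(out[col], format="%m/%d/%Y")

    # ship_days: whole days between ordering and shipping (assumed definition).
    out["ship_days"] = (out["Ship Date"] - out["Order Date"]).dt.days

    # profit_margin: profit as a fraction of sales (assumed definition).
    # Zero sales become NaN so the division never divides by zero.
    safe_sales = out["Sales"].where(out["Sales"] != 0)
    out["profit_margin"] = (out["Profit"] / safe_sales).round(4)

    return out.reset_index(drop=True)


# Made-up sample rows; the second row is an exact duplicate of the first.
sample = pd.DataFrame(
    {
        "Order ID": ["A-1", "A-1", "A-2"],
        "Order Date": ["11/08/2016", "11/08/2016", "06/12/2016"],
        "Ship Date": ["11/11/2016", "11/11/2016", "06/16/2016"],
        "Sales": [200.0, 200.0, 50.0],
        "Profit": [50.0, 50.0, -5.0],
    }
)

cleaned = clean_and_engineer(sample)
print(cleaned[["Order ID", "ship_days", "profit_margin"]])
```

On the sample above, the duplicate row is dropped and the two remaining orders get ship_days of 3 and 4. Turning zero sales into NaN rather than raising an error is one possible choice; a real pipeline might instead filter such rows out.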

👤 Author

Nilay Srivastava
GitHub
