Market Basket Analysis (BI Data Analysis Project)

Project Overview

This project analyzes the Instacart dataset to uncover patterns in customer purchasing behavior, product reorders, and basket composition. Using the relational data from multiple tables, the project demonstrates advanced data cleaning, feature engineering, exploratory data analysis, and predictive modeling.

Dashboard example
Apriori algorithm results for bananas, showing which products are frequently bought together with them
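
To make the "frequently bought together" idea concrete, here is a minimal sketch of the support, confidence, and lift metrics that Apriori-style rules report. It is not the project's implementation: it only counts item pairs rather than running full Apriori, and the `pair_rules` function, the `min_support` threshold, and the toy baskets are all hypothetical, chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, target, min_support=0.2):
    """Score every item that co-occurs with `target` (pairs only, not full Apriori)."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate and fix the pair ordering
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        if target not in (a, b):
            continue
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        other = b if a == target else a
        confidence = count / item_counts[target]  # P(other | target)
        lift = confidence / (item_counts[other] / n)  # > 1 means positive association
        rules.append(
            {"item": other, "support": support, "confidence": confidence, "lift": lift}
        )
    return sorted(rules, key=lambda r: r["lift"], reverse=True)


# Toy baskets invented for this example; they are not taken from the dataset.
toy_baskets = [
    ["Banana", "Organic Strawberries", "Milk"],
    ["Banana", "Organic Strawberries"],
    ["Banana", "Milk"],
    ["Milk", "Bread"],
    ["Banana", "Organic Strawberries", "Bread"],
]

rules = pair_rules(toy_baskets, "Banana")
for rule in rules:
    print(rule)
```

A lift above 1 suggests the item appears with bananas more often than chance alone would predict. On the full dataset a dedicated frequent-itemset library would likely be a better fit than this pair counter.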

Dataset

The dataset is sourced from

The raw files are large and are not tracked in this repository, so you must download them yourself and place the CSV files in the /data/raw folder.
Several prepared Power BI-ready tables summarizing product demand, user behavior, and department performance are also exported to the 'data' folder for interactive analytics.

Table	Description
orders.csv	Customer orders, timestamps, and user info
order_products__prior.csv	Products purchased in prior orders
order_products__train.csv	Products purchased in train set orders (for reorder prediction)
products.csv	Product IDs, names, aisle and department IDs
aisles.csv	Aisle names and IDs
departments.csv	Department names and IDs
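
The tables above link through shared ID columns, so they can be joined into one order-line table. The sketch below shows one way to do that with pandas. It assumes the column names of the public Instacart release (`order_id`, `product_id`, `aisle_id`, `department_id`) and uses small in-memory stand-ins instead of the real CSV files; the values are invented.

```python
import pandas as pd

# Toy stand-ins for the CSV files. In the project these would be loaded with
# pd.read_csv from the data/raw folder; the rows here are illustrative only.
orders = pd.DataFrame(
    {"order_id": [1, 2], "user_id": [10, 11], "order_dow": [0, 3]}
)
order_products = pd.DataFrame(
    {"order_id": [1, 1, 2], "product_id": [100, 101, 100], "reordered": [0, 1, 1]}
)
products = pd.DataFrame(
    {
        "product_id": [100, 101],
        "product_name": ["Banana", "Whole Milk"],
        "aisle_id": [24, 84],
        "department_id": [4, 16],
    }
)
aisles = pd.DataFrame({"aisle_id": [24, 84], "aisle": ["fresh fruits", "milk"]})
departments = pd.DataFrame(
    {"department_id": [4, 16], "department": ["produce", "dairy eggs"]}
)

# Start from the order lines and attach each lookup table with a left join.
# validate="many_to_one" raises if a lookup key is unexpectedly duplicated.
merged = (
    order_products
    .merge(orders, on="order_id", how="left", validate="many_to_one")
    .merge(products, on="product_id", how="left", validate="many_to_one")
    .merge(aisles, on="aisle_id", how="left", validate="many_to_one")
    .merge(departments, on="department_id", how="left", validate="many_to_one")
)

print(merged[["order_id", "user_id", "product_name", "aisle", "department"]])
```

Left joins keep every order line even when a lookup is missing, which makes gaps visible as NaN values instead of silently dropping rows.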

Project Goals

Clean and merge multiple relational tables to create an integrated dataset.
Conduct Exploratory Data Analysis (EDA) to understand customer purchasing habits.
Generate insights into product reorders, basket composition, and departmental trends.
Build a predictive model for product reordering behavior.
Develop interactive dashboards for visualization and decision support (Power BI/Tableau).

Tools & Technologies

Python (pandas, numpy, matplotlib, seaborn, scikit-learn)
SQL (joins, aggregations, feature engineering)
Jupyter Notebook for interactive exploration
Power BI (or Tableau) for dashboards

Key Features / Deliverables

Data cleaning scripts for handling missing values, duplicates, and inconsistent timestamps.
Merged and enriched datasets ready for analysis.
Correlation matrices and summary statistics for key features.
Predictive modeling to identify reorder likelihood for products.
Interactive dashboards highlighting product, aisle, and department trends over time.
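
As an illustration of the kind of features a reorder model could use, the sketch below computes a per-product reorder rate and a per-user purchase count with pandas. It is an assumption about what such features might look like, not the project's actual pipeline. It presumes the order lines have already been joined to orders so that `user_id` is available, and the `prior` frame holds invented toy rows.

```python
import pandas as pd

# Toy order lines, invented for this example. `reordered` is 1 when the user
# had bought the product in an earlier order, as in the Instacart files.
prior = pd.DataFrame(
    {
        "user_id": [1, 1, 1, 2, 2],
        "product_id": [100, 100, 101, 100, 102],
        "reordered": [0, 1, 0, 0, 0],
    }
)

# Product-level features: how often a product is ordered and reordered.
product_features = (
    prior.groupby("product_id")
    .agg(times_ordered=("reordered", "size"), reorder_rate=("reordered", "mean"))
    .reset_index()
)

# User-product features: how many times each user bought each product.
user_product_features = (
    prior.groupby(["user_id", "product_id"])
    .size()
    .rename("times_bought")
    .reset_index()
)

print(product_features)
print(user_product_features)
```

Features like these could then be fed to a scikit-learn classifier, with the `reordered` flag from the train set as the target.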

Project Structure

├── data/
│ ├── raw/          # Original Instacart CSV files downloaded here (not tracked/large file size)
│ └── clean/        # Cleaned and merged dataset (not tracked/large file size)
├── images/         # Screenshots
├── notebooks/      # Jupyter notebooks for analysis
├── scripts/        # Python scripts for ETL and modeling
├── reports/        # Charts, plots, and dashboard
└── README.md       # Project documentation
