This project analyzes an eCommerce transactions dataset to derive business insights, build a lookalike model, and perform customer segmentation. The goal is to demonstrate data science capabilities in exploratory data analysis (EDA), recommendation systems, and clustering techniques.
- Perform EDA on the dataset to uncover trends, patterns, and actionable insights.
  - Deliverables:
    - Python code (Jupyter Notebook).
    - PDF report summarizing 5 key business insights.
- Build a recommendation system that suggests 3 similar customers based on profile and transaction history.
  - Deliverables:
    - CSV file containing lookalike recommendations for the first 20 customers.
    - Python code (Jupyter Notebook) for model development.
- Perform clustering to segment customers based on profiles and transactions.
  - Deliverables:
    - Report with clustering results, including the number of clusters and evaluation metrics (e.g., the Davies-Bouldin (DB) Index).
    - Python code (Jupyter Notebook) for clustering.
The project uses three files:
- `Customers.csv`:
  - `CustomerID`: Unique customer identifier.
  - `CustomerName`: Name of the customer.
  - `Region`: Customer's region.
  - `SignupDate`: Date of signup.
- `Products.csv`:
  - `ProductID`: Unique product identifier.
  - `ProductName`: Name of the product.
  - `Category`: Product category.
  - `Price`: Product price (USD).
- `Transactions.csv`:
  - `TransactionID`: Unique transaction identifier.
  - `CustomerID`: Customer who made the transaction.
  - `ProductID`: Product involved in the transaction.
  - `TransactionDate`: Date of transaction.
  - `Quantity`: Quantity purchased.
  - `TotalValue`: Total value of the transaction (USD).
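The three tables share `CustomerID` and `ProductID`, so they can be joined into one row per transaction before any analysis. The following is a minimal sketch with pandas, not the notebooks' actual loading code. The helper name `load_and_merge` and the sample rows are invented for illustration; only the column names come from the descriptions above.

```python
import io

import pandas as pd

# Invented sample rows that only mirror the documented column names.
CUSTOMERS_CSV = """CustomerID,CustomerName,Region,SignupDate
C0001,Alice Example,Asia,2023-01-15
C0002,Bob Example,Europe,2023-03-02
"""
PRODUCTS_CSV = """ProductID,ProductName,Category,Price
P001,Sample Book,Books,10.00
P002,Sample Lamp,Home Decor,25.50
"""
TRANSACTIONS_CSV = """TransactionID,CustomerID,ProductID,TransactionDate,Quantity,TotalValue
T00001,C0001,P001,2024-01-05,2,20.00
T00002,C0001,P002,2024-02-10,1,25.50
T00003,C0002,P002,2024-02-11,3,76.50
"""


def load_and_merge(customers_src, products_src, transactions_src):
    """Read the three tables and join them into one row per transaction.

    Hypothetical helper: each argument may be a file path or a file-like object.
    """
    customers = pd.read_csv(customers_src, parse_dates=["SignupDate"])
    products = pd.read_csv(products_src)
    transactions = pd.read_csv(transactions_src, parse_dates=["TransactionDate"])

    # Left joins keep every transaction; validate= raises if a key is
    # duplicated on the customer or product side.
    merged = transactions.merge(
        customers, on="CustomerID", how="left", validate="many_to_one"
    )
    merged = merged.merge(
        products, on="ProductID", how="left", validate="many_to_one"
    )
    return merged


merged = load_and_merge(
    io.StringIO(CUSTOMERS_CSV),
    io.StringIO(PRODUCTS_CSV),
    io.StringIO(TRANSACTIONS_CSV),
)
print(merged[["TransactionID", "CustomerName", "ProductName", "TotalValue"]])
```

With the real data, the same function could be called with the paths `data/Customers.csv`, `data/Products.csv`, and `data/Transactions.csv` instead of the in-memory samples.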
├── data/
│ ├── Customers.csv
│ ├── Products.csv
│ ├── Transactions.csv
├── notebooks/
│ ├── FirstName_LastName_EDA.ipynb
│ ├── FirstName_LastName_Lookalike.ipynb
│ ├── FirstName_LastName_Clustering.ipynb
├── reports/
│ ├── FirstName_LastName_EDA.pdf
│ ├── FirstName_LastName_Clustering.pdf
├── results/
│ ├── Lookalike.csv
├── README.md
- Python 3.7+
- Required libraries:
  - pandas
  - numpy
  - matplotlib
  - seaborn
  - scikit-learn
- Clone the repository:

      git clone https://github.com/<username>/ecommerce-analysis.git
      cd ecommerce-analysis

- Install required libraries:

      pip install -r requirements.txt

- Upload the datasets to the `data/` directory.
- Navigate to the `notebooks/` directory.
- Open and execute the `FirstName_LastName_EDA.ipynb` notebook to perform EDA and generate insights.
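As an illustration of the kind of aggregation an EDA notebook typically runs, the sketch below computes revenue by region, by category, and by month. It is not taken from the notebook itself. The rows are invented, and the `merged` frame is assumed to be a join of the three CSV files.

```python
import pandas as pd

# Invented rows using the documented column names; not real project data.
merged = pd.DataFrame(
    {
        "CustomerID": ["C0001", "C0001", "C0002", "C0003"],
        "Region": ["Asia", "Asia", "Europe", "Europe"],
        "Category": ["Books", "Home Decor", "Home Decor", "Books"],
        "TransactionDate": pd.to_datetime(
            ["2024-01-05", "2024-02-10", "2024-02-11", "2024-02-20"]
        ),
        "Quantity": [2, 1, 3, 1],
        "TotalValue": [20.0, 25.5, 76.5, 10.0],
    }
)

# Total revenue per region and per category, largest first.
revenue_by_region = (
    merged.groupby("Region")["TotalValue"].sum().sort_values(ascending=False)
)
revenue_by_category = (
    merged.groupby("Category")["TotalValue"].sum().sort_values(ascending=False)
)

# Revenue per calendar month, in chronological order.
monthly_revenue = merged.groupby(merged["TransactionDate"].dt.to_period("M"))[
    "TotalValue"
].sum()

print(revenue_by_region)
print(revenue_by_category)
print(monthly_revenue)
```

Each resulting Series can be passed directly to `Series.plot` or to seaborn for charts in the report.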
- Open and execute the `FirstName_LastName_Lookalike.ipynb` notebook.
- Check the output file `Lookalike.csv` in the `results/` directory.
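This README does not state which similarity measure the lookalike notebook uses. One common approach is cosine similarity over standardized per-customer features, sketched below under that assumption. The function `top_lookalikes`, the feature columns (`total_spend`, `n_transactions`, `avg_quantity`), the output column names, and the sample values are all hypothetical.

```python
import numpy as np
import pandas as pd


def top_lookalikes(features: pd.DataFrame, k: int = 3) -> pd.DataFrame:
    """Return the k most cosine-similar customers for each row of `features`.

    Hypothetical helper: `features` is indexed by CustomerID and holds only
    numeric columns.
    """
    values = features.to_numpy(dtype=float)

    # Standardize each column so no single feature dominates the similarity.
    std = values.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    scaled = (values - values.mean(axis=0)) / std

    # Cosine similarity is the dot product of unit-length rows.
    norms = np.linalg.norm(scaled, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # guard against all-zero rows
    unit = scaled / norms
    sim = unit @ unit.T

    # A customer must never be recommended as their own lookalike.
    np.fill_diagonal(sim, -np.inf)

    rows = []
    for i, customer_id in enumerate(features.index):
        best = np.argsort(-sim[i])[:k]  # indices of the k highest scores
        for j in best:
            rows.append(
                {
                    "CustomerID": customer_id,
                    "LookalikeID": features.index[j],
                    "Score": round(float(sim[i, j]), 4),
                }
            )
    return pd.DataFrame(rows)


# Invented per-customer aggregates for illustration only.
features = pd.DataFrame(
    {
        "total_spend": [100.0, 110.0, 900.0, 950.0, 500.0],
        "n_transactions": [2, 2, 10, 11, 5],
        "avg_quantity": [1.0, 1.5, 3.0, 3.5, 2.0],
    },
    index=["C0001", "C0002", "C0003", "C0004", "C0005"],
)

result = top_lookalikes(features, k=3)
print(result)
```

To cover only the first 20 customers, the result could be filtered by `CustomerID` before being written out with `DataFrame.to_csv`.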
- Open and execute the `FirstName_LastName_Clustering.ipynb` notebook.
- Review the clustering report in `reports/`.
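The report evaluates clusters with the DB (Davies-Bouldin) Index, where lower values indicate more compact, better-separated clusters. The sketch below shows one way to compare candidate cluster counts with scikit-learn. It assumes K-Means, which this README does not confirm as the notebook's algorithm, and runs on synthetic two-column features invented for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for per-customer features (for example total spend and
# transaction count): three groups of 30 invented points each.
rng = np.random.default_rng(0)
centers = np.array([[100.0, 2.0], [500.0, 6.0], [900.0, 12.0]])
features = np.vstack(
    [center + rng.normal(scale=[20.0, 0.5], size=(30, 2)) for center in centers]
)

# Scale first: K-Means uses Euclidean distance, so unscaled spend values
# would otherwise dominate the transaction counts.
scaled = StandardScaler().fit_transform(features)

# Fit K-Means for several cluster counts and record the DB Index of each.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(scaled)
    scores[k] = davies_bouldin_score(scaled, labels)

# Lower DB Index is better, so pick the k with the smallest score.
best_k = min(scores, key=scores.get)

for k, score in scores.items():
    print(f"k={k}: DB Index = {score:.3f}")
print("Selected number of clusters:", best_k)
```

The DB Index alone can favor different values of k than other metrics such as the silhouette score, so it is worth checking the selected segmentation against the business context as well.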
- EDA: Uncovered trends in customer behavior, product sales, and revenue.
- Lookalike Model: Recommended 3 similar customers for each of the first 20 customers.
- Clustering: Segmented customers into distinct groups with detailed metrics.
- Kanishkar V
- Email: kanishvijay2005@gmail.com
- LinkedIn: www.linkedin.com/in/kanishkar-v-3471782a2/
This project is licensed under the MIT License. See the LICENSE file for details.