PyData Project Repository

This project analyzes ICC Cricket World Cup statistics through data preparation, transformation, analysis, machine learning model deployment, and an interactive dashboard.

📁 Case Study

The project is based on the case study: [ICC Cricket World Cup Stats]

Objective

Description

The ICC Cricket World Cup is one of the most highly regarded events in cricket, with teams from around the world competing for the title. This case study explores past ICC Cricket World Cup data to gain insight, find trends, and predict outcomes. By preparing, transforming, analyzing, and visualizing cricket statistics, the project tells data-driven stories and explores how machine learning models can provide deeper insights.

Problem Statement

Cricket generates an enormous amount of data: player performances, match results, team strategies, and historical statistics. However, much of this data exists in raw and unstructured form, making it difficult to extract actionable insights.

The main problems we try to solve are the following:

- Data Preparation Challenges: Cleaning, merging, and transforming raw datasets into a state ready for analysis.
- Identifying Key Insights: Analyzing player and team performance metrics to uncover trends and outliers.
- Model Deployment: Identifying a reliable Hugging Face model that can make predictions or classify aspects of the cricket statistics.
- Visualization and Storytelling: Developing an interactive dashboard that communicates findings effectively and lets users explore them.

Objective

The goal is to develop a comprehensive yet user-friendly platform that integrates statistical analysis, machine learning, and visualization to give insight into ICC Cricket World Cup performances. This platform will:

- Allow efficient processing and analysis of data.
- Predict trends or outcomes using AI/ML models.
- Visualize data interactively, enhancing understanding and engagement for cricket enthusiasts and analysts.

🛠️ Project Structure

|-- data/                  # Folder containing datasets
|-- notebooks/
|   |-- Task_2.ipynb       # Data preparation and analysis notebook
|   |-- Task_3.ipynb       # NLP model notebook
|-- dashboard/
|   |-- app.py             # Plotly Dash dashboard script
|-- README.md              # Project documentation
|-- requirements.txt       # Python dependencies

🧑‍💻 Tasks

Task 1: Git Basics

Repository maintained with proper branch management.
Meaningful commits and conflict-free main branch.
Includes:
- At least 2 commits per member.
- One completed pull request per branch.

Task 2: Data Preparation and Analysis

Notebook: Task_2.ipynb
Key Steps:
- Data cleaning (removal of duplicates and handling missing values).
- Data transformation (pivoting and grouping).
- Insights and explanations documented using Markdown.
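
The key steps above can be sketched with pandas roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`team`, `year`, `player`, `runs`), the sample rows, and the choice of median imputation are all assumptions made for the example, and the real datasets in data/ may be shaped differently.

```python
import pandas as pd

# Hypothetical sample rows; the real columns in data/ may differ.
raw = pd.DataFrame(
    {
        "team": ["India", "India", "Australia", "Australia", "Australia", "England"],
        "year": [2011, 2011, 2015, 2015, 2019, 2019],
        "player": ["A", "A", "B", "C", "B", "D"],
        "runs": [120.0, 120.0, 85.0, None, 60.0, 45.0],
    }
)

# Data cleaning: drop exact duplicate rows, then fill missing runs
# with the column median (one possible strategy among several).
clean = raw.drop_duplicates().copy()
clean["runs"] = clean["runs"].fillna(clean["runs"].median())

# Grouping: total runs per team.
runs_by_team = clean.groupby("team")["runs"].sum()

# Pivoting: teams as rows, years as columns, summed runs as values.
runs_pivot = clean.pivot_table(
    index="team", columns="year", values="runs", aggfunc="sum", fill_value=0
)

print(runs_by_team)
print(runs_pivot)
```

Dropping duplicates before imputing keeps repeated rows from skewing the median used to fill the gaps.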

Task 3: NLP with Hugging Face

Notebook: Task_3.ipynb
Objective:
- Deploy a suitable Hugging Face model for [insert NLP task: e.g., sentiment analysis].
- Validate the model's reliability and relevance to the dataset.
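
One way to approach the validation step is to compare the model's predictions with a small hand-labelled sample. The sketch below assumes sentiment analysis as the task (the README leaves the task open) and uses the `pipeline("sentiment-analysis")` factory from the `transformers` library. The helper names, the sample sentences, and the `keyword_stub` stand-in classifier are hypothetical and exist only so the example runs without downloading a model.

```python
from typing import Callable, Sequence


def load_sentiment_pipeline():
    """Load the default Hugging Face sentiment pipeline (downloads a model on first use)."""
    from transformers import pipeline  # lazy import: the helpers below work without it

    return pipeline("sentiment-analysis")


def predict_labels(classifier: Callable, texts: Sequence[str]) -> list:
    """Run the classifier and keep only the predicted label for each text."""
    return [result["label"] for result in classifier(list(texts))]


def accuracy(predicted: Sequence[str], expected: Sequence[str]) -> float:
    """Fraction of predictions that match the hand-assigned labels."""
    if not expected or len(predicted) != len(expected):
        raise ValueError("predicted and expected must be non-empty and the same length")
    matches = sum(p == e for p, e in zip(predicted, expected))
    return matches / len(expected)


def keyword_stub(texts):
    """Hypothetical stand-in that mimics the pipeline's output format."""
    return [
        {"label": "POSITIVE" if "win" in text.lower() else "NEGATIVE", "score": 1.0}
        for text in texts
    ]


# Hypothetical hand-labelled sample; replace with text drawn from the project data.
labelled = [
    ("A brilliant century sealed the win", "POSITIVE"),
    ("The batting collapsed again", "NEGATIVE"),
    ("A comfortable win in the final", "POSITIVE"),
    ("Dropped catches cost them the match", "NEGATIVE"),
    ("Rain denied them a certain win", "NEGATIVE"),
]
texts = [text for text, _ in labelled]
expected = [label for _, label in labelled]

predicted = predict_labels(keyword_stub, texts)
score = accuracy(predicted, expected)
print(f"Agreement with hand labels: {score:.0%}")
```

To check a real model, pass `load_sentiment_pipeline()` in place of `keyword_stub`. The label strings a model returns depend on the model, so the expected labels may need adjusting to match.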

Task 4: Visualization Dashboard

Script: dashboard/app.py
Features:
- At least 5 chart types.
- Interactive filters for dynamic exploration of data.
- Clear storytelling through visualizations.
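
As an illustration of how an interactive filter can drive a chart in Plotly Dash, here is a minimal sketch. It is not the contents of app.py: the sample data, the column names, the component ids, and the `filter_by_team` and `create_app` helpers are assumptions, and it shows a single chart type rather than the five the task requires. It assumes a recent Dash 2.x release, where `dash.Dash`, `dcc`, `html`, `Input`, and `Output` can be imported from the `dash` package.

```python
import pandas as pd

# Hypothetical sample data; app.py would load the datasets from data/ instead.
matches = pd.DataFrame(
    {
        "team": ["India", "India", "Australia", "Australia", "England"],
        "year": [2011, 2015, 2015, 2019, 2019],
        "runs": [2100, 1850, 2300, 1990, 2250],
    }
)


def filter_by_team(df, team):
    """Return the rows for one team, or all rows when no team is selected."""
    if team is None:
        return df
    return df[df["team"] == team]


def create_app():
    """Build a Dash app with one dropdown filter and one bar chart."""
    # Imported here so the filtering logic above can be used without Dash installed.
    from dash import Dash, Input, Output, dcc, html
    import plotly.express as px

    app = Dash(__name__)
    app.layout = html.Div(
        [
            dcc.Dropdown(
                id="team-filter",
                options=[{"label": t, "value": t} for t in sorted(matches["team"].unique())],
                placeholder="All teams",
            ),
            dcc.Graph(id="runs-by-year"),
        ]
    )

    @app.callback(Output("runs-by-year", "figure"), Input("team-filter", "value"))
    def update_chart(team):
        # Redraw the chart whenever the dropdown selection changes.
        return px.bar(filter_by_team(matches, team), x="year", y="runs", color="team")

    return app


# To serve the sketch locally (Dash listens on port 8050 by default):
# create_app().run(debug=True)
```

Keeping the filtering in a plain function, separate from the callback, makes it possible to check the filter logic without starting the server.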

🛠️ Setup Instructions

1. Clone the Repository

git clone https://github.com/[your-username]/[repo-name].git
cd [repo-name]

2. Install Dependencies

pip install -r requirements.txt

3. Run the Dashboard

Navigate to the dashboard/ directory and execute:

python app.py

Access the dashboard at http://localhost:8050.

📊 Deliverables

Public GitHub Repository - Link: Repository Link
Screen Recording - A demonstration of the dashboard's functionality.
PowerPoint Presentation - A 5-slide summary of the project.

🗂️ Datasets

The datasets for this project are located in the data/ folder.
Data was sourced from [insert source, if applicable].

🧑‍🤝‍🧑 Team Members

Senidu Ravihara - Dashboard
Tharidu Thilakarathna - Dashboard
Tharidu Nimsara - NLP model
Kavishka Venuka - NLP model
Dulan Jeewantha - Data preparation: handling outliers and missing values
Amantha Sandun - Removing duplicates
Piyumi Madushika - Editing the README file and removing null records
W.P Sudasun - Adding new columns to the DataFrame
Ravidu Yehan - Concatenating files

📝 License

This project is licensed under the MIT License.

SeniduRavihara/PyData_Assessment
