This project analyzes ICC Cricket World Cup statistics through data preparation, transformation, analysis, machine learning model deployment, and an interactive dashboard.
The project is based on the case study: [ICC Cricket World Cup Stats]
[Description
The ICC Cricket World Cup is one of the most highly regarded events in cricket, with teams from around the world competing for the title. This case study explores past ICC Cricket World Cup data to gain insights, find trends, and predict outcomes. By preparing, transforming, analyzing, and visualizing cricket statistics, the project tells data-driven stories while exploring how machine learning models can provide deeper insights.

Problem Statement
Cricket generates an enormous amount of data: player performances, match results, team strategies, and historical statistics. However, much of this data exists in raw, unstructured forms, making it difficult to extract actionable insights.

The major problems we try to solve are the following:
- Data Preparation Challenges: Cleaning, merging, and transforming raw datasets into a state ready for analysis.
- Identifying Key Insights: Analyzing player and team performance metrics to uncover trends and outliers.
- Model Deployment: Identifying a reliable Hugging Face model that can make predictions about, or classify aspects of, the cricket statistics.
- Visualization and Storytelling: Developing an interactive dashboard that communicates findings effectively and lets users explore them.

Objective
The goal is to develop a comprehensive yet user-friendly platform that integrates statistical analysis, machine learning, and visualization to give insight into ICC Cricket World Cup performances. This platform will:
- Allow efficient processing and analysis of data.
- Predict trends or outcomes using AI/ML models.
- Visualize data interactively, enhancing understanding and engagement for cricket enthusiasts and analysts.]
|-- data/ # Folder containing datasets
|-- notebooks/
| |-- Task_2.ipynb # Data preparation and analysis notebook
| |-- Task_3.ipynb # NLP model notebook
|-- dashboard/
| |-- app.py # Plotly Dash dashboard script
|-- README.md # Project documentation
|-- requirements.txt # Python dependencies
- Repository maintained with proper branch management.
- Meaningful commits and conflict-free main branch.
- Includes:
  - At least 2 commits per member.
  - One completed pull request per branch.
- Notebook: Task_2.ipynb
- Key Steps:
  - Data cleaning (removal of duplicates and handling missing values).
  - Data transformation (pivoting and grouping).
  - Insights and explanations documented using Markdown.
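
The cleaning and transformation steps listed above can be sketched with pandas as follows. This is a minimal illustration, not the notebook's actual code: the column names (`team`, `year`, `runs`), the placeholder values, and the choice of filling missing values with the per-team median are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical match-level records; column names and values are placeholders,
# not taken from the project's datasets.
raw = pd.DataFrame({
    "team": ["Team A", "Team A", "Team B", "Team B", "Team B"],
    "year": [2015, 2015, 2015, 2019, 2019],
    "runs": [250, 250, 310, None, 280],
})

# Data cleaning: drop exact duplicate rows.
clean = raw.drop_duplicates().copy()

# Data cleaning: fill missing runs with the median for that team
# (an assumed strategy; dropping the rows is an equally valid choice).
team_median = clean.groupby("team")["runs"].transform("median")
clean["runs"] = clean["runs"].fillna(team_median)

# Data transformation (grouping): total runs per team and year.
totals = clean.groupby(["team", "year"], as_index=False)["runs"].sum()

# Data transformation (pivoting): one row per team, one column per year.
pivot = totals.pivot(index="team", columns="year", values="runs")
print(pivot)
```

A team with no matches in a given year shows up as `NaN` in the pivoted table, which is usually preferable to a misleading zero.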
- Notebook: Task_3.ipynb
- Objective:
  - Deploy a suitable Hugging Face model for [insert NLP task: e.g., sentiment analysis].
  - Validate the model's reliability and relevance to the dataset.
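
One possible way to approach the validation step is to score the model against a small set of hand-labeled examples. The sketch below assumes the NLP task is sentiment analysis (the task is still a placeholder above) and uses the `pipeline("sentiment-analysis")` helper from the `transformers` library; the function names `load_sentiment_predictor` and `accuracy` are hypothetical and not taken from the notebook.

```python
from typing import Callable, Iterable, Tuple


def load_sentiment_predictor() -> Callable[[str], str]:
    """Wrap the Hugging Face sentiment-analysis pipeline as text -> label.

    The import is local so the validation helper below can be used
    without transformers installed.
    """
    from transformers import pipeline

    # Downloads the library's default sentiment model on first use.
    classifier = pipeline("sentiment-analysis")

    def predict(text: str) -> str:
        # The pipeline returns a list of dicts with "label" and "score" keys.
        return classifier(text)[0]["label"]

    return predict


def accuracy(
    predict: Callable[[str], str],
    labeled_examples: Iterable[Tuple[str, str]],
) -> float:
    """Return the fraction of (text, expected_label) pairs predicted correctly."""
    examples = list(labeled_examples)
    if not examples:
        raise ValueError("at least one labeled example is required")
    correct = sum(predict(text) == label for text, label in examples)
    return correct / len(examples)
```

Calling `accuracy(load_sentiment_predictor(), examples)` with a list of hand-labeled `(text, label)` pairs drawn from the dataset gives a quick reliability check. Because `accuracy` accepts any callable, a different model can be swapped in without changing the evaluation code.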
- Script: dashboard/app.py
- Features:
  - At least 5 chart types.
  - Interactive filters for dynamic exploration of data.
  - Clear storytelling through visualizations.
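
To illustrate how an interactive filter can drive a chart in Plotly Dash, here is a minimal sketch with one dropdown and one bar chart. It is not the project's `app.py`: the sample data, the column names, and the helpers `filter_by_year` and `build_app` are hypothetical, and a recent Dash 2.x release is assumed.

```python
import pandas as pd

# Placeholder data; the real dashboard would load files from data/ instead.
DF = pd.DataFrame({
    "team": ["Team A", "Team B", "Team A", "Team B"],
    "year": [2015, 2015, 2019, 2019],
    "runs": [250, 310, 275, 290],
})


def filter_by_year(df: pd.DataFrame, year: int) -> pd.DataFrame:
    """Return only the rows for the selected tournament year."""
    return df[df["year"] == year]


def build_app():
    """Create a Dash app with a year dropdown that filters a bar chart."""
    # Imported here so the filtering logic can be reused without Dash installed.
    from dash import Dash, Input, Output, dcc, html
    import plotly.express as px

    app = Dash(__name__)
    years = [int(y) for y in sorted(DF["year"].unique())]
    app.layout = html.Div([
        dcc.Dropdown(id="year", options=years, value=years[-1], clearable=False),
        dcc.Graph(id="runs-chart"),
    ])

    @app.callback(Output("runs-chart", "figure"), Input("year", "value"))
    def update_chart(year):
        # Re-runs whenever the dropdown value changes.
        data = filter_by_year(DF, year)
        return px.bar(data, x="team", y="runs", title=f"Runs by team, {year}")

    return app
```

Calling `build_app().run(debug=True)` serves the app on Dash's default port, 8050. Additional chart types can be added by following the same pattern: one `dcc.Graph` per chart, each updated by a callback that reads the shared filter.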
git clone https://github.com/[your-username]/[repo-name].git
cd [repo-name]
pip install -r requirements.txt

Navigate to the dashboard/ directory and execute:

python app.py

Access the dashboard at http://localhost:8050.
- Public GitHub Repository - Link: Repository Link
- Screen Recording - A demonstration of the dashboard's functionality.
- PowerPoint Presentation - A 5-slide summary of the project.
- The datasets for this project are located in the data/ folder.
- Data was sourced from [insert source, if applicable].
- Senidu Ravihara - dashboard
- Tharidu Thilakarathna - dashboard
- Tharidu Nimsara - NLP model
- Kavishka Venuka - NLP model
- Dulan Jeewantha - data preparation: handling outliers and missing values
- Amantha Sandun - removing duplicates
- Piyumi Madushika - editing the README file and removing null records
- W.P Sudasun - adding new columns to the DataFrame
- Ravidu Yehan - concatenating files
This project is licensed under the MIT License.