This repository collects projects on data mining techniques, including data exploration, pattern recognition, and visualization. Each project applies different methods and tools to extract insights from a different dataset. The projects originated as assignments for the AD699 Data Mining course.
- Description: Analyzing service request data from Vancouver to identify trends and patterns.
- Dataset Source: Vancouver’s Open Data Portal
- Technologies Used: R (tidyverse, dplyr, lubridate, ggplot2, ggthemes, leaflet), R Notebook
- Project Link: Vancouver City Requests Analysis
- Description: Forecasting hockey player salaries using statistical analysis and machine learning methods.
- Technologies Used: R (tidyverse, dplyr, visualize, ggplot2, gplots, forecast), R Notebook
- Project Link: Hockey players salary prediction
- Description: Classification using the k-nearest neighbors (k-NN) algorithm.
- Technologies Used: R (tidyverse, dplyr, caret, FNN, ggplot2), R Notebook
- Project Link: Spotify k-nn classifier
- Description: Using the Naïve Bayes algorithm to classify and predict consumer disputes from complaint data.
- Technologies Used: R (tidyverse, dplyr, e1071, ggplot2), R Notebook
- Project Link: Consumer complaints analysis
More projects will be added over time, covering different aspects of data mining such as clustering, classification, and anomaly detection.