SC1015 Project

About

This is a Mini-Project for the course SC1015 (Introduction to Data Science and Artificial Intelligence). In this project we are using the Wine Quality Data Set for our analysis.

Below is the step-by-step process of the analysis:

Problem Statement

How do we know of a wine is good?
Can we predict if a wine is good based on its attributes?

Conclusion

A good quality wine should have a quality rating >= 7
Individual attributes cannot be used to predict the quality of wine due to low correlation
Models created using random oversampled and SMOTETomek resampled train data performed the best when evaluated with test data
Both models are able to predict the quality of wine correctly with an accuracy of 87%
Among the good quality wines, 67% will be predicted to be good (Sensitivity)
70% of the wines classifed as good are guaranteed to be good (Precision)
Any changes in these attributes (volatile acidity, total sulfur dioxide, residual sugar and chlorides) will have a greater impact on the quality of wine than other attributes

What We have Learnt

Using one hot encoding to transform a catagorial data into a binary data type
Handling imbalanced data with resampling techniques (random oversampling, SMOTEENN resampling and SMOTETomek resampling)
Classification using machine learning (XGBoost)
Evaluating machine learning models (sensitivity, precision and F1 score)
Use of visualisation tools (matplotlib and seaborn)

Contributors

@aish-1509 - Data Extraction, Data Visualisation
@JJChong77 - Data Cleaning, Data Resampling
@JonasChua - Gradient Boosting Classification, Data Visualisation

Name		Name	Last commit message	Last commit date
Latest commit History 33 Commits
datasets		datasets
1_mlp.ipynb		1_mlp.ipynb
Data_Cleaning_and_Resampling.ipynb		Data_Cleaning_and_Resampling.ipynb
Data_Extraction_and_Visualisation.ipynb		Data_Extraction_and_Visualisation.ipynb
Gradient_Boosting_Classification.ipynb		Gradient_Boosting_Classification.ipynb
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SC1015 Project

About

Problem Statement

Conclusion

What We have Learnt

Contributors

References

About

Uh oh!

Releases

Packages

Languages

aish-1509/SC1015_Project

Folders and files

Latest commit

History

Repository files navigation

SC1015 Project

About

Problem Statement

Conclusion

What We have Learnt

Contributors

References

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages