CMC-Classification

This project is part of Data Science I, COSC3337, Group 3 (https://github.com/thieny1991/DataScience).

-- Project Status: Active

Project Intro/Objective

The purpose of this project is to apply different classification techniques to a challenge dataset, compare the results, try to improve the accuracy of the learned models (by selecting better parameters, preprocessing, using kernels, or incorporating background knowledge), and summarize our findings in a report. The challenge dataset our group works on is the Contraceptive Method Choice dataset (https://archive.ics.uci.edu/ml/datasets/Contraceptive+Method+Choice).

Steps:

1. Data Exploration
2. Data Quality and Preprocessing
3. Neural Networks Classifier
4. SVM Classifier
5. KNN Classifier
6. Random Forest Classifier
7. Comparison
8. Conclusion
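
The comparison step above could be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook code: it uses a synthetic stand-in dataset (9 features, 3 classes, chosen to resemble the shape of the CMC data), and the hyperparameters and 5-fold cross-validation setup are assumptions picked for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the CMC data (assumption): 9 features, 3 classes.
X, y = make_classification(
    n_samples=300, n_features=9, n_informative=5, n_redundant=0,
    n_classes=3, random_state=0,
)

# The four classifiers listed in the steps; all settings are illustrative.
# Scaling is applied to the distance- and gradient-based models.
models = {
    "Neural Network": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    ),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Mean 5-fold cross-validated accuracy for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}

# Print the models from best to worst mean accuracy.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Evaluating every model with the same folds and the same metric keeps the comparison fair; the numbers on synthetic data say nothing about how the models rank on the real dataset.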

Methods Used

Inferential Statistics
Machine Learning
Data Visualization
Neural Networks
Grid Search technique for hyperparameter tuning
Support Vector Machines
KNN
Random Forest
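
As an illustration of the grid search technique listed above, here is a small sketch that tunes an SVM with scikit-learn's `GridSearchCV`. The parameter grid, the step names `scale` and `svm`, and the synthetic data are assumptions made for the example, not the values used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): 9 features, 3 classes.
X, y = make_classification(
    n_samples=200, n_features=9, n_informative=5, n_redundant=0,
    n_classes=3, random_state=0,
)

# Scaling lives inside the pipeline so it is re-fit on each training fold.
pipeline = Pipeline([("scale", StandardScaler()), ("svm", SVC())])

# Illustrative grid: every kernel/C combination is cross-validated.
param_grid = {
    "svm__kernel": ["linear", "rbf"],
    "svm__C": [0.1, 1, 10],
}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best mean CV accuracy:", round(search.best_score_, 3))
```

The same pattern applies to the other classifiers by swapping the estimator and the grid, for example `n_neighbors` for KNN or `n_estimators` for Random Forest.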

Technologies

Python
Pandas, Jupyter, sklearn
Colab
Git

Project Description

This dataset is a subset of the 1987 National Indonesia Contraceptive
Prevalence Survey. The samples are married women who were either not
pregnant or did not know if they were at the time of the interview. The
problem is to predict the current contraceptive method choice
(no use, long-term methods, or short-term methods) of a woman based
on her demographic and socio-economic characteristics.
Based on the given dataset, our project goes through all the necessary
steps to analyze the four listed classification methods and compares their
results in order to find the best-fitting classifier.

Needs of this project

data exploration
data processing/cleaning
classification
write up/reporting
presentation

Getting Started

Clone this repo (https://github.com/thieny1991/DataScience).
The raw data is cmc.da within this repo. The data description is cmc.names.
Data processing/transformation scripts are being kept [here](Repo folder containing data processing scripts/notebooks)
Follow setup [instructions](Link to file)
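
Once the repo is cloned, the raw data can be read with a few lines of Python. The sketch below assumes the raw file holds comma-separated integer rows with ten values each, in the attribute order given in cmc.names, and that the class codes are 1 = no use, 2 = long-term, 3 = short-term; check cmc.names before relying on this. The column names, the `load_cmc` helper, and the two sample rows are hypothetical and exist only for illustration.

```python
import csv
import io

# Hypothetical column names, following the assumed attribute order in cmc.names.
COLUMNS = [
    "wife_age", "wife_education", "husband_education", "num_children",
    "wife_religion", "wife_working", "husband_occupation",
    "living_standard", "media_exposure", "method",
]

# Assumed class coding for the target attribute.
METHOD_LABELS = {1: "no use", 2: "long-term", 3: "short-term"}


def load_cmc(handle):
    """Parse comma-separated integer rows into a list of dicts."""
    records = []
    for row in csv.reader(handle):
        if not row:  # skip blank lines
            continue
        records.append(dict(zip(COLUMNS, (int(value) for value in row))))
    return records


# Made-up sample rows standing in for the raw file.
sample = io.StringIO("24,2,3,3,1,1,2,3,0,1\n45,1,3,10,1,1,3,4,0,2\n")
records = load_cmc(sample)

for record in records:
    print(record["wife_age"], METHOD_LABELS[record["method"]])
```

To read the real file, open the raw data file from the repo with `open(path, newline="")` and pass the handle to `load_cmc` instead of the in-memory sample.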

Contributing DS Members

Name	GitHub Profile
Y Nguyen	https://github.com/thieny1991
Syed Alam	https://github.com/mubashiralam
GiaiTran	https://github.com/GiaiTran
Thuy Nguyen	https://github.com/milasido

Contribution detail:

Name	Contribution
Y Nguyen	data quality, SVM and report, testing program, data visualization
Syed Alam	Random Forest and report, PPT design, testing program
GiaiTran	Team lead, research methods, PPT, data preprocessing, Neural Network and report
Thuy Nguyen	KNN and report, report design, data visualization, PPT

