Machine Learning Assignment - Kaggle-style Competition

Overview

This assignment simulates a Kaggle-style competition with a dataset divided into train, dev, and test sets. The task is to create an accurate classifier, with the top classifiers receiving bonus points. The assignment consists of two main parts:

Part 1: Experimental Analysis (`experiments.ipynb`)

Objective

Conduct preliminary data analysis to build and justify the selected model.

Tasks

Explore the data and perform a preliminary analysis.
Justify the model selection.
Demonstrate preprocessing steps and their impact on model accuracy.
Describe hyperparameter search and its impact on model accuracy.
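
The last two tasks can be sketched as a small comparison on the dev set. This is a minimal illustration, not the notebook's actual code: the synthetic data from `make_classification`, the choice of `LogisticRegression`, the `StandardScaler` preprocessing step, and the grid of `C` values are all assumptions made for the example. It records dev accuracy with and without preprocessing for each hyperparameter value, so the impact of both can be reported side by side.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the assignment data (assumption: the real
# train/dev splits would be loaded from the data folder instead).
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_train, X_dev, y_train, y_dev = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)


def dev_accuracy(model):
    """Fit on the train split and return accuracy on the dev split."""
    model.fit(X_train, y_train)
    return accuracy_score(y_dev, model.predict(X_dev))


# Hypothetical search space: regularization strength C, with and
# without feature scaling as the preprocessing step.
results = {}
for C in (0.01, 0.1, 1.0, 10.0):
    raw_model = LogisticRegression(C=C, max_iter=1000)
    scaled_model = make_pipeline(
        StandardScaler(), LogisticRegression(C=C, max_iter=1000)
    )
    results[("raw", C)] = dev_accuracy(raw_model)
    results[("scaled", C)] = dev_accuracy(scaled_model)

best_config = max(results, key=results.get)
for config, acc in sorted(results.items()):
    print(config, round(acc, 3))
print("best:", best_config)
```

Keeping every configuration in one dictionary makes it straightforward to turn the results into a table or plot in the notebook.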

Part 2: Model Development (`model.py`)

Objective

Build and submit a model using the provided template.

Tasks

Adjust the template according to the chosen model from Part 1.
Set base hyperparameters for the model.
Additional functions may be added, but the submission must remain a single file.

Solution Overview

The solution begins with importing necessary libraries and loading the dataset.
Preliminary data analysis reveals an unbalanced dataset.
LazyPredict is used to shortlist the best-performing algorithms for further adjustment.
The dataset is balanced using over-sampling and under-sampling.
Feature selection and data transformation techniques are applied.
Various models are tested, compared, and adjusted using different parameters and features.
Final models are validated against the 'dev' set to evaluate their performance.
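
The imbalance check and the balancing step above can be illustrated with a short sketch. The project lists imbalanced-learn as a dependency; the example below deliberately swaps in a plain standard-library random over-sampler so it runs without extra packages, and it covers only the over-sampling half. The function name `random_oversample` and the toy data are hypothetical. Resampling is applied to the training split only, so that the dev set keeps its original class distribution.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class
        pool = [r for r, lab in zip(rows, labels) if lab == cls]
        # Draw (with replacement) as many extra rows as the class is missing
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Toy training split: 8 rows of class 0 and 2 rows of class 1
rows = [[i, i * 2] for i in range(10)]
labels = [0] * 8 + [1] * 2

print("before:", Counter(labels))
balanced_rows, balanced_labels = random_oversample(rows, labels)
print("after:", Counter(balanced_labels))
```

Under-sampling would work the other way around, discarding rows of the larger class until the counts match, at the cost of throwing away training data.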

File Descriptions

experiments.ipynb: Jupyter notebook containing experimental analysis including data exploration, model selection, and hyperparameter search.
model.py: Python script with the final model implementation, ready for submission.

How to Run

Open and run experiments.ipynb to reproduce the experimental analysis and model selection.
Run model.py to train and test the final model using the parameters selected in the experiments.

Dependencies

NumPy
pandas
scikit-learn
XGBoost
LightGBM
imbalanced-learn
seaborn
LazyPredict

Conclusion

This assignment provides hands-on experience in Kaggle-style competitions, involving classifier selection, data preprocessing, model development, and hyperparameter search.
