tapmateus/ironkaggle

IronKaggle

One-day competition from Ironhack's Data Analytics bootcamp. The goal was to build a sales prediction model that would then be verified on unseen data. Cleaned a raw dataset of sales from different stores, used feature engineering to select features, then trained two different models, XGBoost and Random Forest Regressor, and compared their scores. Weighed the bias/variance trade-off to decide between them and chose the second, Random Forest Regressor.
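The comparison described above can be sketched roughly as follows. This is an illustrative example, not the code used in the competition: the data is synthetic, the feature layout is hypothetical, and scikit-learn's `GradientBoostingRegressor` stands in for XGBoost so the snippet only depends on scikit-learn. The gap between the train and test scores is used here as a rough indicator of variance (overfitting), while a low score on both would point to bias.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the store sales data (hypothetical features and target).
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# GradientBoostingRegressor is a stand-in for XGBoost (assumption for this sketch).
models = {
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=42),
    "gradient_boosting": GradientBoostingRegressor(random_state=42),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    train_r2 = r2_score(y_train, model.predict(X_train))
    test_r2 = r2_score(y_test, model.predict(X_test))
    # A large train/test gap suggests high variance; low scores on both suggest bias.
    scores[name] = {"train_r2": train_r2, "test_r2": test_r2, "gap": train_r2 - test_r2}

for name, s in scores.items():
    print(f"{name}: train R2={s['train_r2']:.3f}, test R2={s['test_r2']:.3f}, gap={s['gap']:.3f}")
```

With real data, the same loop could be pointed at the cleaned sales features, and the model with the better test score and smaller gap would be the safer choice for a hidden verification set.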

The model was later verified by the teacher on a new dataset and ended up being the winner.


Technical Requirements

  • Data Cleaning and Manipulation: checking and dropping null values / rows / columns, dealing with duplicates, formatting and filtering data;
  • Combining and Structuring Data;
  • Data Aggregation and Filtering;
  • Libraries imported:
    • Pandas: import and export shark_attack.csv (the baseline for the project) and manipulate data;
    • matplotlib: plotting histograms to verify hypotheses;
    • Numpy;
    • Seaborn;
    • sklearn: metrics, ensemble and model_selection.
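
The cleaning and aggregation steps listed above might look something like the sketch below. The column names (`store_id`, `date`, `sales`, `notes`) and the tiny in-memory dataset are hypothetical, chosen only to show each step; the actual competition data and its schema are not reproduced here.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data: one duplicate row, one missing target, one empty column.
raw = pd.DataFrame({
    "store_id": [1, 1, 2, 2, 2, 3],
    "date": ["2015-01-01", "2015-01-01", "2015-01-01",
             "2015-01-02", "2015-01-02", "2015-01-01"],
    "sales": [100.0, 100.0, 250.0, np.nan, 300.0, 0.0],
    "notes": [None] * 6,
})

clean = (
    raw.dropna(axis=1, how="all")   # drop columns that are entirely null
       .dropna(subset=["sales"])    # drop rows with a missing target value
       .drop_duplicates()           # remove exact duplicate rows
       .assign(date=lambda d: pd.to_datetime(d["date"]))  # format dates
)

# Filtering: keep only rows with positive sales (an assumed rule for this sketch).
clean = clean[clean["sales"] > 0]

# Aggregation: mean sales per store.
per_store = clean.groupby("store_id", as_index=False)["sales"].mean()
print(per_store)
```

Whether zero-sales rows should be filtered out depends on the dataset; it is shown here only as an example of a filtering step.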
