dascolari/starting_rotation

Welcome to Predipitch, our GitHub repository for predicting pitches in Major League Baseball. We completed it as a project for ECO395M, Data Mining and Machine Learning.

In this repository, we develop three random forest models for predicting the next pitch in a baseball at-bat, each using different in-game features. A write-up explains our data, methodology, results, and some interesting conclusions about the models. We also built a Shiny app in which the user enters a game scenario and a pitcher to see the predicted likelihood of a given pitch; it can be found here:

https://hsnell-6.shinyapps.io/DataMiningProject_PitchPrediction/
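As a rough illustration of the approach described above, here is a minimal sketch of a random forest that returns pitch-type probabilities for one game scenario. It is not the repository's model: the data are synthetic, the feature names (balls, strikes, prev_pitch) and pitch codes (FF, SL, CU) are assumptions chosen for illustration, and it assumes the randomForest package is installed.

```r
library(randomForest)

set.seed(1)
n <- 300

# Synthetic pitches: every column and value here is made up for illustration.
pitches <- data.frame(
  balls      = sample(0:3, n, replace = TRUE),
  strikes    = sample(0:2, n, replace = TRUE),
  prev_pitch = factor(sample(c("FF", "SL", "CU"), n, replace = TRUE))
)

# Toy rule so the forest has something to learn:
# fewer fastballs ("FF") when the count has two strikes.
p_fastball <- ifelse(pitches$strikes == 2, 0.3, 0.7)
pitches$pitch_type <- factor(
  ifelse(runif(n) < p_fastball, "FF", sample(c("SL", "CU"), n, replace = TRUE)),
  levels = c("FF", "SL", "CU")
)

# Classification forest on the in-game features.
fit <- randomForest(pitch_type ~ balls + strikes + prev_pitch,
                    data = pitches, ntree = 100)

# One hypothetical scenario: 1-2 count, previous pitch a fastball.
scenario <- data.frame(
  balls      = 1L,
  strikes    = 2L,
  prev_pitch = factor("FF", levels = levels(pitches$prev_pitch))
)

# type = "prob" returns the share of trees voting for each pitch type.
probs <- predict(fit, newdata = scenario, type = "prob")
print(round(probs, 2))
```

The probabilities in each row sum to one, which is the kind of output a scenario-based app can display per pitch type.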

Reproducibility and Load Order

The following scripts allow the writeup to be reproduced:

  • dugout.R (similar to an include.R) loads the necessary libraries and establishes the file path
  • ⚠️ import.R converts the enormous dataset into a more manageable subset of MLB pitchers
  • ⚠️ pitchers.R pre-processes the data so that it can be fed easily into the predi_pitch.R script
  • ⚠️ predi_pitch.R takes the subset of pitchers and creates a predictive model for each one (NOTE: this takes upwards of 30 minutes to run)
  • performance.R creates a table that shows the performance of the models
  • kershaw_sequence.R looks at one pitcher and shows sequencing trends throughout the game in different scenarios
  • predipitch.Rmd creates the final write-up with visualizations
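The load order above can be sketched as a small driver script. This is only a sketch: the helpers scripts_to_run and run_pipeline are hypothetical (they are not part of the repository), and it assumes the working directory is the repository root. With dry_run = TRUE it only lists what would be sourced, so it can be tried without the data.

```r
# Scripts in the load order listed above; `slow` marks the ones flagged with a warning.
pipeline <- data.frame(
  script = c("dugout.R", "import.R", "pitchers.R", "predi_pitch.R",
             "performance.R", "kershaw_sequence.R"),
  slow   = c(FALSE, TRUE, TRUE, TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the scripts to run, in order, optionally skipping the slow ones.
scripts_to_run <- function(skip_slow = TRUE) {
  keep <- if (skip_slow) !pipeline$slow else rep(TRUE, nrow(pipeline))
  pipeline$script[keep]
}

# Hypothetical helper: source each script in order, or just list them when dry_run = TRUE.
run_pipeline <- function(skip_slow = TRUE, dry_run = TRUE) {
  scripts <- scripts_to_run(skip_slow)
  for (s in scripts) {
    if (dry_run) {
      message("would source: ", s)
    } else {
      source(s)  # assumes the working directory is the repository root
    }
  }
  invisible(scripts)
}

# List the fast path without running anything.
run_pipeline(skip_slow = TRUE, dry_run = TRUE)

# The write-up is rendered last (requires the rmarkdown package):
# rmarkdown::render("predipitch.Rmd")
```

Calling run_pipeline(skip_slow = FALSE, dry_run = FALSE) would run the full sequence, including the steps that take a long time.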

⚠️ Due to the size of the datasets, running these scripts can take a long time. However, they output .RDs files that can be referenced directly from the GitHub repository, so you may skip them and run only the other scripts of interest. Note that dugout.R should be run in either case.
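One way to take advantage of the saved outputs is to read a cached file when it exists and fall back to the slow script only when it does not. The sketch below assumes a hypothetical helper, load_or_build, and uses a temporary file as a stand-in, since the actual output file names are not listed here.

```r
# Hypothetical helper: read a cached object if present, otherwise run the
# script that is expected to create it and then read the result.
load_or_build <- function(rds_path, script) {
  if (!file.exists(rds_path)) {
    message("cache missing, running ", script)
    source(script)
  }
  readRDS(rds_path)
}

# Self-contained demo: a temporary file stands in for a cached output.
demo_path <- tempfile(fileext = ".rds")
saveRDS(data.frame(pitcher = "example", n_rows = 1L), demo_path)

# The script is not run here because the cache already exists.
cached <- load_or_build(demo_path, script = "pitchers.R")
print(cached)
```

In practice, rds_path would point at one of the files the flagged scripts write, after dugout.R has set up the libraries and file path.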

A note about data:

Due to the size of the datasets in this folder, we gitignore all .csv files. You can find some of the data in archive.zip. To get the pitch data, however, download pitches.zip from one of the links below
