CS838(Data Science Principles)-Project

This is the repo for our Computer Sciences 838 course project. Detailed information about the project and its progress by stage can be found here.

Project Goals and Scope

The focus of our project is to gain insights into movies and movie trends. We will extract data from two movie databases. The first comes from Kaggle and can be found here. The second data source can be found here.

Our main research question is how well we can predict the imdb_score from our movies_metadata.csv dataset. We will use the variables in this dataset in conjunction with the film.csv dataset. We will also attempt to improve our results by adding some basic sentiment analysis to our final model.
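To make the prediction task concrete, here is a minimal sketch of a one-feature least-squares baseline for imdb_score. It is not the project's actual model: the feature name num_reviews and all of the numbers are made up for illustration, and a real run would read its rows from the cleaned dataset instead.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(xs, ys, slope, intercept):
    """Root mean squared error of the fitted line on the given points."""
    errors = [(slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)]
    return (sum(errors) / len(errors)) ** 0.5


# Hypothetical feature (num_reviews) and made-up imdb_score values.
num_reviews = [10.0, 20.0, 30.0, 40.0]
imdb_score = [5.0, 6.0, 7.0, 8.0]

slope, intercept = fit_line(num_reviews, imdb_score)
predicted = slope * 50.0 + intercept  # score predicted for a film with 50 reviews
error = rmse(num_reviews, imdb_score, slope, intercept)
print(slope, intercept, predicted, error)
```

A baseline like this gives an error figure that richer models, for example ones with sentiment features, would need to beat.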

Data Extraction and Cleaning

Code to perform our data cleaning and merging can be found in the ETL folder, where the readme details our process.
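As a rough illustration of what merging two movie sources can involve, the sketch below joins rows on a normalized title. It is an assumption about the general approach, not the code in the ETL folder: the column names movie_title, Title and Year and the sample rows are hypothetical, and only imdb_score is taken from the project description.

```python
import csv
import io
import re


def normalize_title(title):
    """Lowercase, drop punctuation and collapse whitespace so titles compare equal."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", title.lower())
    return " ".join(cleaned.split())


def merge_on_title(left_rows, right_rows, left_key, right_key):
    """Inner-join two lists of dict rows on their normalized title columns."""
    index = {normalize_title(row[right_key]): row for row in right_rows}
    merged = []
    for row in left_rows:
        match = index.get(normalize_title(row[left_key]))
        if match is not None:
            combined = dict(match)
            combined.update(row)  # left-hand columns win on name clashes
            merged.append(combined)
    return merged


# Tiny in-memory stand-ins for the two CSV files (made-up rows).
movies_csv = io.StringIO(
    "movie_title,imdb_score\nThe Example: Part One,7.1\nUnmatched Film,5.0\n"
)
film_csv = io.StringIO("Title,Year\nthe example part one,1999\n")

movies = list(csv.DictReader(movies_csv))
films = list(csv.DictReader(film_csv))
merged = merge_on_title(movies, films, "movie_title", "Title")
print(merged)
```

Joining on title alone can pair up remakes that share a name, so a real pipeline would likely add the release year or another field to the key.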

Project Stages

Detailed information on the project stages can be found on the course website here.

This repo is organized in folders with the respective project stages. Detailed information can again be found on the website.

