Welcome to the Big Data Programming project. This repository contains the code and resources for my big data project focused on
To run the code in this project, you will need the two datasets "Trips_by_Distance.csv" and "Trips_Full_Data" on your device. This repository contains separate scripts for the different operations performed on the datasets, including data cleaning, data categorization, and many more. It also contains code to visualize the results of those operations.
To use this project, keep in mind that each script provided performs a different activity on the data. You can change the specific data the actions are performed on if you would like to.
The data used in this project can be found on the Aula, on the 'Big Data Programming Project' page under the assessment page.
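
Once the files are downloaded, it can help to confirm that a dataset is where the scripts expect it before running anything. The snippet below is only a minimal sketch and not part of the repository: it assumes Python, uses only the standard library, and the helper name `preview_csv` is hypothetical. It makes no assumptions about the column names in the datasets.

```python
import csv
from itertools import islice
from pathlib import Path


def preview_csv(path, n_rows=5):
    """Return the header row and the first n_rows data rows of a CSV file.

    Hypothetical helper for illustration; it is not one of the project's scripts.
    """
    path = Path(path)
    if not path.is_file():
        # Fail early with a clear message if the dataset has not been downloaded.
        raise FileNotFoundError(f"Dataset not found: {path}")
    # "utf-8-sig" also reads plain UTF-8 and drops a leading byte-order mark if present.
    with path.open(newline="", encoding="utf-8-sig") as f:
        reader = csv.reader(f)
        header = next(reader, [])  # empty list if the file has no rows
        rows = list(islice(reader, n_rows))  # read at most n_rows data rows
    return header, rows
```

Calling `preview_csv("Trips_by_Distance.csv")` from the folder that holds the file returns the column names and the first few rows, which is a quick way to check the download before running the cleaning or visualization code. Pass a different path to inspect the other dataset.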