MS in Data Science – New York University

DSGA1004: Big Data

Instructors: Professor Juliana Freire, Dr. Erin Carson, Dr. Nick Knight

Text: Mining of Massive Datasets by Anand Rajaraman, Jure Leskovec, and Jeff Ullman.

The objective of this course is to study the foundations of data storage and processing at scale.

Concepts and tools used in the course include:

Relational algebra
SQL
Distributed File Systems and MapReduce
Apache Hadoop and Apache Spark
Amazon Web Services
Algorithms for finding similar items and frequent itemsets

You can find an overview and details on the course website: https://vgc.poly.edu/~juliana/courses/BigData2016/

Coursework:

Programming Assignments (35% - Individual)

Assignment 1: Querying NYC Taxi data using SQL
Assignment 2: NYC Taxi data processing using Map/Reduce (Hadoop)
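
As a rough illustration of the Map/Reduce style used in Assignment 2 (not the assignment solution), the sketch below counts trips per pickup date with an in-memory map, shuffle, and reduce. The CSV layout (pickup timestamp in the first column) and the names `map_trip`, `reduce_counts`, and `run_job` are assumptions made for this example; a real Hadoop job would read from HDFS and run the two steps on separate workers.

```python
from itertools import groupby
from operator import itemgetter


def map_trip(line):
    """Map step: emit (pickup_date, 1) for one CSV record.

    Assumes a hypothetical layout where the first column is a
    'YYYY-MM-DD HH:MM:SS' pickup timestamp.
    """
    fields = line.strip().split(",")
    pickup_date = fields[0].split(" ")[0]
    yield pickup_date, 1


def reduce_counts(key, values):
    """Reduce step: sum all the counts emitted for one key."""
    yield key, sum(values)


def run_job(lines):
    """Simulate map -> shuffle (sort by key) -> reduce in memory."""
    pairs = sorted(kv for line in lines for kv in map_trip(line))
    results = {}
    for key, group in groupby(pairs, key=itemgetter(0)):
        for out_key, total in reduce_counts(key, (v for _, v in group)):
            results[out_key] = total
    return results


# Made-up sample records, for illustration only.
sample = [
    "2013-01-01 00:05:00,1.2",
    "2013-01-01 09:30:00,0.4",
    "2013-01-02 14:00:00,3.1",
]
trips_per_day = run_job(sample)
```

Sorting the mapped pairs stands in for the shuffle phase, which groups all values for the same key before they reach a reducer.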

Project (25% - Group)

Group: Maria Leonor Zamora Maass mzm239@nyu.edu, Luisa Eugenia Quispe Ortiz lqo202@nyu.edu

The objective of the term project was to analyze a massive dataset using the concepts learned in the course. We chose to analyze taxi data, focusing on short trips (those that could have been made on foot or by bike).

The final report for this project can be found here.
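
One possible way to flag a "short trip" is to compare the straight-line distance between pickup and dropoff against a cutoff. The sketch below uses the haversine formula; the 1.5 km threshold and the function names are assumptions for illustration, not the definition used in the project report.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_short_trip(pickup, dropoff, max_km=1.5):
    """Return True if the straight-line trip distance is at most max_km.

    max_km is a hypothetical cutoff chosen for this example.
    """
    return haversine_km(*pickup, *dropoff) <= max_km


# Approximate coordinates, for illustration only.
times_square = (40.7580, -73.9855)
bryant_park = (40.7536, -73.9832)
jfk_airport = (40.6413, -73.7781)

walkable = is_short_trip(times_square, bryant_park)
not_walkable = is_short_trip(times_square, jfk_airport)
```

Straight-line distance underestimates the distance actually traveled along streets, so a street-network or recorded trip distance would give a stricter filter.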

