
chaitanya0403/PBD_Project_2016


PBD_Project_2016

Principles of Big Data Management, Spring 2016 Project

Environment: PySpark (Apache Spark)

Visualization: D3.js

Language: Python

Datasets:

Dataset I: Tweets collected over a period of one week about different payment technologies such as Apple Pay, Samsung Pay, Android Pay, and PayPal.

Dataset II: Tweets collected about the movie “Batman v Superman: Dawn of Justice”.

Dataset III: Public dataset from the U.S. Department of Education on all accredited universities in the U.S.A.

Queries:

Query I: Number of tweets for each payment technology over the week in Dataset I.

Query II: Tweet count on each day of the week for all payment technologies in Dataset I.

Query III: Number of Twitter accounts created in each month and year, from Dataset I.

Query IV: Top 10 verified accounts with the highest follower count in Dataset I.

Query V: List of all the languages used to tweet, with the count for each, in Dataset II.

Query VI: Percentage of tweets with external links in the tweet status in Dataset II.

Query VII: Top 10 most-liked tweets with their like counts in Dataset II.

Query VIII: Number of colleges in each state in Dataset III.

Query IX: Ratio of accreditation types across all colleges in Dataset III.
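
The project's PySpark scripts are not reproduced in this README, so the block below is only a minimal sketch of how one query per dataset (Queries I, VI, and VIII) might be expressed. It is written in plain Python so it runs without a Spark installation. It assumes tweets are dictionaries using Twitter API v1.1 field names (`text`, `entities.urls`) and that college records carry a hypothetical `state` key. The `PAYMENT_TECHNOLOGIES` keyword list and all function names are illustrative, not taken from the repository.

```python
from collections import Counter

# Hypothetical keyword list; the project's actual search terms are not given.
PAYMENT_TECHNOLOGIES = ["apple pay", "samsung pay", "android pay", "paypal"]


def technologies_mentioned(tweet):
    """Return the payment technologies named in a tweet's text (Query I)."""
    text = (tweet.get("text") or "").lower()
    return [tech for tech in PAYMENT_TECHNOLOGIES if tech in text]


def count_by_technology(tweets):
    """Count tweets per payment technology; a tweet may count for several."""
    counts = Counter()
    for tweet in tweets:
        counts.update(technologies_mentioned(tweet))
    return dict(counts)


def has_external_link(tweet):
    """True when the tweet has at least one URL entity (Query VI).

    Assumes the v1.1 layout, where links appear under entities.urls.
    """
    urls = (tweet.get("entities") or {}).get("urls") or []
    return len(urls) > 0


def percent_with_links(tweets):
    """Percentage of tweets that contain an external link."""
    tweets = list(tweets)
    if not tweets:
        return 0.0
    linked = sum(1 for tweet in tweets if has_external_link(tweet))
    return 100.0 * linked / len(tweets)


def colleges_per_state(records, state_key="state"):
    """Count college records per state (Query VIII).

    state_key is a hypothetical column name; records without it are skipped.
    """
    return dict(Counter(r[state_key] for r in records if r.get(state_key)))


if __name__ == "__main__":
    sample_tweets = [
        {"text": "Apple Pay vs PayPal",
         "entities": {"urls": [{"expanded_url": "http://example.com"}]}},
        {"text": "I love paypal", "entities": {"urls": []}},
    ]
    print(count_by_technology(sample_tweets))
    print(percent_with_links(sample_tweets))
    print(colleges_per_state([{"state": "MO"}, {"state": "KS"}, {"state": "MO"}]))
```

In PySpark, the same per-record functions could plausibly be passed to `RDD.flatMap` or `RDD.filter` and combined with `reduceByKey` or `count`, so the aggregation runs across the cluster rather than in a local loop.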

Team Members:

Sri Chaitanya Patluri

Sai Venkatesh Gatiganti

Meghasai Reddy Bodimani
