lin1000/TwitterPublicAPI

From the Public Twitter API to Social Network Analysis Experiments

Configurations

In this sample, twitter4j.properties contains only dummy data. Replace it with your own keys and secrets, and make sure the .jar and the .properties file are in the same directory.
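Before running the jar, it can help to confirm that the properties file actually defines the credentials. The following is a minimal sketch, not code from this repository: the class name `Twitter4jConfigCheck` is hypothetical, and it assumes the four standard twitter4j OAuth property names are the ones the project reads. It only checks that the keys are present and non-empty; it cannot tell dummy values from real ones.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of this repository.
class Twitter4jConfigCheck {
    // Assumption: the project relies on the standard twitter4j OAuth property names.
    static final String[] REQUIRED_KEYS = {
        "oauth.consumerKey",
        "oauth.consumerSecret",
        "oauth.accessToken",
        "oauth.accessTokenSecret"
    };

    // Returns the required keys that are absent or blank in the given properties source.
    static List<String> missingKeys(Reader source) throws IOException {
        Properties props = new Properties();
        props.load(source);
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // Inline sample standing in for twitter4j.properties; pass a FileReader for a real file.
        String sample = "oauth.consumerKey=dummy\noauth.consumerSecret=dummy\n";
        System.out.println("Missing keys: " + missingKeys(new StringReader(sample)));
    }
}
```

To check a real file, pass `new java.io.FileReader("twitter4j.properties")` to `missingKeys` from the directory that holds the jar.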

Feature List

  • (java) Connect to the public Twitter API using your own keys and secrets

  • (java) Given a Twitter handle, fetch the list of its followers' handles

  • (java) Twitter API key resource control: concurrency and locking mechanisms manage the keys to maximize rate limit utilization

  • (java) Executor thread pool for submitting concurrent tasks

  • (python) Randomly sample accounts and output them as a csv file in the 01SamplingAccount folder

  • (python) Read through the full account list and output it as a csv file in the 01FullAccount folder

  • (python) Compose a gnip query rule from the accounts of interest that stays within the rule limitations

  • (python) Create a historical job that can be sent to gnip

  • (python) Generate csv files grouped by rule tag

  • (spark) Generate json/csv files grouped by rule tag (parallelized to accelerate processing)

    spark-submit --master "local[*]" --executor-memory 2G --total-executor-cores 20 06GNIPDataGroupByRuleTag-Spark.py > 06GNIPDataGroupByRuleTag-Spark.log 2>&1
    
  • (spark) Generate json/csv files filtered by influencer account (parallelized to accelerate processing)

  • (spark) Spark GraphX to analyze the social network of randomly sampled followers

  • (java8) CountTweets

    export MAVEN_OPTS="-ea"
    mvn exec:java@0002 -Dexec.args="./output/collect-follower-day4/modelpress.followers.json Scanner"
    
  • (java8) CountTweetsParaller: uses a parallel stream to parse JSON objects

    export MAVEN_OPTS="-ea"
    mvn exec:java@0003 -Dexec.args="./output/collect-follower-day4/modelpress.followers.json Parallels"
    
  • (node v6) Mapbox visualization of followers' home locations
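The key resource control and executor thread pool items above can be illustrated with a small sketch. This is not the repository's implementation: `KeyPool`, `countGranted`, the key names, and the per-key budgets are all hypothetical, and no Twitter call is made. It shows one way a lock can guard a shared table of remaining calls per key while tasks submitted to an `ExecutorService` compete for them.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch, not code from this repository.
class KeyPoolSketch {
    // Tracks how many calls each API key has left in the current rate limit window.
    static final class KeyPool {
        private final ReentrantLock lock = new ReentrantLock();
        private final Map<String, Integer> remaining = new LinkedHashMap<>();

        void addKey(String name, int callsLeft) {
            lock.lock();
            try {
                remaining.put(name, callsLeft);
            } finally {
                lock.unlock();
            }
        }

        // Reserves one call on the first key that still has budget, or returns empty.
        // A real client would wait for the window to reset instead of giving up.
        Optional<String> acquire() {
            lock.lock();
            try {
                for (Map.Entry<String, Integer> entry : remaining.entrySet()) {
                    if (entry.getValue() > 0) {
                        entry.setValue(entry.getValue() - 1);
                        return Optional.of(entry.getKey());
                    }
                }
                return Optional.empty();
            } finally {
                lock.unlock();
            }
        }
    }

    // Submits `tasks` concurrent jobs and counts how many obtained a key.
    static int countGranted(KeyPool pool, int tasks)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<Callable<Boolean>> jobs = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                // The API request would be issued here with the acquired key.
                jobs.add(() -> pool.acquire().isPresent());
            }
            int granted = 0;
            for (Future<Boolean> result : executor.invokeAll(jobs)) {
                if (result.get()) {
                    granted++;
                }
            }
            return granted;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        KeyPool pool = new KeyPool();
        pool.addKey("key-A", 3); // made-up budgets for illustration
        pool.addKey("key-B", 3);
        System.out.println("Granted calls: " + countGranted(pool, 10));
    }
}
```

Because every reservation happens under the lock, the number of granted calls never exceeds the combined budget of the keys, regardless of how the threads interleave.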

Utility

  • (java8) Utility class that lists directories and files recursively using streams. To handle checked exceptions inside the stream chain, Throwables.propagate(e) from the Google Guava library is used.
    mvn exec:java@0004
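The checked-exception problem mentioned above can be sketched with the standard library alone. This is an illustration rather than the repository's utility class: it swaps Guava's `Throwables.propagate(e)` for the JDK's `UncheckedIOException`, and the class and method names (`RecursiveListingSketch`, `regularFiles`, `totalSize`) are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch, not code from this repository.
class RecursiveListingSketch {
    // Files.size throws a checked IOException, which a stream lambda cannot declare,
    // so it is rethrown here as an unchecked exception.
    static long sizeOf(Path path) {
        try {
            return Files.size(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Collects every regular file under root, recursively, in sorted order.
    static List<Path> regularFiles(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    // Sums the sizes in bytes of all regular files under root.
    static long totalSize(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .mapToLong(RecursiveListingSketch::sizeOf)
                        .sum();
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(regularFiles(root).size() + " files, "
                + totalSize(root) + " bytes");
    }
}
```

The try-with-resources blocks matter here: `Files.walk` holds open directory handles until the stream is closed.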
    

GitBook

  • Adding GitBook Integration (experimental)

Languages

Java (Stream, Concurrency, Twitter API)
Python (Data Processing)
Spark (Data Processing)
Node V6 (Mapbox Visualization)

About

This is a place for Public Twitter API experiments
