grudelsud edited this page Sep 25, 2011 · 1 revision

Prerequisites

Have a server running somewhere in the cloud. I initially started with a micro instance on Amazon EC2 (I do not have the budget to run a large one), but its specs are not enough to run the algorithms. Try to have a fast internet connection (my 30MB/s DSL at home is good for testing, but it drops packets from time to time), at least 2GB of memory and a decent CPU. Storage space is less important, but consider that Spritzer access on Twitter will store around 10GB/month.

Any Linux installation should run everything without problems; at the moment I have one system running happily on Ubuntu 10.04 LTS. Just be sure to have the following installed:

  • Apache 2 and PHP 5.3 (or NGINX with PHP-FastCGI if you like to feel the breeze of speed)
  • Java. It runs without issues on Sun Java 6; I have not tested OpenJDK, but I do not see any particular reason why it should not work. Use whatever package your distro provides and try it
  • git, to clone this repo somewhere on the server
  • the wikipedia-miner archive, downloaded and extracted to /java/fom/data/wikiminer (this step is not mandatory if you are planning to use the streaming API for topic extraction only)
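
A quick way to confirm the tools in the list above are installed is to look them up on the PATH. The following is only a minimal sketch: the executable names `java`, `git` and `php` are assumptions about how your distro names them, and the web server is left out because its binary name varies.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust them for your distribution.
    required = ["java", "git", "php"]
    missing = missing_tools(required)
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All required tools found.")
```

This only checks that the commands exist, not their versions, so it is still worth checking the PHP and Java versions by hand.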

Create databases

  1. an empty database for wikiminer (usually called "fom_wikiminer"); it will be filled later
  2. an empty database for fom (usually called "fom_fom"); then fill it with /doc/db/fom-XX.sql, where the greater XX the better (development is still ongoing, hence the db version is subject to change; XX = 12 on Jul 31st)
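
Since the schema dump to load is the one with the greatest XX, picking it can be scripted. This is a rough sketch, assuming the dumps sit together in one directory and follow the fom-XX.sql naming exactly; the helper name `latest_schema` is made up for illustration. It only selects the file. Loading it depends on your database server, which this page does not prescribe.

```python
import re
from pathlib import Path


def latest_schema(directory):
    """Return the fom-XX.sql path with the largest numeric XX, or None."""
    pattern = re.compile(r"fom-(\d+)\.sql$")
    candidates = []
    for path in Path(directory).iterdir():
        match = pattern.match(path.name)
        if match:
            # Compare versions as integers so that 12 sorts after 9.
            candidates.append((int(match.group(1)), path))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]
```

For example, `latest_schema("doc/db")` would return the path of the newest dump in a local clone, assuming the clone keeps the dumps under doc/db.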

Setup config files

These files are used to set up the database connections and the Twitter application used to fetch data from the streaming API.

capture/analysis tool

  1. create your desktop app on dev.twitter.com
  2. copy the values for TwitterOAuth ConsumerKey, ConsumerSecret, AccessToken and AccessTokenSecret into /java/fom/data/properties.properties
  3. set up the database connections (both for fom and wikiminer) in /java/fom/data/properties.properties
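
Before starting the tool it can help to check that no value in the properties file was left empty. The sketch below is only an illustration: it handles plain key=value lines (not the full Java properties syntax with ":" separators or line continuations), and the key names passed to it are placeholders, because the actual key names in properties.properties are not listed on this page.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def empty_or_missing(props, required):
    """Return the keys in `required` that are absent or have an empty value."""
    return [key for key in required if not props.get(key)]


if __name__ == "__main__":
    # Hypothetical key names, for illustration only; use the ones in your file.
    sample = "consumerKey=abc123\nconsumerSecret=\n# a comment\n"
    print(empty_or_missing(parse_properties(sample), ["consumerKey", "consumerSecret"]))
```

To check a real file, read /java/fom/data/properties.properties into a string and pass the key names that file actually uses.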

web front-end

  1. set up the database connection in /php/application/fom/config/database.php

Wikipedia miner (optional)

Used to expand Twitter queries (when fom.jar is used to fetch data in a query-based setup).

  1. download the archive from wikipedia miner
  2. expand it under /java/fom/data/wikiminer
  3. execute the first-run setup with: java -jar fom.jar --firstRun
