leonardogiovannoni/CloudComputing-Project

Run Hadoop

generate bloom filters

cd hadoop/hadoop-optimized
mvn package
time hadoop jar target/cloud-project-1.0-SNAPSHOT.jar /inputs/title.basics.tsv /inputs/title.ratings.tsv /out/merge /out/count /out/bf 0.1 15
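The `0.1` argument in the command above looks like a target false-positive rate; that reading is an assumption, and the meaning of the trailing `15` is not stated here. Under that assumption, the standard Bloom filter sizing formulas give the number of bits `m` and hash functions `k` for `n` expected items. The helper name `bloom_parameters` is hypothetical and not taken from this repository; this is a minimal sketch, not the project's actual implementation.

```python
import math


def bloom_parameters(n_items: int, false_positive_rate: float) -> tuple[int, int]:
    """Return (m_bits, k_hashes) from the textbook Bloom filter formulas.

    m = -n * ln(p) / (ln 2)^2   and   k = (m / n) * ln 2
    """
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    if not 0.0 < false_positive_rate < 1.0:
        raise ValueError("false_positive_rate must be in (0, 1)")

    # Round the bit count up so the target rate is not exceeded by sizing alone.
    m_bits = math.ceil(-n_items * math.log(false_positive_rate) / (math.log(2) ** 2))
    # The optimal k is rarely an integer; round to the nearest, with at least one hash.
    k_hashes = max(1, round(m_bits / n_items * math.log(2)))
    return m_bits, k_hashes


if __name__ == "__main__":
    for p in (0.1, 0.01):
        m, k = bloom_parameters(1000, p)
        print(f"p={p}: m={m} bits, k={k} hash functions")
```

A lower target rate costs more memory: going from 0.1 to 0.01 roughly doubles the bits per item.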

test bloom filters

cd hadoop/test-hadoop
mvn package
time hadoop jar target/test-hadoop-1.0-SNAPSHOT.jar /out/bf /out/merge /out/test-results
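Testing a Bloom filter usually means querying it with items that were never inserted and counting how many are wrongly reported as present. The sketch below illustrates that idea on a small in-memory filter. The `BloomFilter` class, the SHA-256 double-hashing scheme, and the `tt`/`zz` identifiers are all assumptions made for illustration; the test job in this repository may work differently.

```python
import hashlib


class BloomFilter:
    """Small in-memory Bloom filter (illustrative, not the project's class)."""

    def __init__(self, m_bits: int, k_hashes: int) -> None:
        self.m = m_bits
        self.k = k_hashes
        self.bits = bytearray((m_bits + 7) // 8)

    def _positions(self, item: str) -> list[int]:
        # Double hashing: derive k bit positions from two 64-bit slices of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def measured_false_positive_rate(bf: BloomFilter, non_members: list[str]) -> float:
    """Fraction of never-inserted items that the filter reports as present."""
    hits = sum(1 for item in non_members if item in bf)
    return hits / len(non_members)


if __name__ == "__main__":
    members = [f"tt{i:07d}" for i in range(2000)]      # hypothetical inserted ids
    non_members = [f"zz{i:07d}" for i in range(5000)]  # hypothetical absent ids
    bf = BloomFilter(m_bits=9586, k_hashes=3)          # sized for n=2000, p=0.1
    for item in members:
        bf.add(item)
    print("measured rate:", measured_false_positive_rate(bf, non_members))
```

The measured rate should land near the target the filter was sized for; a Bloom filter never produces false negatives, so every inserted item must still be found.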

Run Spark

generate bloom filters

cd spark
zip -r pyfiles.zip util/
time spark-submit --py-files pyfiles.zip --num-executors 4 --executor-cores 2 main.py /inputs/ out/ 0.01 --no-test 2>/dev/null

test bloom filters

cd spark
zip -r pyfiles.zip util/
time spark-submit --py-files pyfiles.zip --num-executors 4 --executor-cores 2 main.py /inputs/ out/ 0.01 --no-calculate 2>/dev/null

About

Cloud computing project using Hadoop and Spark.
