A dashboard to analyze transactions on the Ethereum blockchain in real time. We used the publicly available `crypto_ethereum` dataset from BigQuery. The data was downloaded and stored locally as CSV files.
Requirements:

- Scala 2.12.15
- sbt 1.5.8
- Apache Spark 3.2.1
- `docker` and `docker-compose`
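The actual build definition lives in the repo; a minimal `build.sbt` consistent with the versions above might look like the following sketch (the sbt-assembly plugin version is an assumption):

```scala
// build.sbt (sketch; the exact settings in the repo may differ)
name := "EthereumAnalytics"
version := "1.0"
scalaVersion := "2.12.15"

libraryDependencies ++= Seq(
  // "provided" because spark-submit supplies Spark on the classpath at runtime
  "org.apache.spark" %% "spark-sql" % "3.2.1" % "provided",
  "org.apache.spark" %% "spark-sql-kafka-0-10" % "3.2.1"
)

// project/plugins.sbt (plugin version is an assumption):
// addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "1.1.0")
```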
- Clone and `cd` into the repo.
- Use `docker-compose up -d` to start all the containers in detached mode. The configs are defined in the `docker-compose.yml` file. The services will be exposed on the following ports:
  - Zookeeper (required for Kafka): 2181
  - Kafka: 9092
  - Superset: 8088
  - Redis (to enable persistence of our dashboards)
  - MySQL: 3306
- Set up Apache Superset with the following command (this will configure Superset, and you will be able to connect to it on port 8088):

  ```
  docker exec -it superset superset-init
  ```
- Use this sequence of commands to start the Producer and Consumer scripts:
  - `sbt assembly` -> this will create the .jar file for the project using the assembly plugin.
  - Run the Consumer using:

    ```
    spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.2.1 --master local[*] --class "edu.neu.ethanalyzer.StreamingConsumer" ./target/scala-2.12/EthereumAnalytics-assembly-1.0.jar
    ```

  - Run the Producer using:

    ```
    spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.2.1 --master local[*] --class "edu.neu.ethanalyzer.DataProducer" ./target/scala-2.12/EthereumAnalytics-assembly-1.0.jar
    ```
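For orientation, a Structured Streaming consumer like the one submitted above typically follows the shape sketched below. The topic name `eth_transactions` and the console sink are assumptions for illustration; the actual `StreamingConsumer` class in the jar may differ.

```scala
package edu.neu.ethanalyzer

import org.apache.spark.sql.SparkSession

// Minimal sketch of a Kafka -> Spark Structured Streaming consumer.
// Topic name and sink are assumptions, not the project's actual code.
object StreamingConsumerSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("StreamingConsumerSketch")
      .getOrCreate()

    // Subscribe to a Kafka topic (name "eth_transactions" is hypothetical);
    // the broker address matches the port exposed by docker-compose.
    val stream = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "eth_transactions")
      .load()

    // Kafka delivers key/value as binary; cast the payload to a string
    val values = stream.selectExpr("CAST(value AS STRING) AS value")

    // Print each micro-batch to the console for inspection
    val query = values.writeStream
      .format("console")
      .outputMode("append")
      .start()

    query.awaitTermination()
  }
}
```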
We have also created a `BatchConsumer` class to analyze the data in totality. Run it using:

```
spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.2.1 --master local[*] --class "edu.neu.ethanalyzer.BatchConsumer" ./target/scala-2.12/EthereumAnalytics-assembly-1.0.jar
```
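The key difference from the streaming job is that a batch consumer uses `spark.read` instead of `spark.readStream`, pulling the topic's full retained history in one pass. A minimal sketch, again assuming the hypothetical `eth_transactions` topic:

```scala
package edu.neu.ethanalyzer

import org.apache.spark.sql.SparkSession

// Sketch of a batch read over a Kafka topic; topic name is an assumption.
object BatchConsumerSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("BatchConsumerSketch")
      .getOrCreate()

    // Batch mode: read everything between the earliest and latest offsets
    val df = spark.read
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "eth_transactions")
      .option("startingOffsets", "earliest")
      .option("endingOffsets", "latest")
      .load()

    println(s"Total records consumed: ${df.count()}")

    spark.stop()
  }
}
```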

