This project serves as an example for teaching the HWE Course at 1904Labs.
- A Kafka producer publishes to the Kafka topic reviews.
- A Spark streaming application consumes reviews from the Kafka topic. Each review contains a customer_id.
- The Spark streaming application uses this customer_id to join each review with a record retrieved from HBase.
- Spark streaming stores this enriched record in HDFS.
- Hive is used to query the data in HDFS.
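
The enrichment step in the list above can be sketched in plain Python, without Spark, Kafka, or HBase. This is only an illustration of the join on customer_id, not the project's actual code: the JSON message format, the field names other than customer_id, the `enrich` helper, and the `CUSTOMERS` dictionary standing in for the HBase table are all assumptions.

```python
import json

# Hypothetical stand-in for the HBase table, keyed by customer_id.
CUSTOMERS = {
    "c-100": {"name": "Ada", "state": "MO"},
}


def enrich(review_json, customers):
    """Join one review with its customer record on customer_id."""
    review = json.loads(review_json)
    customer = customers.get(review["customer_id"])
    # Assumption: keep the review even when no customer matches,
    # which mirrors a left join.
    return {**review, "customer": customer}


# Hypothetical review message, as it might arrive from the reviews topic.
message = '{"review_id": "r-1", "customer_id": "c-100", "star_rating": 5}'
enriched = enrich(message, CUSTOMERS)
print(json.dumps(enriched, sort_keys=True))
```

In the real pipeline the lookup would be an HBase read per review (or per partition), and the enriched records would be written to HDFS rather than printed.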
