This repository was archived by the owner on Apr 4, 2019. It is now read-only.
I noticed that I have to edit the Hadoop config files (core-site.xml, hdfs-site.xml) to configure S3, but I could not find the mentioned config/hadoop-conf directory in my installation (Kafka 0.10.2.0). So do I have to use HDFS in order to use streamX?
What I am trying to do is transform messages in JSON format to Parquet and then store them in S3.
Spark could achieve this, but it would require a long-running cluster; alternatively, I could use checkpoints to run a daily batch ETL.
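For reference, the S3 configuration I was expecting to place in config/hadoop-conf would look something like the sketch below. This is a minimal core-site.xml using the Hadoop S3A connector; the credential values and the assumption that the S3A jars are on the classpath are mine, not from streamX's docs.

```xml
<!-- core-site.xml: minimal S3A setup (sketch; placeholder credentials) -->
<configuration>
  <!-- AWS credentials for the S3A filesystem -->
  <property>
    <name>fs.s3a.access.key</name>
    <value>YOUR_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>YOUR_SECRET_KEY</value>
  </property>
  <!-- Map the s3a:// scheme to the S3A filesystem implementation -->
  <property>
    <name>fs.s3a.impl</name>
    <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
  </property>
</configuration>
```

With this in place, a sink URL such as s3a://my-bucket/topics should resolve through the S3A connector rather than HDFS, assuming the connector is bundled.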