This repository was archived by the owner on Apr 4, 2019. It is now read-only.

Saving JSON data, partitioned by a specific field (timestamp) #58

@doriwaldman

Description

I have a question. The data in Kafka is in JSON format, and each event has a field called "eventTimestamp", a long that represents the event time. I want to save the data to S3 in hourly buckets based on that timestamp, not on the time the event was added to Kafka.

My settings when I used the Kafka Connect S3 sink were:

connector.class=io.confluent.connect.s3.S3SinkConnector
storage.class=io.confluent.connect.s3.storage.S3Storage
format.class=io.confluent.connect.s3.format.json.JsonFormat
schema.generator.class=io.confluent.connect.storage.hive.schema.DefaultSchemaGenerator
partitioner.class=io.confluent.connect.storage.partitioner.TimeBasedPartitioner
timestamp.extractor=RecordField
path.format='year'=YYYY/'month'=MM/'day'=dd/'hour'=HH
timestamp.field=eventTimestamp
partition.duration.ms=10
locale=en_IN
timezone=UTC
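For reference, the hourly layout that the path.format above describes can be sketched in plain Python (this is an illustration of the path the partitioner would produce from an epoch-millis eventTimestamp, not the connector's actual code; the function name is made up):

```python
from datetime import datetime, timezone

def hourly_partition_path(event_timestamp_ms: int) -> str:
    """Map an epoch-millis timestamp to the configured
    'year'=YYYY/'month'=MM/'day'=dd/'hour'=HH layout, in UTC."""
    ts = datetime.fromtimestamp(event_timestamp_ms / 1000, tz=timezone.utc)
    return ts.strftime("year=%Y/month=%m/day=%d/hour=%H")

# 2018-06-01T13:45:30Z in epoch millis
print(hourly_partition_path(1527860730000))
# -> year=2018/month=06/day=01/hour=13
```

Note that timezone=UTC in the config is what makes the hour boundary unambiguous here; with a different timezone the same timestamp would land in a different hourly bucket.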

I see that streamx supports TimeBasedPartitioner, but if I understand correctly it can only extract a RecordField from Parquet or Avro records, not from JSON.

Is it possible to do this with JSON?
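To make the question concrete: a field-based extractor working on JSON would first have to parse the raw record value before it can read eventTimestamp, roughly like this (a plain-Python sketch of that step, not streamx or Confluent code; the helper name is hypothetical):

```python
import json

def extract_event_timestamp(raw_value: bytes, field: str = "eventTimestamp") -> int:
    """Parse a JSON record value and return the named field
    (epoch millis) that a time-based partitioner would use."""
    event = json.loads(raw_value)
    ts = event[field]
    if not isinstance(ts, int):
        raise ValueError(f"{field} must be a long, got {type(ts).__name__}")
    return ts

raw = b'{"eventTimestamp": 1527860730000, "user": "alice"}'
print(extract_event_timestamp(raw))  # -> 1527860730000
```

With Avro or Parquet the field is available through the record's schema, which is presumably why RecordField extraction works there out of the box; schemaless JSON needs this extra parsing step.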
