[FSTORE-1897] DELTA materialization Jobs fail if the feature group contains a struct #715
Issue:
Spark fails with
org.apache.spark.sql.avro.IncompatibleSchemaException: Attempting to treat union as a RECORD, but it was: UNION
when trying to deserialize an Avro schema of the format
'["null", {"type": "record", "name": "S_col_object_dict_", "fields": [{"name": "k0", "type": ["null", "long"]}, {"name": "k1", "type": ["null", "long"]}]}]'
Root Cause:
Spark appears to always expect the top-level schema (jsonFormatSchema) passed to the from_avro function to be a record whenever it infers the type as a StructType.
From the code, Spark seems to treat a union of null and a record as a StructType, and for struct types it uses the getRecordWriter function, which expects the Avro type it is given to be a record and otherwise throws the exception above.
Fix Done:
Wrap the union of null and struct in a record so that Spark can deserialize it. This yields a deserialized struct nested inside another struct, which is then unnested to get the actual struct.
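The wrapping step can be sketched in plain Python as below. This is a minimal illustration, not the code in this PR: the helper name `wrap_union_in_record`, the wrapper record name `wrapper`, and the field name `value` are all hypothetical, and the sketch only manipulates the schema JSON without touching Spark.

```python
import json


def wrap_union_in_record(union_schema_json, record_name="wrapper", field_name="value"):
    """Wrap a top-level Avro union schema in a single-field record schema.

    The helper name, record_name and field_name are hypothetical choices
    made for this sketch; they are not taken from the PR.
    """
    union_schema = json.loads(union_schema_json)
    if not isinstance(union_schema, list):
        # Avro unions are JSON arrays; any other schema is returned unchanged.
        return union_schema_json
    wrapped = {
        "type": "record",
        "name": record_name,
        "fields": [{"name": field_name, "type": union_schema}],
    }
    return json.dumps(wrapped)


# The union schema from the error message above.
struct_union = json.dumps([
    "null",
    {
        "type": "record",
        "name": "S_col_object_dict_",
        "fields": [
            {"name": "k0", "type": ["null", "long"]},
            {"name": "k1", "type": ["null", "long"]},
        ],
    },
])

wrapped_schema = wrap_union_in_record(struct_union)
print(wrapped_schema)
```

With PySpark, the wrapped schema would presumably be passed to `from_avro` from `pyspark.sql.avro.functions`, and the original struct recovered by selecting the `value` field of the result. That step is left out here because it needs a running Spark session with the Avro package available.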
JIRA Issue: https://hopsworks.atlassian.net/browse/FSTORE-1897
Priority for Review: -
Related PRs: -
How Has This Been Tested?
Checklist For The Assigned Reviewer: