[SPARK-16462][SPARK-16460][SPARK-15144][SQL] Make CSV cast null values properly #14118

Closed
lw-lin wants to merge 6 commits into apache:master from lw-lin:csv-cast-null

lw-lin commented Jul 9, 2016

Problem

CSV in Spark 2.0.0:

  • does not read null values back correctly for certain data types such as BooleanType, TimestampType, and DateType -- this is a regression compared to 1.6;
  • does not read empty values (as specified by options.nullValue) back as nulls for StringType -- this is consistent with 1.6 but leads to problems like SPARK-16903.

What changes were proposed in this pull request?

This patch reads all empty values back as nulls, regardless of the column's data type.
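
The intended behavior can be sketched in a few lines of plain Python. This is only an illustration, not the patch's actual Scala code: the names `cast_to`, `CONVERTERS`, and `null_value` are hypothetical, and the sketch assumes that the null marker is compared against the raw field before any type-specific parsing.

```python
from datetime import date, datetime
from typing import Any, Callable, Dict, Optional


def _to_bool(s: str) -> bool:
    # Accept only "true"/"false" (case-insensitive); anything else is an error.
    lowered = s.strip().lower()
    if lowered not in ("true", "false"):
        raise ValueError(f"not a boolean: {s!r}")
    return lowered == "true"


# Hypothetical converters keyed by Spark SQL type name (illustration only).
CONVERTERS: Dict[str, Callable[[str], Any]] = {
    "BooleanType": _to_bool,
    "DateType": date.fromisoformat,
    "TimestampType": datetime.fromisoformat,
    "StringType": lambda s: s,
}


def cast_to(datum: str, type_name: str, null_value: str = "") -> Optional[Any]:
    # Check the null marker first, for every type, StringType included.
    # Only fields that are not the null marker reach the type-specific parser.
    if datum == null_value:
        return None
    return CONVERTERS[type_name](datum)
```

With this ordering, an empty field in a BooleanType, DateType, TimestampType, or StringType column comes back as None instead of failing to parse or surviving as an empty string.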

How was this patch tested?

New test cases.
