# Releases: MTSWebServices/spark-dialect-extension
## 0.0.4 (2026-04-07)

### Features

- Added support for `df.write.format("jdbc").option("truncate", "true")`.
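A minimal sketch of how the new option might be used, assuming a reachable ClickHouse server and an existing SparkSession; the URL, table name, and DataFrame here are placeholders, not part of the release:

```python
# Hypothetical JDBC options; "truncate" is the option added in this release.
jdbc_options = {
    "url": "jdbc:clickhouse://localhost:8123/default",  # placeholder URL
    "dbtable": "target_table",                          # placeholder table name
    "truncate": "true",  # truncate the existing table instead of dropping and recreating it
}

# With a real SparkSession and DataFrame `df`, the write would look like:
# df.write.format("jdbc").options(**jdbc_options).mode("overwrite").save()
print(jdbc_options["truncate"])
```

With `truncate=true` and `mode("overwrite")`, Spark empties the existing table rather than dropping it, so table-level settings such as the engine are preserved.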
### Improvements

- Added tests for ClickHouse JDBC 0.9.5+. This JDBC driver version allows using `Array(T)` for almost all `T`, including `Float32`, `Date`, `DateTime` and `Decimal`; see ClickHouse/clickhouse-java#2627. The exception is `UInt64`, where there is an issue on the Spark side.

### Bug fixes

- Convert `UInt64` to `DecimalType(38, 0)` instead of `DecimalType(20, 0)` (Spark's default).
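For scale: the largest UInt64 value has exactly 20 decimal digits, so `DecimalType(20, 0)` sits right at the boundary, while 38 is Spark's maximum decimal precision. A quick check in plain Python (this only illustrates the digit counts, not the dialect's own code):

```python
from decimal import Context, Decimal

uint64_max = 2**64 - 1  # largest value a ClickHouse UInt64 column can hold
print(uint64_max)             # 18446744073709551615
print(len(str(uint64_max)))   # 20 decimal digits

# A precision-38 context (matching DecimalType(38, 0)) represents it exactly:
assert Context(prec=38).create_decimal(uint64_max) == Decimal(uint64_max)
```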
## 0.0.3

- Added support for ClickHouse JDBC 0.9.x. This allows using `Array(T)` for numeric `T`, like `Int16`, `Int32`, `Int64`, `Float64`. `Date`, `DateTime` and `Decimal` are not supported; see issue.
- Wrap Spark DataFrame columns having `nullable = true` with `Nullable(T)`.

  Caveat: Spark DataFrames created from ORC and Parquet files have all columns with `nullable = true`. Using

  ```python
  df.write.format("jdbc").option("createTableOptions", "ENGINE = ReplacingMergeTree() ORDER BY (col1)")
  ```

  will fail if `col1` is nullable. Workaround:

  ```python
  import pyspark.sql.functions as F

  # make the column non-nullable with coalesce;
  # F.lit(...) should contain a value compatible with the `col1` type
  df = df.withColumn("col1", F.coalesce("col1", F.lit(0)))
  ```
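The workaround works because `F.coalesce` returns the first non-null argument, so falling back to a literal guarantees the column can never be null. A plain-Python analogue of that semantics (the `coalesce` helper below is written here for illustration only):

```python
def coalesce(*values):
    """Return the first non-None value, mimicking SQL/Spark COALESCE semantics."""
    return next((v for v in values if v is not None), None)

print(coalesce(None, 0))  # a null column value falls back to the literal: 0
print(coalesce(7, 0))     # existing values are kept as-is: 7
```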
## 0.0.2

- Allow writing `ArrayType(TimestampType())` Spark columns as ClickHouse's `Array(DateTime64(6))`.
- Allow writing `ArrayType(ShortType())` Spark columns as ClickHouse's `Array(Int16)`.
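The `ShortType` mapping is lossless because Spark's `ShortType` and ClickHouse's `Int16` are both signed 16-bit integers. The shared range, checked in plain Python:

```python
# Signed 16-bit range shared by Spark ShortType and ClickHouse Int16.
INT16_MIN = -(2**15)
INT16_MAX = 2**15 - 1
print(INT16_MIN, INT16_MAX)  # -32768 32767

# The range covers exactly 2**16 distinct values:
assert INT16_MAX - INT16_MIN + 1 == 2**16
```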
## 0.0.1

First release! 🎉

This version includes a custom ClickHouse dialect for Apache Spark 3.5.x, with the following enhancements:

- Support for writing Spark's `ArrayType` to ClickHouse. Currently only a few types are supported, like `ArrayType(StringType)`, `ArrayType(ByteType)`, `ArrayType(LongType)`, `ArrayType(FloatType)`. Unfortunately, reading arrays from ClickHouse to Spark is not fully supported for now.
- Fixed an issue where writing Spark's `TimestampType` led to creating a ClickHouse table with `DateTime64(0)` instead of `DateTime64(6)`, resulting in precision loss (fractions of seconds were dropped).
- Fixed an issue where writing Spark's `BooleanType` led to creating a ClickHouse table with a `UInt64` column instead of `Bool`.
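The difference between `DateTime64(0)` and `DateTime64(6)` is the number of sub-second digits stored: 0 keeps whole seconds only, while 6 keeps microseconds. A plain-Python illustration of the precision loss the fix prevents:

```python
from datetime import datetime

ts = datetime(2024, 1, 2, 3, 4, 5, 123456)  # microsecond-precision timestamp
seconds_only = ts.replace(microsecond=0)    # what a DateTime64(0) column would retain

print(ts.isoformat())            # 2024-01-02T03:04:05.123456
print(seconds_only.isoformat())  # 2024-01-02T03:04:05
```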