[SPARK-54350][SQL][STS] SparkGetColumnsOperation ORDINAL_POSITION should be 1-based #53062

pan3793 · 2025-11-14T11:50:28Z

What changes were proposed in this pull request?

SparkGetColumnsOperation is mainly used by the JDBC driver, and JDBC uses 1-based ordinals and column indexes rather than 0-based ones. This PR makes the ORDINAL_POSITION it returns 1-based accordingly.

This is also documented in the Hive API:

spark/sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java

Lines 94 to 95 in 551b922

    .addPrimitiveColumn("ORDINAL_POSITION", Type.INT_TYPE,
        "Index of column in table (starting at 1)")

Note that GetColumnsOperation, which was originally copied from Hive, has a correct implementation; the issue exists only in SparkGetColumnsOperation.

For safety, a config spark.sql.legacy.hive.thriftServer.useZeroBasedColumnOrdinalPosition is added so that users can switch back to the previous 0-based behavior.
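
The intended behavior can be sketched as follows. This is an illustrative Java sketch, not the Spark implementation: the class `OrdinalPositionSketch`, the method `ordinalPosition`, and the `useZeroBased` parameter are hypothetical names standing in for the schema index and the legacy config value.

```java
final class OrdinalPositionSketch {
    /**
     * Maps a 0-based index into the table schema to the value reported
     * as ORDINAL_POSITION.
     *
     * @param schemaIndex  0-based position of the column in the schema
     * @param useZeroBased hypothetical stand-in for the legacy config value
     */
    static int ordinalPosition(int schemaIndex, boolean useZeroBased) {
        // Legacy behavior reports the raw index; the fixed behavior shifts it by one.
        return useZeroBased ? schemaIndex : schemaIndex + 1;
    }
}
```

With the legacy flag off, the first column of a table is reported at position 1; with it on, the first column is reported at position 0 as before.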

Why are the changes needed?

SparkGetColumnsOperation is mainly used by JDBC's java.sql.DatabaseMetaData#getColumns; this change makes it conform to the JDBC API specification.
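
From the client side, the affected value is the ORDINAL_POSITION column of the result set returned by `DatabaseMetaData#getColumns`. Below is a minimal sketch, assuming the caller has already obtained that result set for a single table (for example via `connection.getMetaData().getColumns(null, schema, table, "%")`). The class and method names are hypothetical.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.OptionalInt;

final class OrdinalBaseProbe {
    /**
     * Scans a getColumns() result set for one table and returns the smallest
     * ORDINAL_POSITION seen, or an empty OptionalInt if there are no rows.
     */
    static OptionalInt smallestOrdinal(ResultSet columns) throws SQLException {
        OptionalInt smallest = OptionalInt.empty();
        while (columns.next()) {
            int ordinal = columns.getInt("ORDINAL_POSITION");
            if (!smallest.isPresent() || ordinal < smallest.getAsInt()) {
                smallest = OptionalInt.of(ordinal);
            }
        }
        return smallest;
    }
}
```

A result of 1 would indicate the JDBC-compliant behavior, while 0 would indicate the previous behavior or an enabled legacy config.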

Does this PR introduce any user-facing change?

Yes, see the above section.

How was this patch tested?

UTs are modified.

Was this patch authored or co-authored using generative AI tooling?

No.

dongjoon-hyun

cc @yaooqinn , @LuciferYang

dongjoon-hyun · 2025-11-14T21:15:14Z

...tserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkGetColumnsOperation.scala

          null, // SQL_DATETIME_SUB
          null, // CHAR_OCTET_LENGTH
-          pos.asInstanceOf[AnyRef], // ORDINAL_POSITION
+          (pos + 1).asInstanceOf[AnyRef], // ORDINAL_POSITION, 1-based


Although this looks like a legit fix to follow the standard, do you think we need to document this as a breaking change or provide a legacy conf, @pan3793 ?

LuciferYang · 2025-11-15

+1 for @dongjoon-hyun's suggestion.

pan3793 · 2025-11-18

Sorry for the late reply. I think this should be low risk: AFAIK, ORDINAL_POSITION is mostly used by tools (RDBMS management GUI tools or BI tools) to sort columns, and their implementations are relatively lenient. For safety, though, I can add a legacy config and fix this only for 4.1.
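
To illustrate why a tool that only sorts by ORDINAL_POSITION would be unaffected, here is a small sketch. The class `ColumnOrderSketch`, its method, and the sample column names are hypothetical and not taken from any particular BI tool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ColumnOrderSketch {
    /** Returns column names sorted by their reported ORDINAL_POSITION. */
    static List<String> namesInOrdinalOrder(Map<String, Integer> ordinalByName) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(ordinalByName.entrySet());
        // Only the relative order of the ordinals matters, not where they start.
        entries.sort(Map.Entry.comparingByValue());
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : entries) {
            names.add(entry.getKey());
        }
        return names;
    }
}
```

Shifting every ordinal by one leaves the sorted order unchanged, so the 0-based and 1-based inputs yield the same column list. A tool that uses the ordinal as a direct index would behave differently, which is the case the legacy config covers.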

dongjoon-hyun · 2025-11-14T21:16:04Z

...tserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkGetColumnsOperation.scala

          null, // SQL_DATETIME_SUB
          null, // CHAR_OCTET_LENGTH
-          pos.asInstanceOf[AnyRef], // ORDINAL_POSITION
+          (pos + 1).asInstanceOf[AnyRef], // ORDINAL_POSITION, 1-based


I'm wondering if this PR is breaking the existing BI tool integration.

pan3793 · 2025-11-18T08:09:06Z

@dongjoon-hyun @LuciferYang I added a config spark.sql.legacy.hive.thriftServer.useZeroBasedColumnOrdinalPosition to allow users to switch back to the previous behavior. Please take another look. Thank you in advance.

dongjoon-hyun

+1, LGTM. Thank you, @pan3793 and @LuciferYang .

Merged to master/4.1 for Apache Spark 4.1.0.

…uld be 1-based

### What changes were proposed in this pull request?

The SparkGetColumnsOperation is mainly used for the JDBC driver, while JDBC uses 1-based ordinal/column-index instead of 0-based.

This is also documented in Hive API.

https://github.com/apache/spark/blob/551b922a53acfdfeb2c065d5dedf35cb8cd30e1d/sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java#L94-L95

Note, the GetColumnsOperation, which is originally copied from the Hive has a correct implementation, the issue only exists in SparkGetColumnsOperation.

For safety, a config `spark.sql.legacy.hive.thriftServer.useZeroBasedColumnOrdinalPosition` is added to allow the user to switch back to the previous behavior.

### Why are the changes needed?

The SparkGetColumnsOperation is mainly used by JDBC [java.sql.DatabaseMetaData#getColumns](https://docs.oracle.com/en/java/javase/17/docs/api/java.sql/java/sql/DatabaseMetaData.html#getColumns(java.lang.String,java.lang.String,java.lang.String,java.lang.String)), this change makes it satisfy the JDBC API specification.

### Does this PR introduce _any_ user-facing change?

Yes, see the above section.

### How was this patch tested?

UTs are modified.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #53062 from pan3793/SPARK-54350.

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 05bc5d4)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

[SPARK-54350][STS] SparkGetColumnsOperation ORDINAL_POSITION should be 1-based

3470b81

github-actions bot added the SQL label Nov 14, 2025

pan3793 changed the title ~~[SPARK-54350][STS] SparkGetColumnsOperation ORDINAL_POSITION should be 1-based~~ [SPARK-54350][SQL][STS] SparkGetColumnsOperation ORDINAL_POSITION should be 1-based Nov 14, 2025

dongjoon-hyun approved these changes Nov 14, 2025

dongjoon-hyun reviewed Nov 14, 2025

useZeroBasedColumnOrdinalPosition config

c719d72

github-actions bot added the DOCS label Nov 18, 2025

use view for testing

558392a

dongjoon-hyun approved these changes Nov 18, 2025

dongjoon-hyun closed this in 05bc5d4 Nov 18, 2025
