Re-enable accelerated columnar-to-row path after fix in spark-rapids-jni by thirtiseven · Pull Request #14651 · NVIDIA/spark-rapids

thirtiseven · 2026-04-23T04:27:29Z

Description

PR #12699 worked around the data corruption reported in #10062 by forcing isAcceleratedTransposeSupported to false on every GPU architecture, disabling the fast AcceleratedColumnarToRowIterator path unconditionally. The root cause — an off-by-one in detail::determine_tiles that silently dropped the trailing column of the table whenever that column alone tipped the tile's row-size estimate over shmem_limit_per_tile — has since been fixed in NVIDIA/spark-rapids-jni#4493. With that JNI fix in place, the workaround is no longer needed, so this PR restores the original post-Pascal gating (Cuda.getComputeCapabilityMajor > 6).

Behaviorally this is a no-op for users — no new configuration, no plan changes. They will just see faster collect / take on wide fixed-width schemas.

Not safe to merge until NVIDIA/spark-rapids-jni#4493 has landed and the JNI SNAPSHOT that spark-rapids depends on has picked it up. Hence the [DO NOT MERGE] prefix.

Checklists

Documentation

Updated for new or modified user-facing features or behaviors
No user-facing change

Testing

Added or modified tests to cover new code paths
Covered by existing tests
(test_hash_multiple_grpby_pivot with DATAGEN_SEED=1702610203 — the original reproducer from [BUG] test_hash_multiple_grpby_pivot DATAGEN_SEED=1702610203 fails #10062 — exercises this path on a 192-column pivot schema; the matching regression test ColumnToRowTests.PivotLikeLayout lives in spark-rapids-jni alongside the kernel fix.)
Not required

Performance

Tests ran and results are added in the PR description
Issue filed with a link in the PR description
Not required

The workaround from NVIDIA#12699 forced isAcceleratedTransposeSupported to false on every architecture to avoid the data corruption in NVIDIA#10062. That has now been fixed upstream in spark-rapids-jni by correcting the off-by-one in detail::determine_tiles (NVIDIA/spark-rapids-jni#4493), so restore the original post-Pascal gating. Closes NVIDIA#10062

greptile-apps · 2026-04-23T04:28:56Z

Greptile Summary

Removes the unconditional false workaround in isAcceleratedTransposeSupported (added in #12699 for issue #10062) and restores the original Cuda.getComputeCapabilityMajor > 6 post-Pascal gate, along with the Cuda import. This re-enables the AcceleratedColumnarToRowIterator path on Volta+ GPUs after the off-by-one in detail::determine_tiles was fixed upstream in spark-rapids-jni#4493. The PR is explicitly marked [DO NOT MERGE] pending that JNI fix landing in the consumed SNAPSHOT, which is the correct guard for this change.

Confidence Score: 5/5

Safe to merge once the upstream spark-rapids-jni#4493 fix is present in the consumed SNAPSHOT — no logic or resource management concerns in this change itself.

Single-line logic restoration with a correctly re-added import and updated copyright; only finding is a stale P2 comment. No resource lifecycle, OOM retry, or data correctness concerns introduced.

No files require special attention beyond verifying the JNI SNAPSHOT dependency.

Important Files Changed

Filename	Overview
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuColumnarToRowExec.scala	Restores `Cuda.getComputeCapabilityMajor > 6` gate in `isAcceleratedTransposeSupported`; adds back the `Cuda` import; stale workaround comment is a minor P2

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[makeIteratorFunc called] --> B{CudfRowTransitions.areAllSupported\nAND 4 < columns <= 100M?}
    B -- No --> C[ColumnarToRowIterator\n slow path]
    B -- Yes --> D{isAcceleratedTransposeSupported\nCuda.getComputeCapabilityMajor > 6?}
    D -- No\nPascal or older --> C
    D -- Yes\nVolta+ --> E[AcceleratedColumnarToRowIterator\n fast GPU path re-enabled by this PR]

_{Reviews (3): Last reviewed commit: "signoff" | Re-trigger Greptile}

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

thirtiseven · 2026-05-05T01:24:32Z

build

thirtiseven self-assigned this Apr 23, 2026

signoff

8ee1db2

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>

thirtiseven marked this pull request as draft April 23, 2026 04:30

thirtiseven changed the title ~~[DO NOT MERGE] Re-enable accelerated columnar-to-row path after fix in spark-rapids-jni~~ Re-enable accelerated columnar-to-row path after fix in spark-rapids-jni Apr 27, 2026

thirtiseven marked this pull request as ready for review April 27, 2026 02:07

thirtiseven requested a review from revans2 April 27, 2026 02:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Re-enable accelerated columnar-to-row path after fix in spark-rapids-jni#14651

Re-enable accelerated columnar-to-row path after fix in spark-rapids-jni#14651
thirtiseven wants to merge 2 commits intoNVIDIA:mainfrom
thirtiseven:revert-c2r-workaround-10062

thirtiseven commented Apr 23, 2026

Uh oh!

greptile-apps Bot commented Apr 23, 2026 •

edited

Loading

Uh oh!

thirtiseven commented May 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

thirtiseven commented Apr 23, 2026

Description

Checklists

Uh oh!

greptile-apps Bot commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

Uh oh!

thirtiseven commented May 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

greptile-apps Bot commented Apr 23, 2026 •

edited

Loading