Skip to content

Re-enable accelerated columnar-to-row path after fix in spark-rapids-jni#14651

Open
thirtiseven wants to merge 2 commits intoNVIDIA:mainfrom
thirtiseven:revert-c2r-workaround-10062
Open

Re-enable accelerated columnar-to-row path after fix in spark-rapids-jni#14651
thirtiseven wants to merge 2 commits intoNVIDIA:mainfrom
thirtiseven:revert-c2r-workaround-10062

Conversation

@thirtiseven
Copy link
Copy Markdown
Collaborator

Fixes #10062.

Description

PR #12699 worked around the data corruption reported in #10062 by forcing isAcceleratedTransposeSupported to false on every GPU architecture, disabling the fast AcceleratedColumnarToRowIterator path unconditionally. The root cause — an off-by-one in detail::determine_tiles that silently dropped the trailing column of the table whenever that column alone tipped the tile's row-size estimate over shmem_limit_per_tile — has since been fixed in NVIDIA/spark-rapids-jni#4493. With that JNI fix in place, the workaround is no longer needed, so this PR restores the original post-Pascal gating (Cuda.getComputeCapabilityMajor > 6).

Behaviorally this is a no-op for users — no new configuration, no plan changes. They will just see faster collect / take on wide fixed-width schemas.

Not safe to merge until NVIDIA/spark-rapids-jni#4493 has landed and the JNI SNAPSHOT that spark-rapids depends on has picked it up. Hence the [DO NOT MERGE] prefix.

Checklists

Documentation

  • Updated for new or modified user-facing features or behaviors
  • No user-facing change

Testing

  • Added or modified tests to cover new code paths
  • Covered by existing tests
    (test_hash_multiple_grpby_pivot with DATAGEN_SEED=1702610203 — the original reproducer from [BUG] test_hash_multiple_grpby_pivot DATAGEN_SEED=1702610203 fails #10062 — exercises this path on a 192-column pivot schema; the matching regression test ColumnToRowTests.PivotLikeLayout lives in spark-rapids-jni alongside the kernel fix.)
  • Not required

Performance

  • Tests ran and results are added in the PR description
  • Issue filed with a link in the PR description
  • Not required

The workaround from NVIDIA#12699 forced isAcceleratedTransposeSupported to
false on every architecture to avoid the data corruption in NVIDIA#10062.
That has now been fixed upstream in spark-rapids-jni by correcting the
off-by-one in detail::determine_tiles (NVIDIA/spark-rapids-jni#4493),
so restore the original post-Pascal gating.

Closes NVIDIA#10062
@thirtiseven thirtiseven self-assigned this Apr 23, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 23, 2026

Greptile Summary

Removes the unconditional false workaround in isAcceleratedTransposeSupported (added in #12699 for issue #10062) and restores the original Cuda.getComputeCapabilityMajor > 6 post-Pascal gate, along with the Cuda import. This re-enables the AcceleratedColumnarToRowIterator path on Volta+ GPUs after the off-by-one in detail::determine_tiles was fixed upstream in spark-rapids-jni#4493. The PR is explicitly marked [DO NOT MERGE] pending that JNI fix landing in the consumed SNAPSHOT, which is the correct guard for this change.

Confidence Score: 5/5

Safe to merge once the upstream spark-rapids-jni#4493 fix is present in the consumed SNAPSHOT — no logic or resource management concerns in this change itself.

Single-line logic restoration with a correctly re-added import and updated copyright; only finding is a stale P2 comment. No resource lifecycle, OOM retry, or data correctness concerns introduced.

No files require special attention beyond verifying the JNI SNAPSHOT dependency.

Important Files Changed

Filename Overview
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuColumnarToRowExec.scala Restores Cuda.getComputeCapabilityMajor > 6 gate in isAcceleratedTransposeSupported; adds back the Cuda import; stale workaround comment is a minor P2

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[makeIteratorFunc called] --> B{CudfRowTransitions.areAllSupported\nAND 4 < columns <= 100M?}
    B -- No --> C[ColumnarToRowIterator\n slow path]
    B -- Yes --> D{isAcceleratedTransposeSupported\nCuda.getComputeCapabilityMajor > 6?}
    D -- No\nPascal or older --> C
    D -- Yes\nVolta+ --> E[AcceleratedColumnarToRowIterator\n fast GPU path re-enabled by this PR]
Loading

Reviews (3): Last reviewed commit: "signoff" | Re-trigger Greptile

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
@thirtiseven thirtiseven marked this pull request as draft April 23, 2026 04:30
@thirtiseven thirtiseven changed the title [DO NOT MERGE] Re-enable accelerated columnar-to-row path after fix in spark-rapids-jni Re-enable accelerated columnar-to-row path after fix in spark-rapids-jni Apr 27, 2026
@thirtiseven thirtiseven marked this pull request as ready for review April 27, 2026 02:07
@thirtiseven thirtiseven requested a review from revans2 April 27, 2026 02:08
@thirtiseven
Copy link
Copy Markdown
Collaborator Author

build

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] test_hash_multiple_grpby_pivot DATAGEN_SEED=1702610203 fails

2 participants