[SPARK-53406][SQL] Avoid unnecessary shuffle join in direct passthrough shuffle id #52443

Open

shujingyang-db wants to merge 8 commits into apache:master from shujingyang-db:shuffle-spec-direct-partition

+359 −5

Contributor

shujingyang-db commented Sep 25, 2025


What changes were proposed in this pull request?

This PR implements compatibility checking for ShufflePartitionIdPassThrough partitioning to avoid unnecessary shuffle operations when both sides of a join use compatible direct partition ID pass-through.
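
As a rough illustration of why the shuffle can be skipped, the sketch below models two inputs that are already split by the same partition id function and the same partition count, so matching keys always sit in the same partition index. Everything here (`CoPartitionedJoin`, `partitionById`, `joinPerPartition`, and the use of a non-negative modulo to derive the partition index) is a hypothetical stand-in and an assumption, not Spark code.

```scala
object CoPartitionedJoin {
  // Hypothetical helper: place each row in the partition named by its id expression.
  // Assumption: the id is reduced with a non-negative modulo of the partition count.
  def partitionById[A](rows: Seq[A], n: Int)(id: A => Int): Map[Int, Seq[A]] =
    rows.groupBy(r => Math.floorMod(id(r), n))

  // Join partition p of the left side only with partition p of the right side.
  // No rows move between partitions, which is the step a shuffle would otherwise do.
  def joinPerPartition(
      left: Seq[(Int, String)],
      right: Seq[(Int, String)],
      n: Int): Seq[(Int, String, String)] = {
    val l = partitionById(left, n)(_._1)
    val r = partitionById(right, n)(_._1)
    (0 until n).flatMap { p =>
      for {
        (lk, lv) <- l.getOrElse(p, Nil)
        (rk, rv) <- r.getOrElse(p, Nil)
        if lk == rk
      } yield (lk, lv, rv)
    }
  }
}
```

If the two sides used different partition counts or different id expressions, equal keys could land in different partition indexes and the per-partition join would miss matches, which is why a compatibility check is needed before skipping the shuffle.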

Why are the changes needed?

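To improve performance: when both sides of a join already use compatible ShufflePartitionIdPassThrough partitioning, an additional shuffle is unnecessary.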

Does this PR introduce any user-facing change?

No

How was this patch tested?

New unit tests

Was this patch authored or co-authored using generative AI tooling?

Yes

shujingyang-db added 3 commits

September 4, 2025 13:49


          init

9e6221d


          Merge remote-tracking branch 'spark/master' into shuffle-spec-direct-…

421fd2f

…partition


          init

62aa22e

github-actions bot added the SQL label

shujingyang-db marked this pull request as ready for review

September 25, 2025 06:51

shujingyang-db added 2 commits

September 24, 2025 23:57


          SinglePartitionShuffleSpec

d2fab0e


          lint

c53123d

zhengruifeng changed the title ~~[SPARK-53406] Avoid unnecessary shuffle join in direct passthrough shuffle id~~ [SPARK-53406][SQL] Avoid unnecessary shuffle join in direct passthrough shuffle id

cloud-fan reviewed

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala Outdated

    
                  numPartitions: Int) extends Expression with Partitioning with Unevaluable {

                // TODO(SPARK-53401): Support Shuffle Spec in Direct Partition ID Pass Through

                // We don't support creating partitioning for ShufflePartitionIdPassThrough.

Contributor

cloud-fan Sep 25, 2025

This comment should be put in ShufflePartitionIdPassThroughSpec#canCreatePartitioning

cloud-fan reviewed

sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/ShuffleSpecSuite.scala Outdated

    
                  )

                  // Mismatched key positions should be incompatible

                  val dist1 = ClusteredDistribution(Seq($"a", $"b"))

Contributor

cloud-fan Sep 25, 2025

This is the same as dist?

Member

szehon-ho Oct 1, 2025

I was going to wait until @cloud-fan's comments were addressed :) but I agree this test is a bit hard to read. The variable declarations are not consistent (i.e., some are at the very top, others just above the check). Also, as Wenchen pointed out, some variables like ClusteredDistribution(Seq($"a", $"b")) are used in some places but not everywhere, and ClusteredDistribution(Seq($"c", $"d")) is repeated but never assigned to a variable.

How about, we make some better names too, like ab, cd, a, b? (or a bit longer if necessary)

checkCompatible(
  ShufflePartitionIdPassThrough(b, 10).createDist(ab),
  ShufflePartitionIdPassThrough(c, 10).createDist(cd),
  expected = false
)

Contributor Author

shujingyang-db Oct 1, 2025

thanks for calling it out! I followed this and updated all tests in this pr

cloud-fan reviewed

sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala Outdated

    
                      newChildren.isDefined

                    }

                    val isShufflePassThroughCompatible = !isKeyGroupCompatible &&

Contributor

cloud-fan Sep 25, 2025

shall we combine this with the key group compatibility check?

      // Check if the following conditions are satisfied:
      //   1. There are exactly two children (e.g., join). Note that Spark doesn't support
      //      multi-way join at the moment, so this check should be sufficient.
      //   2. All children are of the same partitioning, and they are compatible with each other
      // If both are true, skip shuffle.
      val areChildrenCompatible = parent.isDefined &&
          children.length == 2 && childrenIndexes.length == 2 && {
        // key group check and id pass-through check
      }
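
A minimal sketch of what a single combined predicate could look like, assuming simplified stand-in types. `Spec`, `KeyGrouped`, `PassThrough`, `Hash`, and `specsCompatible` are hypothetical, and the per-case compatibility rules are placeholders rather than the logic in EnsureRequirements.

```scala
object CombinedCheck {
  // Hypothetical, simplified specs; Spark's real shuffle specs carry more state.
  sealed trait Spec
  final case class KeyGrouped(keys: Seq[String]) extends Spec
  final case class PassThrough(key: String, numPartitions: Int) extends Spec
  final case class Hash(keys: Seq[String], numPartitions: Int) extends Spec

  // One dispatch point for both kinds of check. The rules are placeholders.
  def specsCompatible(l: Spec, r: Spec): Boolean = (l, r) match {
    case (KeyGrouped(a), KeyGrouped(b))         => a.length == b.length
    case (PassThrough(_, n), PassThrough(_, m)) => n == m
    case _                                      => false
  }

  // Mirrors the shape of the suggestion: a parent exists, there are exactly
  // two children, and the pair is compatible under either kind of check.
  def areChildrenCompatible(parentDefined: Boolean, specs: Seq[Spec]): Boolean =
    parentDefined && specs.length == 2 && specsCompatible(specs(0), specs(1))
}
```

Folding both checks into one predicate removes the need for a guard such as `!isKeyGroupCompatible &&` on the second check, because each pair of children is matched by exactly one case.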

cloud-fan reviewed

sql/core/src/test/scala/org/apache/spark/sql/execution/exchange/EnsureRequirementsSuite.scala Outdated

    
                  TransformExpression(BucketFunction, expr, Some(numBuckets))

                }

                test("ShufflePartitionIdPassThrough - avoid necessary shuffle when they are compatible") {

Contributor

cloud-fan Sep 25, 2025


Suggested change

      
              test("ShufflePartitionIdPassThrough - avoid necessary shuffle when they are compatible") {
          
              test("ShufflePartitionIdPassThrough - avoid unnecessary shuffle when children are compatible") {

cloud-fan reviewed

sql/core/src/test/scala/org/apache/spark/sql/execution/exchange/EnsureRequirementsSuite.scala Outdated

    
                    val plan2 = DummySparkPlan(

                      outputPartitioning = ShufflePartitionIdPassThrough(DirectShufflePartitionID(exprC), 5))

                    // Join on different keys than partitioning keys

                    val smjExec = SortMergeJoinExec(exprB :: Nil, exprD :: Nil, Inner, None, plan1, plan2)

Contributor

cloud-fan Sep 25, 2025

Suggested change

      
                  val smjExec = SortMergeJoinExec(exprB :: Nil, exprD :: Nil, Inner, None, plan1, plan2)
          
                  val smjExec = SortMergeJoinExec(exprA :: exprB :: Nil, exprD :: exprC :: Nil, Inner, None, plan1, plan2)

cloud-fan reviewed

sql/core/src/test/scala/org/apache/spark/sql/execution/exchange/EnsureRequirementsSuite.scala

    
                      case SortMergeJoinExec(_, _, _, _,

                        SortExec(_, _, _: DummySparkPlan, _),

                        SortExec(_, _, ShuffleExchangeExec(_: HashPartitioning, _, _, _), _), _) =>

                        // Right side shuffled, left side kept as-is

Contributor

cloud-fan Sep 25, 2025

the test result is random and these two cases can both happen?

Member

szehon-ho Oct 1, 2025

Agreed, I don't get why the return value matches both.
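
One way to read the concern in this thread: a test that accepts two different plan shapes cannot catch a regression that flips between them. The toy model below, with hypothetical `Plan`, `Leaf`, `Shuffle`, and `Join` types that are not Spark classes, shows an assertion that accepts exactly one shape.

```scala
object PlanShape {
  // Hypothetical miniature plan tree, only for illustrating the assertion style.
  sealed trait Plan
  case object Leaf extends Plan
  final case class Shuffle(child: Plan) extends Plan
  final case class Join(left: Plan, right: Plan) extends Plan

  // Accept only "left kept as-is, right shuffled"; every other shape fails.
  def onlyRightShuffled(plan: Plan): Boolean = plan match {
    case Join(Leaf, Shuffle(Leaf)) => true
    case _                         => false
  }
}
```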

cloud-fan reviewed

sql/core/src/test/scala/org/apache/spark/sql/execution/exchange/EnsureRequirementsSuite.scala

    
                  }

                }

                test("ShufflePartitionIdPassThrough - cross position matching behavior") {

Contributor

cloud-fan Sep 25, 2025

This looks the same as ShufflePartitionIdPassThrough incompatibility - key position mismatch?

cloud-fan reviewed

sql/core/src/test/scala/org/apache/spark/sql/execution/exchange/EnsureRequirementsSuite.scala Outdated

    
                  }

                }

                test("ShufflePartitionIdPassThrough - compatible when partition key matches at any position") {

Contributor

cloud-fan Sep 25, 2025

can we merge this test case into ShufflePartitionIdPassThrough - compatible with multiple clustering keys?


          address comments

4e52066

szehon-ho reviewed

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala Outdated

    
                    // as the partitioning expression, we check compatibility as follows:

                    // 1. Same number of clustering expressions

                    // 2. Same number of partitions

                    // 3. each pair of partitioning expression from both sides has overlapping positions in their

Member

szehon-ho Sep 26, 2025

Sorry, I'm a bit new to this and curious: do we need to actually check whether the partition expression of repartitionByExpression() is compatible (like for KeyGroupedPartitioning) before we can skip the shuffle, or am I missing something?

Contributor

cloud-fan Sep 26, 2025

We don't need to, as it's just an expression without any data so it's more similar to the hash partitioning, while key grouped partitioning has actual partition data.

Member

szehon-ho Sep 27, 2025

Ah, got it. I read the comment on DirectShufflePartitionID that explains the limitation on the child expression; I didn't see that earlier.
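
The compatibility rules quoted at the top of this thread can be sketched as a small standalone model. `PassThroughSpec`, `keyPositions`, and `isCompatible` are hypothetical names, not Spark's classes, and the reading of rule 3 (the partition key must appear at a shared position in both sides' clustering keys) is an assumption, since the quoted comment is cut off.

```scala
object PassThroughCompat {
  // Hypothetical, simplified stand-in for a pass-through shuffle spec.
  final case class PassThroughSpec(
      partitionKey: String,          // the single expression that yields the partition id
      numPartitions: Int,
      clusteringKeys: Seq[String]) { // the join keys required on this side

    // Positions in the clustering keys at which the partition key appears.
    def keyPositions: Set[Int] =
      clusteringKeys.zipWithIndex.collect { case (k, i) if k == partitionKey => i }.toSet
  }

  def isCompatible(left: PassThroughSpec, right: PassThroughSpec): Boolean =
    // Rule 1: same number of clustering expressions.
    left.clusteringKeys.length == right.clusteringKeys.length &&
      // Rule 2: same number of partitions.
      left.numPartitions == right.numPartitions &&
      // Rule 3 (assumed reading): the partition keys share at least one position.
      left.keyPositions.intersect(right.keyPositions).nonEmpty
}
```

Under this model, partitioning the left side by `b` and the right side by `c` for a join on `(a, b) = (c, d)` is incompatible, because `b` sits at position 1 and `c` at position 0. Note that, as explained above, only positions and partition counts are compared; no partition data is inspected.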

shujingyang-db requested review from cloud-fan and szehon-ho

September 26, 2025 18:40

szehon-ho reviewed

Member

szehon-ho left a comment

Code changes make sense to me


          address comments

69df835

shujingyang-db requested a review from szehon-ho

September 29, 2025 07:03

ckp

f907b5b

Labels

SQL