Summary
In Spark 4.1.0, the ExpressionWithRandomSeed trait added a new abstract method withShiftedSeed.
Details
- Spark Version: 4.1.0
- Change Type: Trait API addition
Spark ≤4.0.x
trait ExpressionWithRandomSeed extends Expression {
  def seedExpression: Expression
  def withNewSeed(seed: Long): Expression
}
Spark 4.1.0+
trait ExpressionWithRandomSeed extends Expression {
  def seedExpression: Expression
  def withNewSeed(seed: Long): Expression
  def withShiftedSeed(shift: Long): Expression // NEW
}
Impact
Classes extending ExpressionWithRandomSeed that don't implement withShiftedSeed will fail to compile:
  class GpuRand needs to be abstract. Missing implementation for member of trait ExpressionWithRandomSeed:
    def withShiftedSeed(shift: Long): org.apache.spark.sql.catalyst.expressions.Expression = ???
Affected Files
sql-plugin/src/main/scala/org/apache/spark/sql/rapids/catalyst/expressions/GpuRandomExpressions.scala
Solution
Add the missing method implementation in GpuRand:
case class GpuRand(child: Expression, doContextCheck: Boolean) extends ShimUnaryExpression
    with ExpectsInputTypes with ExpressionWithRandomSeed with GpuExpressionRetryable {

  // Add this for Spark 4.1.0 compatibility
  override def withShiftedSeed(shift: Long): Expression =
    throw new NotImplementedError("withShiftedSeed not supported on GPU")

  // ... rest of implementation
}
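Note: This implementation throws NotImplementedError because GPU execution doesn't use seed shifting. Any code path that does call withShiftedSeed on GpuRand will therefore fail at runtime rather than at compile time.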
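To make the behavior of this fix concrete, here is a minimal, self-contained sketch that compiles without Spark on the classpath. The names `SeedSketch`, `SeedLiteral`, `SketchRand`, and `shiftIsUnsupported` are hypothetical stand-ins invented for illustration; only the trait shape and the error message come from the text above.

```scala
object SeedSketch {
  // Stand-in for Catalyst's Expression; hypothetical and heavily simplified.
  trait Expression

  // Mirrors the Spark 4.1.0+ trait shape described in this issue.
  trait ExpressionWithRandomSeed extends Expression {
    def seedExpression: Expression
    def withNewSeed(seed: Long): Expression
    def withShiftedSeed(shift: Long): Expression
  }

  // Hypothetical literal that carries a seed value.
  final case class SeedLiteral(value: Long) extends Expression

  // Hypothetical stand-in for GpuRand.
  final case class SketchRand(child: Expression) extends ExpressionWithRandomSeed {
    def seedExpression: Expression = child

    def withNewSeed(seed: Long): Expression = SketchRand(SeedLiteral(seed))

    // Satisfies the new abstract member so the class compiles,
    // but rejects any actual use at runtime.
    def withShiftedSeed(shift: Long): Expression =
      throw new NotImplementedError("withShiftedSeed not supported on GPU")
  }

  // Returns true when withShiftedSeed fails with NotImplementedError.
  def shiftIsUnsupported(e: ExpressionWithRandomSeed): Boolean =
    try { e.withShiftedSeed(1L); false }
    catch { case _: NotImplementedError => true }

  def main(args: Array[String]): Unit = {
    val rand = SketchRand(SeedLiteral(42L))
    println(rand.withNewSeed(7L))
    println(shiftIsUnsupported(rand))
  }
}
```

The sketch omits the `override` modifier on purpose. Scala doesn't require `override` when implementing an abstract member, and it rejects `override` on a method that overrides nothing. If the same source file has to compile against both Spark ≤4.0.x and 4.1.0+ (an assumption about how the file is built, not something stated above), leaving `override` off lets one definition work for both.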
References
- Upstream Spark change added withShiftedSeed for deterministic random number generation in distributed settings