diff --git a/docs/additional-functionality/advanced_configs.md b/docs/additional-functionality/advanced_configs.md
index 3915309d509..22f495347e0 100644
--- a/docs/additional-functionality/advanced_configs.md
+++ b/docs/additional-functionality/advanced_configs.md
@@ -197,6 +197,7 @@ Name | SQL Function(s) | Description | Default Value | Notes
 spark.rapids.sql.expression.Alias| |Gives a column a name|true|None|
 spark.rapids.sql.expression.And|`and`|Logical AND|true|None|
 spark.rapids.sql.expression.AnsiCast| |Convert a column of one type of data into another type|true|None|
+spark.rapids.sql.expression.ArrayAggregate|`aggregate`|Aggregate elements in an array using an accumulator function and finishing transformation. Currently only lambdas of the form (acc, x) -> op(acc, g(x)) with an identity finish are executed on the GPU, where op is one of SUM/PRODUCT/MAX/MIN/ALL/ANY. If/CaseWhen branches are accepted as long as each branch is itself op-of-acc (or bare acc) with op consistent across branches; other shapes fall back to CPU.|true|None|
 spark.rapids.sql.expression.ArrayContains|`array_contains`|Returns a boolean if the array contains the passed in key|true|None|
 spark.rapids.sql.expression.ArrayDistinct|`array_distinct`|Removes duplicate values from the array|true|None|
 spark.rapids.sql.expression.ArrayExcept|`array_except`|Returns an array of the elements in array1 but not in array2, without duplicates|true|This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal, but the CPU implementation currently does not (see SPARK-39845). Also, Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+|
diff --git a/docs/supported_ops.md b/docs/supported_ops.md
index 3a8a59d466c..4c002359a0c 100644
--- a/docs/supported_ops.md
+++ b/docs/supported_ops.md
@@ -2357,6 +2357,154 @@ are limited.
+ArrayAggregate
+`aggregate`
+Aggregate elements in an array using an accumulator function and finishing transformation. 
Currently only lambdas of the form (acc, x) -> op(acc, g(x)) with an identity finish are executed on the GPU, where op is one of SUM/PRODUCT/MAX/MIN/ALL/ANY. If/CaseWhen branches are accepted as long as each branch is itself op-of-acc (or bare acc) with op consistent across branches; other shapes fall back to CPU. +None +project +zero +S +S +S +S +S +S +S +S +PS
UTC is only supported TZ for TIMESTAMP
+S +S +NS +NS +NS +NS +NS +NS +NS +NS +NS + + +result +S +S +S +S +S +S +S +S +PS
UTC is only supported TZ for TIMESTAMP
+S +S +NS +NS +NS +NS +NS +NS +NS +NS +NS + + +finish +S +S +S +S +S +S +S +S +PS
UTC is only supported TZ for TIMESTAMP
+S +S +NS +NS +NS +NS +NS +NS +NS +NS +NS + + +merge +S +S +S +S +S +S +S +S +PS
UTC is only supported TZ for TIMESTAMP
+S +S +NS +NS +NS +NS +NS +NS +NS +NS +NS + + +argument + + + + + + + + + + + + + + +PS
UTC is only supported TZ for child TIMESTAMP;
unsupported child types CALENDAR, ARRAY, MAP, UDT, DAYTIME, YEARMONTH
+ + + + + + + +Expression +SQL Functions(s) +Description +Notes +Context +Param/Output +BOOLEAN +BYTE +SHORT +INT +LONG +FLOAT +DOUBLE +DATE +TIMESTAMP +STRING +DECIMAL +NULL +BINARY +CALENDAR +ARRAY +MAP +STRUCT +UDT +DAYTIME +YEARMONTH + + ArrayContains `array_contains` Returns a boolean if the array contains the passed in key @@ -2482,34 +2630,6 @@ are limited. -Expression -SQL Functions(s) -Description -Notes -Context -Param/Output -BOOLEAN -BYTE -SHORT -INT -LONG -FLOAT -DOUBLE -DATE -TIMESTAMP -STRING -DECIMAL -NULL -BINARY -CALENDAR -ARRAY -MAP -STRUCT -UDT -DAYTIME -YEARMONTH - - ArrayExcept `array_except` Returns an array of the elements in array1 but not in array2, without duplicates @@ -2806,6 +2926,34 @@ are limited. +Expression +SQL Functions(s) +Description +Notes +Context +Param/Output +BOOLEAN +BYTE +SHORT +INT +LONG +FLOAT +DOUBLE +DATE +TIMESTAMP +STRING +DECIMAL +NULL +BINARY +CALENDAR +ARRAY +MAP +STRUCT +UDT +DAYTIME +YEARMONTH + + ArrayJoin `array_join` Concatenates the elements of the given array using the delimiter and an optional string to replace nulls. If no value is set for nullReplacement, any null value is filtered. @@ -2903,34 +3051,6 @@ are limited. -Expression -SQL Functions(s) -Description -Notes -Context -Param/Output -BOOLEAN -BYTE -SHORT -INT -LONG -FLOAT -DOUBLE -DATE -TIMESTAMP -STRING -DECIMAL -NULL -BINARY -CALENDAR -ARRAY -MAP -STRUCT -UDT -DAYTIME -YEARMONTH - - ArrayMax `array_max` Returns the maximum value in the array @@ -3127,7 +3247,7 @@ are limited. NS NS NS -PS
UTC is only supported TZ for child TIMESTAMP;
unsupported child types BINARY, CALENDAR, UDT, DAYTIME, YEARMONTH
+PS
UTC is only supported TZ for child TIMESTAMP;
unsupported child types BINARY, CALENDAR, ARRAY, MAP, UDT, DAYTIME, YEARMONTH
NS NS NS @@ -3150,9 +3270,9 @@ are limited. S NS NS -PS
UTC is only supported TZ for child TIMESTAMP;
unsupported child types BINARY, CALENDAR, UDT, DAYTIME, YEARMONTH
-PS
UTC is only supported TZ for child TIMESTAMP;
unsupported child types BINARY, CALENDAR, UDT, DAYTIME, YEARMONTH
-PS
UTC is only supported TZ for child TIMESTAMP;
unsupported child types BINARY, CALENDAR, UDT, DAYTIME, YEARMONTH
+NS +NS +PS
UTC is only supported TZ for child TIMESTAMP;
unsupported child types BINARY, CALENDAR, ARRAY, MAP, UDT, DAYTIME, YEARMONTH
NS NS NS @@ -3173,7 +3293,7 @@ are limited. -PS
UTC is only supported TZ for child TIMESTAMP;
unsupported child types BINARY, CALENDAR, UDT, DAYTIME, YEARMONTH
+PS
UTC is only supported TZ for child TIMESTAMP;
unsupported child types BINARY, CALENDAR, ARRAY, MAP, UDT, DAYTIME, YEARMONTH
@@ -3255,6 +3375,34 @@ are limited. +Expression +SQL Functions(s) +Description +Notes +Context +Param/Output +BOOLEAN +BYTE +SHORT +INT +LONG +FLOAT +DOUBLE +DATE +TIMESTAMP +STRING +DECIMAL +NULL +BINARY +CALENDAR +ARRAY +MAP +STRUCT +UDT +DAYTIME +YEARMONTH + + ArrayTransform `transform` Transform elements in an array using the transform function. This is similar to a `map` in functional programming @@ -3329,34 +3477,6 @@ are limited. -Expression -SQL Functions(s) -Description -Notes -Context -Param/Output -BOOLEAN -BYTE -SHORT -INT -LONG -FLOAT -DOUBLE -DATE -TIMESTAMP -STRING -DECIMAL -NULL -BINARY -CALENDAR -ARRAY -MAP -STRUCT -UDT -DAYTIME -YEARMONTH - - ArrayUnion `array_union` Returns an array of the elements in the union of array1 and array2, without duplicates. @@ -3705,6 +3825,34 @@ are limited. +Expression +SQL Functions(s) +Description +Notes +Context +Param/Output +BOOLEAN +BYTE +SHORT +INT +LONG +FLOAT +DOUBLE +DATE +TIMESTAMP +STRING +DECIMAL +NULL +BINARY +CALENDAR +ARRAY +MAP +STRUCT +UDT +DAYTIME +YEARMONTH + + Asinh `asinh` Inverse hyperbolic sine @@ -3803,34 +3951,6 @@ are limited. -Expression -SQL Functions(s) -Description -Notes -Context -Param/Output -BOOLEAN -BYTE -SHORT -INT -LONG -FLOAT -DOUBLE -DATE -TIMESTAMP -STRING -DECIMAL -NULL -BINARY -CALENDAR -ARRAY -MAP -STRUCT -UDT -DAYTIME -YEARMONTH - - AtLeastNNonNulls Checks if number of non null/Nan values is greater than a given value @@ -4130,6 +4250,34 @@ are limited. S +Expression +SQL Functions(s) +Description +Notes +Context +Param/Output +BOOLEAN +BYTE +SHORT +INT +LONG +FLOAT +DOUBLE +DATE +TIMESTAMP +STRING +DECIMAL +NULL +BINARY +CALENDAR +ARRAY +MAP +STRUCT +UDT +DAYTIME +YEARMONTH + + BRound `bround` Round an expression to d decimal places using HALF_EVEN rounding mode @@ -4204,34 +4352,6 @@ are limited. 
-Expression -SQL Functions(s) -Description -Notes -Context -Param/Output -BOOLEAN -BYTE -SHORT -INT -LONG -FLOAT -DOUBLE -DATE -TIMESTAMP -STRING -DECIMAL -NULL -BINARY -CALENDAR -ARRAY -MAP -STRUCT -UDT -DAYTIME -YEARMONTH - - Bin `bin` Returns the string representation of the long value `expr` represented in binary @@ -4529,6 +4649,34 @@ are limited. +Expression +SQL Functions(s) +Description +Notes +Context +Param/Output +BOOLEAN +BYTE +SHORT +INT +LONG +FLOAT +DOUBLE +DATE +TIMESTAMP +STRING +DECIMAL +NULL +BINARY +CALENDAR +ARRAY +MAP +STRUCT +UDT +DAYTIME +YEARMONTH + + BitwiseNot `~` Returns the bitwise NOT of the operands @@ -4627,34 +4775,6 @@ are limited. -Expression -SQL Functions(s) -Description -Notes -Context -Param/Output -BOOLEAN -BYTE -SHORT -INT -LONG -FLOAT -DOUBLE -DATE -TIMESTAMP -STRING -DECIMAL -NULL -BINARY -CALENDAR -ARRAY -MAP -STRUCT -UDT -DAYTIME -YEARMONTH - - BitwiseOr `\|` Returns the bitwise OR of the operands @@ -4943,6 +5063,34 @@ are limited. +Expression +SQL Functions(s) +Description +Notes +Context +Param/Output +BOOLEAN +BYTE +SHORT +INT +LONG +FLOAT +DOUBLE +DATE +TIMESTAMP +STRING +DECIMAL +NULL +BINARY +CALENDAR +ARRAY +MAP +STRUCT +UDT +DAYTIME +YEARMONTH + + BloomFilterMightContain Bloom filter query @@ -5017,34 +5165,6 @@ are limited. -Expression -SQL Functions(s) -Description -Notes -Context -Param/Output -BOOLEAN -BYTE -SHORT -INT -LONG -FLOAT -DOUBLE -DATE -TIMESTAMP -STRING -DECIMAL -NULL -BINARY -CALENDAR -ARRAY -MAP -STRUCT -UDT -DAYTIME -YEARMONTH - - BoundReference Reference to a bound variable @@ -5371,6 +5491,34 @@ are limited. +Expression +SQL Functions(s) +Description +Notes +Context +Param/Output +BOOLEAN +BYTE +SHORT +INT +LONG +FLOAT +DOUBLE +DATE +TIMESTAMP +STRING +DECIMAL +NULL +BINARY +CALENDAR +ARRAY +MAP +STRUCT +UDT +DAYTIME +YEARMONTH + + Coalesce `coalesce` Returns the first non-null argument if exists. Otherwise, null @@ -5422,34 +5570,6 @@ are limited. 
S -Expression -SQL Functions(s) -Description -Notes -Context -Param/Output -BOOLEAN -BYTE -SHORT -INT -LONG -FLOAT -DOUBLE -DATE -TIMESTAMP -STRING -DECIMAL -NULL -BINARY -CALENDAR -ARRAY -MAP -STRUCT -UDT -DAYTIME -YEARMONTH - - Concat `concat` List/String concatenate diff --git a/integration_tests/src/main/python/higher_order_functions_test.py b/integration_tests/src/main/python/higher_order_functions_test.py index 23d61793b46..1d06e64b3cd 100644 --- a/integration_tests/src/main/python/higher_order_functions_test.py +++ b/integration_tests/src/main/python/higher_order_functions_test.py @@ -1,4 +1,4 @@ -# Copyright (c) 2023, NVIDIA CORPORATION. +# Copyright (c) 2023-2026, NVIDIA CORPORATION. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. @@ -12,8 +12,11 @@ # See the License for the specific language governing permissions and # limitations under the License. -from asserts import assert_gpu_and_cpu_are_equal_collect -from marks import ignore_order +import pytest + +from asserts import assert_gpu_and_cpu_are_equal_collect, assert_gpu_fallback_collect +from data_gen import * +from marks import allow_non_gpu, disable_ansi_mode, ignore_order @ignore_order(local=True) @@ -29,3 +32,257 @@ def do_project(spark): return df.selectExpr( "transform(c, (v, i) -> named_struct('x', c[i].x, 'y', c[i].y)) AS t") assert_gpu_and_cpu_are_equal_collect(do_project, conf=confs) + + +@pytest.mark.parametrize('lambda_sql, init_sql, gen_max', [ + ('(acc, x) -> acc + CAST(x as BIGINT)', '0L', 100), + ('(acc, x) -> acc * CAST(x as BIGINT)', '1L', 3), + ('(acc, x) -> greatest(acc, CAST(x as BIGINT))', '-9223372036854775808L', 100), + ('(acc, x) -> least(acc, CAST(x as BIGINT))', '9223372036854775807L', 100), +], ids=['sum', 'product', 'max', 'min']) +@disable_ansi_mode +def test_array_aggregate_numeric_ops(lambda_sql, init_sql, gen_max): + gen = IntegerGen(min_val=-gen_max, max_val=gen_max) + def 
do_it(spark):
+        return unary_op_df(spark, ArrayGen(gen, max_length=8)).selectExpr(
+            f'aggregate(a, {init_sql}, {lambda_sql}) as res')
+    assert_gpu_and_cpu_are_equal_collect(do_it)
+
+
+@pytest.mark.parametrize('gen, lambda_sql, init_sql', [
+    (IntegerGen(min_val=-100, max_val=100), '(acc, x) -> acc + x', '0'),
+    (LongGen(min_val=-100, max_val=100), '(acc, x) -> acc + x', '0L'),
+    (IntegerGen(min_val=-100, max_val=100),
+     '(acc, x) -> greatest(acc, x)', 'CAST(-9999 as INT)'),
+    (LongGen(min_val=-100, max_val=100),
+     '(acc, x) -> least(acc, x)', '9223372036854775807L'),
+], ids=['int-sum', 'long-sum', 'int-max', 'long-min'])
+@disable_ansi_mode
+def test_array_aggregate_native_integer_ops(gen, lambda_sql, init_sql):
+    def do_it(spark):
+        return unary_op_df(spark, ArrayGen(gen, max_length=8)).selectExpr(
+            f'aggregate(a, {init_sql}, {lambda_sql}) as res')
+    assert_gpu_and_cpu_are_equal_collect(do_it)
+
+
+# Elements are non-null because the tag-time guard falls back to CPU when the element type
+# is nullable.
+@pytest.mark.parametrize('lambda_sql, init_sql', [
+    ('(acc, x) -> acc AND x', 'true'),
+    ('(acc, x) -> acc OR x', 'false'),
+], ids=['all', 'any'])
+@disable_ansi_mode
+def test_array_aggregate_boolean_ops(lambda_sql, init_sql):
+    non_null_bool = BooleanGen(nullable=False)
+    def do_it(spark):
+        return unary_op_df(spark, ArrayGen(non_null_bool, max_length=8)).selectExpr(
+            f'aggregate(a, {init_sql}, {lambda_sql}) as res')
+    assert_gpu_and_cpu_are_equal_collect(do_it)
+
+
+@pytest.mark.parametrize('lambda_sql, init_sql', [
+    ('(acc, x) -> acc AND x', 'true'),
+    ('(acc, x) -> acc OR x', 'false'),
+], ids=['all', 'any'])
+@disable_ansi_mode
+@allow_non_gpu('ProjectExec')
+def test_array_aggregate_boolean_ops_nullable_elements_fallback(lambda_sql, init_sql):
+    assert_gpu_fallback_collect(
+        lambda spark: unary_op_df(spark, ArrayGen(boolean_gen, max_length=8)).selectExpr(
+            f'aggregate(a, {init_sql}, {lambda_sql}) as res'),
+        'ArrayAggregate')
+
+
+@disable_ansi_mode
+def test_array_aggregate_count_if_int():
+    assert_gpu_and_cpu_are_equal_collect(
+        lambda spark: unary_op_df(spark, ArrayGen(int_gen, max_length=15)).selectExpr(
+            'aggregate(a, 0, (acc, x) -> acc + CASE WHEN x > 0 THEN 1 ELSE 0 END) as pos_cnt',
+            'aggregate(a, 0L, (acc, x) -> acc + CAST(CASE WHEN x IS NULL THEN 1 ELSE 0 END as BIGINT)) as null_cnt'))
+
+
+# `if(cond, acc + t, acc)` shape — branches lifted via op identity. Same count-if
+# pattern as above but written naturally instead of using `CASE WHEN ... THEN 1 ELSE 0`.
+@disable_ansi_mode
+def test_array_aggregate_if_count():
+    assert_gpu_and_cpu_are_equal_collect(
+        lambda spark: unary_op_df(spark, ArrayGen(int_gen, max_length=15)).selectExpr(
+            'aggregate(a, 0L, (acc, x) -> if(x > 0, acc + 1L, acc)) as pos_cnt',
+            'aggregate(a, 0L, (acc, x) -> if(x is null, acc, acc + 1L)) as nonnull_cnt'))
+
+
+# CaseWhen with several acc+t branches and a bare-acc else.
+@disable_ansi_mode
+def test_array_aggregate_casewhen_multi_branch():
+    assert_gpu_and_cpu_are_equal_collect(
+        lambda spark: unary_op_df(spark, ArrayGen(int_gen, max_length=15)).selectExpr(
+            '''aggregate(a, 0L,
+                 (acc, x) -> CASE
+                     WHEN x > 100 THEN acc + 10L
+                     WHEN x > 0 THEN acc + 1L
+                     ELSE acc
+                 END) as weighted_cnt'''))
+
+
+@disable_ansi_mode
+def test_array_aggregate_with_filter_and_split():
+    field_gen = StringGen('[a-z]{2}')
+    def do_it(spark):
+        df = unary_op_df(spark, ArrayGen(field_gen, max_length=5))
+        return df.selectExpr("""
+            aggregate(
+                filter(transform(a, x -> concat_ws(' ', x, x, x, x, x)), z -> z != ''),
+                0L,
+                (acc, z) -> acc + CAST(CASE WHEN (
+                    size(split(z, ' ', -1)) > 2
+                    AND split(z, ' ', -1)[2] IN ('aa', 'bb')
+                    AND NOT split(z, ' ', -1)[1] IN ('xx', 'yy')
+                ) THEN 1 ELSE 0 END as BIGINT),
+                id -> id
+            ) as cnt""")
+    assert_gpu_and_cpu_are_equal_collect(do_it)
+
+
+@disable_ansi_mode
+def test_array_aggregate_non_zero_init():
+    assert_gpu_and_cpu_are_equal_collect(
+        lambda spark: unary_op_df(spark, ArrayGen(int_gen, max_length=10)).selectExpr(
+            'aggregate(a, 100L, (acc, x) -> acc + CAST(x as BIGINT)) as sum_with_init'))
+
+
+@disable_ansi_mode
+def test_array_aggregate_null_array():
+    assert_gpu_and_cpu_are_equal_collect(
+        lambda spark: unary_op_df(spark, ArrayGen(int_gen, all_null=True)).selectExpr(
+            'aggregate(a, 0L, (acc, x) -> acc + CAST(x as BIGINT)) as should_be_null'))
+
+
+@disable_ansi_mode
+def test_array_aggregate_empty_array():
+    def do_it(spark):
+        return spark.createDataFrame(
+            [([1, 2, 3],), ([],), ([7],), ([],)],
+            'a array<int>').selectExpr(
+            'aggregate(a, 42L, (acc, x) -> acc + CAST(x as BIGINT)) as sum_with_empty')
+    assert_gpu_and_cpu_are_equal_collect(do_it)
+
+
+@disable_ansi_mode
+def test_array_aggregate_lambda_refs_outer_column():
+    def do_it(spark):
+        return two_col_df(spark, ArrayGen(int_gen, max_length=10), int_gen).selectExpr(
+            'aggregate(a, 0L, (acc, x) -> acc + CAST(x + b as BIGINT)) as sum_with_outer')
+    assert_gpu_and_cpu_are_equal_collect(do_it)
+
+
+@disable_ansi_mode
+def test_array_aggregate_zero_is_outer_column():
+    def do_it(spark):
+        return two_col_df(spark, ArrayGen(int_gen, max_length=10), long_gen).selectExpr(
+            'aggregate(a, b, (acc, x) -> acc + CAST(x as BIGINT)) as sum_from_col')
+    assert_gpu_and_cpu_are_equal_collect(do_it)
+
+
+@disable_ansi_mode
+def test_array_aggregate_over_struct_field():
+    def do_it(spark):
+        elem_gen = StructGen([['i', int_gen]], nullable=False)
+        return unary_op_df(spark, ArrayGen(elem_gen, max_length=10)).selectExpr(
+            'aggregate(a, 0L, (acc, s) -> acc + CAST(s.i as BIGINT)) as sum_field')
+    assert_gpu_and_cpu_are_equal_collect(do_it)
+
+
+@disable_ansi_mode
+def test_array_aggregate_over_binary():
+    # GpuLength only accepts STRING, so we hex(binary) → string first to keep the
+    # whole lambda on the GPU. Result: 2 × byte count of each element, summed.
+    def do_it(spark):
+        return unary_op_df(spark, ArrayGen(BinaryGen(max_length=10), max_length=8)).selectExpr(
+            'aggregate(a, 0L, (acc, x) -> acc + CAST(length(hex(x)) as BIGINT)) as total_hex_len')
+    assert_gpu_and_cpu_are_equal_collect(do_it)
+
+
+@disable_ansi_mode
+def test_array_aggregate_deeper_g_body():
+    assert_gpu_and_cpu_are_equal_collect(
+        lambda spark: unary_op_df(spark, ArrayGen(int_gen, max_length=10)).selectExpr(
+            'aggregate(a, 0L, (acc, x) -> acc + CAST(x * 2 + 1 as BIGINT)) as sum_poly'))
+
+
+# Long overflow wraps in non-ANSI mode on both Spark SUM and cuDF SUM.
+@disable_ansi_mode
+def test_array_aggregate_long_overflow_wraps():
+    def do_it(spark):
+        big = LongGen(min_val=9223372036854775000, max_val=9223372036854775700, nullable=False)
+        return unary_op_df(spark, ArrayGen(big, min_length=5, max_length=15)).selectExpr(
+            'aggregate(a, 0L, (acc, x) -> acc + x) as wrapped_sum')
+    assert_gpu_and_cpu_are_equal_collect(do_it)
+
+
+@disable_ansi_mode
+def test_array_aggregate_decimal_sum():
+    decimal_gen = DecimalGen(precision=10, scale=2)
+    def do_it(spark):
+        return unary_op_df(spark, ArrayGen(decimal_gen, max_length=8)).selectExpr(
+            'aggregate(a, CAST(0 as DECIMAL(38,2)), '
+            '(acc, x) -> acc + CAST(x as DECIMAL(38,2))) as dec_sum')
+    assert_gpu_and_cpu_are_equal_collect(do_it)
+
+
+@pytest.mark.parametrize('lambda_sql, init_sql', [
+    ('(acc, x) -> acc - CAST(x as BIGINT)', '0L'),
+    ('(acc, x) -> CAST(acc / CAST(x + 1 as BIGINT) as BIGINT)', '1L'),
+    ('(acc, x) -> greatest(acc, CAST(x as BIGINT), CAST(x * 2 as BIGINT))', '-999L'),
+    ('(acc, x) -> acc + acc * CAST(x as BIGINT)', '0L'),
+], ids=['subtract', 'divide', 'greatest-3ary', 'g-refs-acc'])
+@disable_ansi_mode
+@allow_non_gpu('ProjectExec')
+def test_array_aggregate_fallback_shapes(lambda_sql, init_sql):
+    assert_gpu_fallback_collect(
+        lambda spark: unary_op_df(spark, ArrayGen(int_gen, max_length=5)).selectExpr(
+            f'aggregate(a, {init_sql}, {lambda_sql}) as res'),
+        'ArrayAggregate')
+
+
+@disable_ansi_mode
+@allow_non_gpu('ProjectExec')
+def test_array_aggregate_non_identity_finish_falls_back():
+    assert_gpu_fallback_collect(
+        lambda spark: unary_op_df(spark, ArrayGen(int_gen, max_length=5)).selectExpr(
+            'aggregate(a, 0L, (acc, x) -> acc + CAST(x as BIGINT), acc -> acc * 2) as doubled'),
+        'ArrayAggregate')
+
+
+@pytest.mark.parametrize('lambda_sql, init_sql', [
+    ('(acc, x) -> greatest(acc, x)', 'CAST("-Infinity" as DOUBLE)'),
+    ('(acc, x) -> least(acc, x)', 'CAST("Infinity" as DOUBLE)'),
+], ids=['max', 'min'])
+@disable_ansi_mode
+@allow_non_gpu('ProjectExec')
+def test_array_aggregate_double_extremum_falls_back(lambda_sql, init_sql): + assert_gpu_fallback_collect( + lambda spark: unary_op_df(spark, ArrayGen(double_gen, max_length=5)).selectExpr( + f'aggregate(a, {init_sql}, {lambda_sql}) as res'), + 'ArrayAggregate') + + +# SUM / PRODUCT on FLOAT and DOUBLE: cuDF's parallel tree-reduction sums in a different +# order than Spark's sequential left-fold, so GPU vs CPU can differ in the low bits. Gated +# by `spark.rapids.sql.variableFloatAgg.enabled` (default true, same as scalar GpuSum) — +# we only verify the fallback path here, since the GPU path under default conf accepts +# minor numeric divergence and cannot use strict-equality assertions. +@pytest.mark.parametrize('elem_gen, lambda_sql, init_sql', [ + (float_gen, '(acc, x) -> acc + x', 'CAST(0 as FLOAT)'), + (double_gen, '(acc, x) -> acc + x', 'CAST(0 as DOUBLE)'), + (float_gen, '(acc, x) -> acc * x', 'CAST(1 as FLOAT)'), + (double_gen, '(acc, x) -> acc * x', 'CAST(1 as DOUBLE)'), +], ids=['float-sum', 'double-sum', 'float-product', 'double-product']) +@disable_ansi_mode +@allow_non_gpu('ProjectExec') +def test_array_aggregate_float_sum_product_falls_back_when_variable_float_agg_disabled( + elem_gen, lambda_sql, init_sql): + assert_gpu_fallback_collect( + lambda spark: unary_op_df(spark, ArrayGen(elem_gen, max_length=5)).selectExpr( + f'aggregate(a, {init_sql}, {lambda_sql}) as res'), + 'ArrayAggregate', + conf={'spark.rapids.sql.variableFloatAgg.enabled': 'false'}) diff --git a/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala b/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala index 8790334e48a..3f991450081 100644 --- a/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala +++ b/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala @@ -2962,6 +2962,31 @@ object GpuOverrides extends Logging { ) } }), + expr[ArrayAggregate]( + "Aggregate elements in an array using an accumulator function and 
finishing " + + "transformation. Currently only lambdas of the form (acc, x) -> op(acc, g(x)) with " + + "an identity finish are executed on the GPU, where op is one of SUM/PRODUCT/MAX/" + + "MIN/ALL/ANY. If/CaseWhen branches are accepted as long as each branch is itself " + + "op-of-acc (or bare acc) with op consistent across branches; other shapes fall " + + "back to CPU.", + ExprChecks.projectOnly( + TypeSig.commonCudfTypes + TypeSig.DECIMAL_128, + TypeSig.all, + Seq( + ParamCheck("argument", + TypeSig.ARRAY.nested(TypeSig.commonCudfTypes + TypeSig.DECIMAL_128 + TypeSig.NULL + + TypeSig.BINARY + TypeSig.STRUCT), + TypeSig.ARRAY.nested(TypeSig.all)), + ParamCheck("zero", + TypeSig.commonCudfTypes + TypeSig.DECIMAL_128, + TypeSig.all), + ParamCheck("merge", + TypeSig.commonCudfTypes + TypeSig.DECIMAL_128, + TypeSig.all), + ParamCheck("finish", + TypeSig.commonCudfTypes + TypeSig.DECIMAL_128, + TypeSig.all))), + (in, conf, p, r) => new GpuArrayAggregateMeta(in, conf, p, r)), // TODO: fix the signature https://github.com/NVIDIA/spark-rapids/issues/5327 expr[ArraysZip]( "Returns a merged array of structs in which the N-th struct contains" + diff --git a/sql-plugin/src/main/scala/com/nvidia/spark/rapids/higherOrderFunctions.scala b/sql-plugin/src/main/scala/com/nvidia/spark/rapids/higherOrderFunctions.scala index 22e21c3b125..7daea466da6 100644 --- a/sql-plugin/src/main/scala/com/nvidia/spark/rapids/higherOrderFunctions.scala +++ b/sql-plugin/src/main/scala/com/nvidia/spark/rapids/higherOrderFunctions.scala @@ -26,9 +26,9 @@ import com.nvidia.spark.rapids.jni.GpuMapZipWithUtils import com.nvidia.spark.rapids.shims.ShimExpression import org.apache.spark.sql.catalyst.analysis.TypeCoercion -import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeReference, AttributeSeq, Expression, ExprId, NamedExpression} +import org.apache.spark.sql.catalyst.expressions.{Add, And, ArrayAggregate, Attribute, AttributeReference, AttributeSeq, CaseWhen, Cast, Expression, 
ExprId, Greatest, If, LambdaFunction, Least, Literal, Multiply, NamedExpression, NamedLambdaVariable, Or} import org.apache.spark.sql.internal.SQLConf -import org.apache.spark.sql.types.{ArrayType, BooleanType, DataType, MapType, Metadata, StructField, StructType} +import org.apache.spark.sql.types.{ArrayType, BooleanType, ByteType, DataType, Decimal, DecimalType, DoubleType, FloatType, IntegerType, LongType, MapType, Metadata, NumericType, ShortType, StructField, StructType} import org.apache.spark.sql.vectorized.ColumnarBatch /** @@ -213,6 +213,12 @@ trait GpuSimpleHigherOrderFunction extends GpuHigherOrderFunction with GpuBind { } +/** + * Common explode + lambda projection plumbing for higher-order functions over arrays. + * Subclasses choose how to consume the lambda's per-element result by either extending + * GpuArrayElementWiseTransform (one row in -> one row out via transformListColumnView) or + * implementing columnarEval themselves (e.g. segmented reductions like GpuArrayAggregate). + */ trait GpuArrayTransformBase extends GpuSimpleHigherOrderFunction { def isBound: Boolean def boundIntermediate: Seq[GpuExpression] @@ -222,7 +228,7 @@ trait GpuArrayTransformBase extends GpuSimpleHigherOrderFunction { boundIntermediate.map(_.dataType) ++ lambdaFunction.arguments.map(_.dataType) } - private[this] def makeElementProjectBatch( + protected def makeElementProjectBatch( inputBatch: ColumnarBatch, argColumn: GpuColumnVector): ColumnarBatch = { assert(argColumn.getBase.getType.equals(DType.LIST)) @@ -275,6 +281,14 @@ trait GpuArrayTransformBase extends GpuSimpleHigherOrderFunction { } } +} + +/** + * Specialization for HOFs that produce one output row per input row by post-processing the + * lambda's elementwise result. Subclasses implement transformListColumnView and inherit the + * standard columnarEval that drives the explode -> lambda eval -> rewrap chain. 
+ */ +trait GpuArrayElementWiseTransform extends GpuArrayTransformBase { /* * Post-process the column view of the array after applying the function parameter. * @param lambdaTransformedCV the results of the lambda expression running @@ -303,7 +317,7 @@ case class GpuArrayTransform( argument: Expression, function: Expression, isBound: Boolean = false, - boundIntermediate: Seq[GpuExpression] = Seq.empty) extends GpuArrayTransformBase { + boundIntermediate: Seq[GpuExpression] = Seq.empty) extends GpuArrayElementWiseTransform { override def dataType: ArrayType = ArrayType(function.dataType, function.nullable) @@ -326,7 +340,7 @@ case class GpuArrayExists( function: Expression, followThreeValuedLogic: Boolean, isBound: Boolean = false, - boundIntermediate: Seq[GpuExpression] = Seq.empty) extends GpuArrayTransformBase { + boundIntermediate: Seq[GpuExpression] = Seq.empty) extends GpuArrayElementWiseTransform { override def dataType: DataType = BooleanType @@ -424,7 +438,7 @@ case class GpuArrayFilter( argument: Expression, function: Expression, isBound: Boolean = false, - boundIntermediate: Seq[GpuExpression] = Seq.empty) extends GpuArrayTransformBase { + boundIntermediate: Seq[GpuExpression] = Seq.empty) extends GpuArrayElementWiseTransform { override def dataType: DataType = argument.dataType @@ -895,3 +909,623 @@ case class GpuMapFilter(argument: Expression, } } } + + +// Registered segmented reductions used by GpuArrayAggregate. To add a new op: define the +// case object, wire matchBinary to its Catalyst shape, and append it to +// ArrayAggregateDecomposer.allOps. + +sealed trait AggOp { + def name: String + def cudfAgg: cudf.SegmentedReductionAggregation + def nullPolicy: cudf.NullPolicy + /** Identity scalar typed to match `t` so ifElse / binaryOp don't hit width mismatch. 
*/
+  def identityScalar(t: DataType): cudf.Scalar
+  /**
+   * Catalyst-side identity, used by the decomposer to plug into `If`/`CaseWhen` branches
+   * that are bare `acc` (treated as `op(acc, identity)` so the branch can be lifted out).
+   * Must satisfy `op(acc, identityLiteral(t)) == acc` for any acc of type `t`.
+   */
+  def identityLiteral(t: DataType): Literal
+  /**
+   * `reduced OP zero`, typed to outDType, with Spark-matching null propagation. `zero` is
+   * a `BinaryOperable` so callers can pass either a `cudf.Scalar` (when the Spark-side
+   * `zero` is a Literal — saves one full-row column allocation per batch) or a `ColumnView`
+   * (when `zero` references an outer column).
+   */
+  def combineWithZero(
+      reduced: cudf.ColumnVector,
+      zero: cudf.BinaryOperable,
+      outDType: DType): cudf.ColumnVector
+  /** Return (left, right) if the body is this op's Catalyst shape. */
+  def matchBinary(body: Expression): Option[(Expression, Expression)]
+  def supportsType(sparkType: DataType): Boolean
+}
+
+case object SumOp extends AggOp {
+  val name = "SUM"
+  def cudfAgg: cudf.SegmentedReductionAggregation = cudf.SegmentedReductionAggregation.sum()
+  // INCLUDE: Spark iteratively computes `acc + x` and null poisons the accumulator, so
+  // one null element anywhere in the list yields null.
+  val nullPolicy: cudf.NullPolicy = cudf.NullPolicy.INCLUDE
+  def identityScalar(t: DataType): cudf.Scalar = t match {
+    case ByteType => cudf.Scalar.fromByte(0.toByte)
+    case ShortType => cudf.Scalar.fromShort(0.toShort)
+    case IntegerType => cudf.Scalar.fromInt(0)
+    case LongType => cudf.Scalar.fromLong(0L)
+    case FloatType => cudf.Scalar.fromFloat(0.0f)
+    case DoubleType => cudf.Scalar.fromDouble(0.0)
+    case d: DecimalType => GpuScalar.from(0, d)
+    case other => throw new IllegalStateException(s"SUM identity not defined for $other")
+  }
+  // Each arm builds the value at exactly the right Scala type so Spark's Literal
+  // type-compatibility check (validateLiteralValue) doesn't reject e.g. Int into LongType.
+  def identityLiteral(t: DataType): Literal = t match {
+    case ByteType => Literal(0.toByte, ByteType)
+    case ShortType => Literal(0.toShort, ShortType)
+    case IntegerType => Literal(0, IntegerType)
+    case LongType => Literal(0L, LongType)
+    case FloatType => Literal(0.0f, FloatType)
+    case DoubleType => Literal(0.0, DoubleType)
+    case d: DecimalType => Literal(Decimal(0L, d.precision, d.scale), d)
+    case other => throw new IllegalStateException(s"SUM identity literal not defined for $other")
+  }
+  def combineWithZero(r: cudf.ColumnVector, z: cudf.BinaryOperable, out: DType) = r.add(z, out)
+  def matchBinary(e: Expression): Option[(Expression, Expression)] = e match {
+    case a: Add => Some((a.left, a.right))
+    case _ => None
+  }
+  // Float/Double are gated behind `spark.rapids.sql.variableFloatAgg.enabled` (same conf
+  // as scalar GpuSum/GpuAverage) — cuDF's parallel tree-reduction sums in a different
+  // order than Spark's sequential left-fold, so the low-bit answer can differ even though
+  // both are valid IEEE 754 results. The check happens in GpuArrayAggregateMeta.
+  def supportsType(t: DataType): Boolean = t.isInstanceOf[NumericType]
+}
+
+case object ProductOp extends AggOp {
+  val name = "PRODUCT"
+  def cudfAgg: cudf.SegmentedReductionAggregation =
+    cudf.SegmentedReductionAggregation.product()
+  val nullPolicy: cudf.NullPolicy = cudf.NullPolicy.INCLUDE
+  def identityScalar(t: DataType): cudf.Scalar = t match {
+    case ByteType => cudf.Scalar.fromByte(1.toByte)
+    case ShortType => cudf.Scalar.fromShort(1.toShort)
+    case IntegerType => cudf.Scalar.fromInt(1)
+    case LongType => cudf.Scalar.fromLong(1L)
+    case FloatType => cudf.Scalar.fromFloat(1.0f)
+    case DoubleType => cudf.Scalar.fromDouble(1.0)
+    case other => throw new IllegalStateException(s"PRODUCT identity not defined for $other")
+  }
+  def identityLiteral(t: DataType): Literal = t match {
+    case ByteType => Literal(1.toByte, ByteType)
+    case ShortType => Literal(1.toShort, ShortType)
+    case IntegerType => Literal(1, IntegerType)
+    case LongType => Literal(1L, LongType)
+    case FloatType => Literal(1.0f, FloatType)
+    case DoubleType => Literal(1.0, DoubleType)
+    case other => throw new IllegalStateException(
+      s"PRODUCT identity literal not defined for $other")
+  }
+  def combineWithZero(r: cudf.ColumnVector, z: cudf.BinaryOperable, out: DType) = r.mul(z, out)
+  def matchBinary(e: Expression): Option[(Expression, Expression)] = e match {
+    case m: Multiply => Some((m.left, m.right))
+    case _ => None
+  }
+  // Float/Double gated by variableFloatAgg.enabled (see SumOp). Decimal would also need
+  // DecimalUtils.multiplyDecimals for overflow handling — out of scope, so PRODUCT
+  // excludes Decimal entirely.
+  def supportsType(t: DataType): Boolean = t match {
+    case _: NumericType => !t.isInstanceOf[DecimalType]
+    case _ => false
+  }
+}
+
+/**
+ * MaxOp / MinOp share EXCLUDE null policy: Spark's Greatest / Least skip null operands.
+ * combineWithZero uses cuDF's NULL_MAX / NULL_MIN (the same primitive GpuGreatest/GpuLeast + * use), which returns the non-null operand when one side is null — exactly Spark's + * behavior on integral types. + * + * Float / Double are unsupported: cuDF's segmented `max` / `min` follow IEEE 754, where + * `fmax(NaN, x) = x` (NaN is absorbed). Spark's `Greatest` / `Least` use `Double.compare`, + * which treats NaN as larger than every other value and propagates it. Until we add an + * explicit NaN-propagation step, restrict to integral types. + */ +sealed trait ExtremumOp extends AggOp { + val nullPolicy: cudf.NullPolicy = cudf.NullPolicy.EXCLUDE + def binaryOp: cudf.BinaryOp + def combineWithZero(r: cudf.ColumnVector, z: cudf.BinaryOperable, out: DType) + : cudf.ColumnVector = r.binaryOp(binaryOp, z, out) + def supportsType(t: DataType): Boolean = t match { + case ByteType | ShortType | IntegerType | LongType => true + case _ => false + } +} + +case object MaxOp extends ExtremumOp { + val name = "MAX" + def cudfAgg: cudf.SegmentedReductionAggregation = cudf.SegmentedReductionAggregation.max() + val binaryOp: cudf.BinaryOp = cudf.BinaryOp.NULL_MAX + def identityScalar(t: DataType): cudf.Scalar = t match { + case ByteType => cudf.Scalar.fromByte(Byte.MinValue) + case ShortType => cudf.Scalar.fromShort(Short.MinValue) + case IntegerType => cudf.Scalar.fromInt(Int.MinValue) + case LongType => cudf.Scalar.fromLong(Long.MinValue) + case other => throw new IllegalStateException(s"MAX identity not defined for $other") + } + def identityLiteral(t: DataType): Literal = t match { + case ByteType => Literal(Byte.MinValue, ByteType) + case ShortType => Literal(Short.MinValue, ShortType) + case IntegerType => Literal(Int.MinValue, IntegerType) + case LongType => Literal(Long.MinValue, LongType) + case other => throw new IllegalStateException(s"MAX identity literal not defined for $other") + } + def matchBinary(e: Expression): Option[(Expression, Expression)] = e match { + case 
g: Greatest if g.children.size == 2 => Some((g.children.head, g.children(1))) + case _ => None + } +} + +case object MinOp extends ExtremumOp { + val name = "MIN" + def cudfAgg: cudf.SegmentedReductionAggregation = cudf.SegmentedReductionAggregation.min() + val binaryOp: cudf.BinaryOp = cudf.BinaryOp.NULL_MIN + def identityScalar(t: DataType): cudf.Scalar = t match { + case ByteType => cudf.Scalar.fromByte(Byte.MaxValue) + case ShortType => cudf.Scalar.fromShort(Short.MaxValue) + case IntegerType => cudf.Scalar.fromInt(Int.MaxValue) + case LongType => cudf.Scalar.fromLong(Long.MaxValue) + case other => throw new IllegalStateException(s"MIN identity not defined for $other") + } + def identityLiteral(t: DataType): Literal = t match { + case ByteType => Literal(Byte.MaxValue, ByteType) + case ShortType => Literal(Short.MaxValue, ShortType) + case IntegerType => Literal(Int.MaxValue, IntegerType) + case LongType => Literal(Long.MaxValue, LongType) + case other => throw new IllegalStateException(s"MIN identity literal not defined for $other") + } + def matchBinary(e: Expression): Option[(Expression, Expression)] = e match { + case l: Least if l.children.size == 2 => Some((l.children.head, l.children(1))) + case _ => None + } +} + +case object AllOp extends AggOp { + val name = "ALL" + def cudfAgg: cudf.SegmentedReductionAggregation = cudf.SegmentedReductionAggregation.all() + // INCLUDE: any null element nulls the reduce; decompose rejects arrays that may contain nulls.
+ val nullPolicy: cudf.NullPolicy = cudf.NullPolicy.INCLUDE + def identityScalar(t: DataType): cudf.Scalar = cudf.Scalar.fromBool(true) + def identityLiteral(t: DataType): Literal = Literal(true, BooleanType) + def combineWithZero(r: cudf.ColumnVector, z: cudf.BinaryOperable, out: DType) = r.and(z, out) + def matchBinary(e: Expression): Option[(Expression, Expression)] = e match { + case a: And => Some((a.left, a.right)) + case _ => None + } + def supportsType(t: DataType): Boolean = t == BooleanType +} + +case object AnyOp extends AggOp { + val name = "ANY" + def cudfAgg: cudf.SegmentedReductionAggregation = cudf.SegmentedReductionAggregation.any() + val nullPolicy: cudf.NullPolicy = cudf.NullPolicy.INCLUDE + def identityScalar(t: DataType): cudf.Scalar = cudf.Scalar.fromBool(false) + def identityLiteral(t: DataType): Literal = Literal(false, BooleanType) + def combineWithZero(r: cudf.ColumnVector, z: cudf.BinaryOperable, out: DType) = r.or(z, out) + def matchBinary(e: Expression): Option[(Expression, Expression)] = e match { + case o: Or => Some((o.left, o.right)) + case _ => None + } + def supportsType(t: DataType): Boolean = t == BooleanType +} + + +/** + * Result of successfully matching a Spark ArrayAggregate's merge lambda against a + * registered AggOp. + * + * @param op the chosen aggregation operator + * @param g the lifted Catalyst sub-expression for `g(x)`. 
For a plain + * `op(acc, g(x))` body this is the body's non-acc child; for an + * `If` / `CaseWhen` body it is rebuilt with bare-acc branches + * replaced by `op.identityLiteral` so it never references acc + * @param accVarExprId the accumulator NamedLambdaVariable's exprId + * @param elemVar the element NamedLambdaVariable (used to build the g lambda) + */ +case class ArrayAggregateDecomposition( + op: AggOp, + g: Expression, + accVarExprId: ExprId, + elemVar: NamedLambdaVariable) + + +/** + * Decomposes a Spark ArrayAggregate's merge lambda of shape `(acc, x) -> op(acc, g(x))` + * where `op` is one of the registered AggOps and the finish lambda is identity. + * + * decompose owns every reason ArrayAggregate cannot run on the GPU — shape, type, and + * nullability — so the meta layer is just a single Either match. + */ +object ArrayAggregateDecomposer { + + /** All ops the decomposer will try, in order. */ + val allOps: Seq[AggOp] = Seq(SumOp, ProductOp, MaxOp, MinOp, AllOp, AnyOp) + + def decompose( + merge: Expression, + finish: Expression, + argType: DataType, + zeroType: DataType): Either[String, ArrayAggregateDecomposition] = { + val mergeLambda = merge match { + case lf: LambdaFunction => lf + case _ => return Left("merge expression is not a LambdaFunction") + } + val (accVar, elemVar) = mergeLambda.arguments match { + case Seq(a: NamedLambdaVariable, e: NamedLambdaVariable) => (a, e) + case _ => return Left("merge lambda must take exactly 2 NamedLambdaVariable arguments") + } + if (!isFinishIdentity(finish)) { + return Left("finish lambda is not an identity (only `acc -> acc` is supported)") + } + + val body = mergeLambda.function + val accId = accVar.exprId + val matched = allOps.view.flatMap { op => + extractG(body, accId, op, zeroType).map { case (g, gIsLeft) => (op, g, gIsLeft) } + }.headOption + + val (op, g, _) = matched.getOrElse { + return Left("merge body does not match (acc, x) -> op(acc, g(x)) for any registered " + + "op (" + 
allOps.map(_.name).mkString(", ") + "); If / CaseWhen branches must each " + + "be op-of-acc with acc on a consistent side") + } + + if (!op.supportsType(zeroType)) { + return Left(s"${op.name} is not supported on GPU for type $zeroType") + } + // g's output type must equal the accumulator/zero type so the segmented reduce output + // matches the Spark-expected result type directly. + if (!DataType.equalsStructurally(g.dataType, zeroType, ignoreNullability = true)) { + return Left(s"g(x) output type (${g.dataType}) does not match accumulator/zero type " + + s"($zeroType)") + } + // cuDF's segmented ALL/ANY with INCLUDE nulls doesn't match Spark's AND/OR 3VL: cuDF + // returns null whenever any null is present, so it misses both `false AND null = false` + // and `true OR null = true`. Fall back to CPU when the input array can contain + // nulls. + if (op == AllOp || op == AnyOp) { + argType match { + case ArrayType(_, true) => + return Left(s"${op.name} is not supported on GPU for arrays that may contain " + + "nulls; cuDF's INCLUDE-nulls semantics don't match Spark's AND/OR 3VL") + case _ => + } + } + + Right(ArrayAggregateDecomposition(op, g, accId, elemVar)) + } + + /** + * Try to extract g from the merge body, given a candidate op. Returns + * `Some((g, gIsLeft))` on success — `gIsLeft` is internal bookkeeping used by + * `alignSides` to ensure all branches of an If/CaseWhen agree on which side acc lives; + * it is not exposed in the final Decomposition (the lifted g already encodes the + * answer structurally). + * + * Three accepted shapes: + * 1. body is `op(acc, g)` or `op(g, acc)` — direct match. + * 2. body is `If(cond, t, f)` where each of `t`, `f` is itself accepted by this + * function (recursively) with `acc` on the *same* side, and `cond` doesn't + * reference `acc`. Lifted to `op(acc, If(cond, g_t, g_f))` (or the symmetric form + * with acc on the right). + * 3.
body is `CaseWhen(branches, Some(else))` — generalised If for N branches. + * + * Bare `acc` in a branch is treated as `op(acc, identityLiteral)` — that's how we + * support the `If(cond, acc + 1, acc)` form: the else branch is bare acc, replaced + * with `acc + 0`, then the whole If lifts out as `acc + If(cond, 1, 0)`. + */ + private def extractG( + body: Expression, + accId: ExprId, + op: AggOp, + accType: DataType): Option[(Expression, Boolean)] = { + matchOpOfAcc(body, accId, op).orElse(extractFromBranching(body, accId, op, accType)) + } + + /** body matches op directly: returns `(g, gIsLeft)` if body is `op(acc, g)` / `op(g, acc)`. */ + private def matchOpOfAcc( + e: Expression, + accId: ExprId, + op: AggOp): Option[(Expression, Boolean)] = { + op.matchBinary(e).flatMap { case (l, r) => + if (isAccRef(l, accId) && !containsAccRef(r, accId)) Some((r, false)) + else if (isAccRef(r, accId) && !containsAccRef(l, accId)) Some((l, true)) + else None + } + } + + /** + * Decompose a single If/CaseWhen branch. Either it's an op-of-acc form (returns the + * non-acc side and whether g was on the left), or it's a bare acc-ref (returns the + * op's identity literal, gIsLeft=false — the placeholder side doesn't matter, we only + * need branches to agree on it later). + * + * Recursively delegates to `extractG` so nested If is handled.
+ */ + private def extractBranch( + branch: Expression, + accId: ExprId, + op: AggOp, + accType: DataType): Option[(Expression, Boolean)] = { + if (isAccRef(branch, accId)) { + Some((op.identityLiteral(accType), /* gIsLeft = */ false)) + } else { + extractG(branch, accId, op, accType) + } + } + + private def extractFromBranching( + body: Expression, + accId: ExprId, + op: AggOp, + accType: DataType): Option[(Expression, Boolean)] = body match { + case If(cond, t, f) if !containsAccRef(cond, accId) => + for { + (tG, tIsLeft) <- extractBranch(t, accId, op, accType) + (fG, fIsLeft) <- extractBranch(f, accId, op, accType) + // The "bare acc" case picks gIsLeft=false. If a branch is bare acc, accept either + // side from the other branch — we'll just rebuild to that side. + gIsLeft <- alignSides(t, f, tIsLeft, fIsLeft, accId) + } yield (If(cond, tG, fG), gIsLeft) + + case CaseWhen(branches, Some(elseValue)) + if branches.forall { case (c, _) => !containsAccRef(c, accId) } => + // Decompose every (cond, val) branch + the else branch. All op-of-acc branches must + // agree on which side acc is on; bare-acc branches don't constrain. + val branchDecs = branches.map { case (c, v) => (c, extractBranch(v, accId, op, accType)) } + val elseDec = extractBranch(elseValue, accId, op, accType) + if (branchDecs.exists(_._2.isEmpty) || elseDec.isEmpty) { + None + } else { + val allBranchExprs: Seq[Expression] = branches.map(_._2) :+ elseValue + val allSides: Seq[Boolean] = branchDecs.map(_._2.get._2) :+ elseDec.get._2 + val constrainedSides = allBranchExprs.zip(allSides).collect { + case (br, side) if !isAccRef(br, accId) => side + } + if (constrainedSides.distinct.size > 1) { + None + } else { + val gIsLeft = constrainedSides.headOption.getOrElse(false) + val gBranches = branchDecs.map { case (c, dec) => (c, dec.get._1) } + Some((CaseWhen(gBranches, Some(elseDec.get._1)), gIsLeft)) + } + } + + case _ => None + } + + /** + * Reconcile the gIsLeft flags from two If branches. 
Bare-acc branches don't constrain + * (their identity placeholder is symmetric), so this is `agree if both constrained, + * else borrow from the constrained one`. + */ + private def alignSides( + tBranch: Expression, + fBranch: Expression, + tIsLeft: Boolean, + fIsLeft: Boolean, + accId: ExprId): Option[Boolean] = { + val tBare = isAccRef(tBranch, accId) + val fBare = isAccRef(fBranch, accId) + (tBare, fBare) match { + case (true, true) => Some(false) // both bare acc — fold has no actual op to apply + case (true, false) => Some(fIsLeft) + case (false, true) => Some(tIsLeft) + case (false, false) => if (tIsLeft == fIsLeft) Some(tIsLeft) else None + } + } + + private def isFinishIdentity(finish: Expression): Boolean = finish match { + case LambdaFunction(body, Seq(accVar: NamedLambdaVariable), _) => + isAccRef(body, accVar.exprId) + case _ => false + } + + private def isAccRef(e: Expression, id: ExprId): Boolean = e match { + case v: NamedLambdaVariable => v.exprId == id + case c: Cast => isAccRef(c.child, id) + case _ => false + } + + private def containsAccRef(e: Expression, id: ExprId): Boolean = e.exists { + case v: NamedLambdaVariable if v.exprId == id => true + case _ => false + } +} + + +/** + * GPU implementation of ArrayAggregate for lambdas decomposable via ArrayAggregateDecomposer. + * Runtime steps: + * 1. Evaluate g(x) over the array children (reusing GpuArrayTransformBase's explode path). + * 2. Rewrap as list with the original offsets and validity. + * 3. cuDF segmented reduce with the op's null policy. + * 4. Substitute op's identity into rows where reduce returned null due to "no elements + * contributed" (the exact condition depends on null policy; see `substituteMask`). + * 5. Combine with zero: `result = reduced OP zero`. 
+ */ +case class GpuArrayAggregate( + argument: Expression, + zero: Expression, + function: Expression, + op: AggOp, + isBound: Boolean = false, + boundIntermediate: Seq[GpuExpression] = Seq.empty) extends GpuArrayTransformBase { + + override def dataType: DataType = zero.dataType + + // Matches Spark's ArrayAggregate.nullable = argument.nullable || finish.nullable. The + // finish lambda's accumulator variable is bound with nullable=true (Spark's + // ArrayAggregate.bind uses `zero.dataType -> true` for the acc slot), so the CPU side + // is effectively always true. Also covers the INCLUDE-policy case where a null element + // in a non-null list poisons the reduce and yields a null output row. + override def nullable: Boolean = true + + override def prettyName: String = "array_aggregate" + + override def children: Seq[Expression] = argument :: zero :: function :: Nil + + override def bind(input: AttributeSeq): GpuExpression = { + val (boundFunc, boundArg, boundInter) = bindLambdaFunc(input) + val boundZero = GpuBindReferences.bindGpuReferenceNoMetrics(zero, input) + GpuArrayAggregate(boundArg, boundZero, boundFunc, op, isBound = true, boundInter) + } + + /** + * Mask of rows where the reduce result must be replaced with the op's identity. + * + * INCLUDE ops (SUM/PRODUCT/ALL/ANY): only empty-and-not-null lists. Null-poisoned + * reduces stay null and propagate through the combine step, matching Spark's iterative + * `acc op null = null` semantics. + * + * EXCLUDE ops (MAX/MIN): any reduce-null over a non-null list — covers both empty lists + * and all-null lists, matching Spark's Greatest/Least which skip nulls entirely. + */ + private def substituteMask( + listCol: cudf.ColumnView, + reduced: cudf.ColumnVector): cudf.ColumnVector = { + val reducedIsEmpty = op.nullPolicy match { + case cudf.NullPolicy.INCLUDE => + // Empty-and-not-null only. Null-poisoned reduces stay null to match Spark's + // iterative `acc op null = null` semantics. 
+ withResource(listCol.countElements()) { counts => + withResource(cudf.Scalar.fromInt(0)) { zero => + counts.equalTo(zero) + } + } + case cudf.NullPolicy.EXCLUDE => + // Any reduce-null: empty list OR all-null list (both mean "no element contributed"), + // matching Spark's Greatest/Least which skip nulls. + reduced.isNull + } + // Exclude null-list rows from the mask so the final null-restoration step handles them. + // Skip when the input list has no nulls — `m.and(all-true)` is a wasted kernel. + if (listCol.getNullCount > 0) { + withResource(reducedIsEmpty) { m => + withResource(listCol.isNotNull) { isNotNull => m.and(isNotNull) } + } + } else { + reducedIsEmpty + } + } + + override def columnarEval(batch: ColumnarBatch): GpuColumnVector = { + val outDType = GpuColumnVector.getNonNestedRapidsType(dataType) + withResource(argument.asInstanceOf[GpuExpression].columnarEval(batch)) { arg => + // Step 1: g(x) over children + segmented reduce. + val reduced: cudf.ColumnVector = + withResource(makeElementProjectBatch(batch, arg)) { cb => + withResource(function.asInstanceOf[GpuExpression].columnarEval(cb)) { + transformedData => + withResource(GpuListUtils.replaceListDataColumnAsView( + arg.getBase, transformedData.getBase)) { listOfGView => + listOfGView.listReduce(op.cudfAgg, op.nullPolicy, outDType) + } + } + } + + // Step 2: substitute op's identity for rows the reduce couldn't cover. + val adjusted: cudf.ColumnVector = withResource(reduced) { reduced => + withResource(substituteMask(arg.getBase, reduced)) { mask => + withResource(op.identityScalar(dataType)) { idScalar => + mask.ifElse(idScalar, reduced) + } + } + } + + // Step 3: combine with zero. When `zero` is a Literal (the common 4-arg + // `aggregate(arr, 0, ...)` shape) skip the per-batch column broadcast and pass a + // cudf.Scalar instead — `add/mul/and/or/binaryOp` all accept BinaryOperable. 
+ val combined: cudf.ColumnVector = withResource(adjusted) { adjusted => + zero match { + case lit: GpuLiteral => + withResource(GpuScalar.from(lit.value, lit.dataType)) { zeroScalar => + op.combineWithZero(adjusted, zeroScalar, outDType) + } + case _ => + withResource(zero.asInstanceOf[GpuExpression].columnarEval(batch)) { zeroCv => + op.combineWithZero(adjusted, zeroCv.getBase, outDType) + } + } + } + + // Step 4: restore null on rows where the input list itself was null. cuDF NULL_MAX / + // NULL_MIN / LOGICAL_AND / LOGICAL_OR don't propagate null the way Spark's 3VL would, + // so the combine step alone can't preserve it. Skip outright when the list has no nulls. + if (arg.getBase.getNullCount > 0) { + withResource(combined) { combined => + GpuColumnVector.from(NullUtilities.mergeNulls(combined, arg.getBase), dataType) + } + } else { + GpuColumnVector.from(combined, dataType) + } + } + } +} + + +/** + * Expression-level meta for Spark's ArrayAggregate. Accepts lambdas that + * ArrayAggregateDecomposer can decompose into one of the registered AggOps with an + * identity finish; falls back to CPU otherwise. + */ +class GpuArrayAggregateMeta( + expr: ArrayAggregate, + conf: RapidsConf, + parent: Option[RapidsMeta[_, _, _]], + rule: DataFromReplacementRule) + extends ExprMeta[ArrayAggregate](expr, conf, parent, rule) { + + private var decomposition: Option[ArrayAggregateDecomposition] = None + + override def tagExprForGpu(): Unit = { + ArrayAggregateDecomposer.decompose( + expr.merge, expr.finish, expr.argument.dataType, expr.zero.dataType) match { + case Left(reason) => willNotWorkOnGpu(reason) + case Right(d) => + // SUM/PRODUCT on Float/Double diverge between cuDF's parallel tree-reduction + // and Spark's sequential left-fold. Same conf gate as GpuSum / GpuAverage — + // willNotWorkOnGpu when variableFloatAgg.enabled=false. 
+ if (d.op == SumOp || d.op == ProductOp) { + GpuOverrides.checkAndTagFloatAgg(expr.zero.dataType, this.conf, this) + } + decomposition = Some(d) + } + } + + override def convertToGpuImpl(): GpuExpression = { + val d = decomposition.getOrElse( + throw new IllegalStateException("tagExprForGpu must succeed before convertToGpu")) + + val argGpu = childExprs.head.convertToGpu() + val zeroGpu = childExprs(1).convertToGpu() + // The lifted g may have a different shape from any sub-tree of the original merge body + // (If/CaseWhen branches get rewritten and identity literals get inserted), so we can't + // pick it out of childExprs(2)'s meta tree by index. Wrap g as a fresh ExprMeta and let + // spark-rapids tag/convert it. Sub-expressions inherited from the original body get + // re-tagged here, but tag is idempotent and they were already proven GPU-compatible + // when the parent ArrayAggregate was tagged. + val gMeta = GpuOverrides.wrapExpr(d.g, this.conf, Some(this)) + gMeta.tagForGpu() + if (!gMeta.canThisBeReplaced) { + throw new IllegalStateException( + s"could not convert g sub-expression ${d.g} to GPU: ${gMeta.explain(all = false)}") + } + val gGpu = gMeta.convertToGpu() + val elemVarGpu = GpuNamedLambdaVariable( + d.elemVar.name, d.elemVar.dataType, d.elemVar.nullable, d.elemVar.exprId) + val gLambda = GpuLambdaFunction(gGpu, Seq(elemVarGpu)) + + GpuArrayAggregate(argGpu, zeroGpu, gLambda, d.op) + } +} diff --git a/tests/src/test/scala/com/nvidia/spark/rapids/ArrayAggregateDecomposerSuite.scala b/tests/src/test/scala/com/nvidia/spark/rapids/ArrayAggregateDecomposerSuite.scala new file mode 100644 index 00000000000..69032fda86f --- /dev/null +++ b/tests/src/test/scala/com/nvidia/spark/rapids/ArrayAggregateDecomposerSuite.scala @@ -0,0 +1,314 @@ +/* + * Copyright (c) 2026, NVIDIA CORPORATION. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. 
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package com.nvidia.spark.rapids + +import org.apache.spark.sql.catalyst.expressions.{Add, And, CaseWhen, Cast, Divide, EqualTo, + Expression, GreaterThan, Greatest, If, LambdaFunction, Least, Literal, Multiply, + NamedExpression, NamedLambdaVariable, Or, Subtract} +import org.apache.spark.sql.types.{ArrayType, BooleanType, DataType, DoubleType, IntegerType, + LongType} + +// Extends GpuUnitTests so SQLConf.get is available for the default evalMode / failOnError +// parameter on Add/Subtract/Multiply/Divide (the field name differs across Spark versions; +// letting Spark apply its own default keeps this test shim-agnostic). +class ArrayAggregateDecomposerSuite extends GpuUnitTests { + import ArrayAggregateDecomposer.decompose + + private def lv(name: String, dt: DataType = IntegerType): NamedLambdaVariable = + NamedLambdaVariable(name, dt, nullable = true, exprId = NamedExpression.newExprId) + + private def merge( + body: Expression, + acc: NamedLambdaVariable, + x: NamedLambdaVariable): LambdaFunction = + LambdaFunction(body, Seq(acc, x)) + + private def identityFinish(acc: NamedLambdaVariable): LambdaFunction = + LambdaFunction(acc, Seq(acc)) + + /** Wrap zeroType in an ArrayType(_, containsNull = false) for the typical happy path. 
*/ + private def arrTy(zeroType: DataType): ArrayType = ArrayType(zeroType, containsNull = false) + + private def assertDecomposes( + body: Expression, + acc: NamedLambdaVariable, + x: NamedLambdaVariable, + expectedOp: AggOp, + expectedG: Option[Expression] = None, + zeroType: DataType = IntegerType, + argType: Option[DataType] = None): ArrayAggregateDecomposition = { + val d = decompose(merge(body, acc, x), identityFinish(acc), + argType.getOrElse(arrTy(zeroType)), zeroType) + val r = d.getOrElse(fail(s"expected decomposition for body=$body, got Left: $d")) + assert(r.op == expectedOp) + expectedG.foreach { g => + assert(r.g.fastEquals(g), s"expected g=$g, got ${r.g}") + } + assert(r.accVarExprId == acc.exprId) + assert(r.elemVar.exprId == x.exprId) + r + } + + private def assertRejects( + mergeBody: LambdaFunction, + finish: Expression, + reason: String, + zeroType: DataType = IntegerType, + argType: Option[DataType] = None): String = { + val d = decompose(mergeBody, finish, argType.getOrElse(arrTy(zeroType)), zeroType) + assert(d.isLeft, s"$reason — expected Left but got: $d") + d.swap.getOrElse(fail("unreachable")) + } + + test("Add(acc, x) -> SUM, g on the right") { + val acc = lv("acc"); val x = lv("x") + assertDecomposes(Add(acc, x), acc, x, SumOp) + } + + test("Add(x, acc) (commuted) -> SUM, g on the left") { + val acc = lv("acc"); val x = lv("x") + assertDecomposes(Add(x, acc), acc, x, SumOp) + } + + test("Multiply(acc, x) -> PRODUCT") { + val acc = lv("acc"); val x = lv("x") + assertDecomposes(Multiply(acc, x), acc, x, ProductOp) + } + + test("Greatest(acc, x) -> MAX") { + val acc = lv("acc"); val x = lv("x") + assertDecomposes(Greatest(Seq(acc, x)), acc, x, MaxOp) + } + + test("Least(acc, x) -> MIN") { + val acc = lv("acc"); val x = lv("x") + assertDecomposes(Least(Seq(acc, x)), acc, x, MinOp) + } + + test("And(acc, x) -> ALL") { + val acc = lv("acc", BooleanType); val x = lv("x", BooleanType) + assertDecomposes(And(acc, x), acc, x, AllOp, + zeroType = 
BooleanType) + } + + test("Or(acc, x) -> ANY") { + val acc = lv("acc", BooleanType); val x = lv("x", BooleanType) + assertDecomposes(Or(acc, x), acc, x, AnyOp, + zeroType = BooleanType) + } + + test("Complex g(x) with no acc ref decomposes (g on the right)") { + val acc = lv("acc", LongType); val x = lv("x", IntegerType) + val g = Cast(Add(Multiply(x, Literal(2)), Literal(1)), LongType) + assertDecomposes(Add(acc, g), acc, x, SumOp, zeroType = LongType) + } + + test("Cast wrapping the acc side is unwrapped (single layer)") { + val acc = lv("acc", LongType); val x = lv("x", IntegerType) + assertDecomposes(Add(Cast(acc, IntegerType), x), acc, x, SumOp) + } + + test("Cast wrapping the acc side is unwrapped (chained)") { + val acc = lv("acc"); val x = lv("x") + val doubleCastAcc = Cast(Cast(acc, LongType), IntegerType) + assertDecomposes(Add(doubleCastAcc, x), acc, x, SumOp) + } + + test("Subtract is not an associative op we recognize") { + val acc = lv("acc"); val x = lv("x") + assertRejects(merge(Subtract(acc, x), acc, x), identityFinish(acc), + "Subtract is not in the registered AggOps") + } + + test("Divide is not an associative op we recognize") { + val acc = lv("acc"); val x = lv("x") + assertRejects(merge(Divide(acc, x), acc, x), identityFinish(acc), + "Divide is not in the registered AggOps") + } + + test("Greatest with arity != 2 is not decomposed") { + val acc = lv("acc"); val x = lv("x") + val body = Greatest(Seq(acc, x, Literal(1))) + assertRejects(merge(body, acc, x), identityFinish(acc), + "Greatest with 3 children is not a 2-operand op") + } + + test("g that references acc is rejected") { + val acc = lv("acc"); val x = lv("x") + assertRejects(merge(Add(acc, Multiply(acc, x)), acc, x), identityFinish(acc), + "g must not reference acc") + } + + test("both sides reference acc is rejected") { + val acc = lv("acc"); val x = lv("x") + assertRejects(merge(Add(acc, acc), acc, x), identityFinish(acc), + "neither side is a 'pure non-acc'") + } + + test("neither 
side is a pure acc ref is rejected") { + val acc = lv("acc"); val x = lv("x") + assertRejects(merge(Add(Add(acc, Literal(1)), x), acc, x), identityFinish(acc), + "left side isn't a naked acc ref") + } + + test("non-identity finish is rejected") { + val acc = lv("acc"); val x = lv("x") + val finishAcc = lv("finishAcc") + val badFinish = LambdaFunction(Add(finishAcc, Literal(1)), Seq(finishAcc)) + assertRejects(merge(Add(acc, x), acc, x), badFinish, + "finish that adds to the accumulator isn't identity") + } + + test("finish referencing a different variable id is rejected") { + val acc = lv("acc"); val x = lv("x") + val finishAcc = lv("finishAcc") + val otherVar = lv("other") + val badFinish = LambdaFunction(otherVar, Seq(finishAcc)) + assertRejects(merge(Add(acc, x), acc, x), badFinish, + "finish body refers to a variable that isn't its own arg") + } + + test("merge with wrong arg count is rejected") { + val acc = lv("acc"); val x = lv("x"); val extra = lv("extra") + val body = LambdaFunction(Add(acc, x), Seq(acc, x, extra)) + assertRejects(body, identityFinish(acc), "merge must take 2 lambda args") + } + + test("merge that isn't a LambdaFunction at all is rejected") { + val acc = lv("acc") + assert(decompose(Add(Literal(1), Literal(2)), identityFinish(acc), + arrTy(IntegerType), IntegerType).isLeft) + } + + test("finish that isn't a LambdaFunction is rejected") { + val acc = lv("acc"); val x = lv("x") + assert(decompose(merge(Add(acc, x), acc, x), Literal(0), + arrTy(IntegerType), IntegerType).isLeft) + } + + // The decomposer now owns the "is this shape ever GPU-able" decision, so it must also + // reject unsupported types and AllOp/AnyOp on null-bearing arrays.
+ + test("MaxOp on Double is rejected (NaN propagation differs from cuDF)") { + val acc = lv("acc", DoubleType); val x = lv("x", DoubleType) + val msg = assertRejects(merge(Greatest(Seq(acc, x)), acc, x), identityFinish(acc), + "MAX should fall back on Double", + zeroType = DoubleType) + assert(msg.contains("MAX"), s"expected MAX-related error, got: $msg") + } + + test("ALL on array with containsNull rejects") { + val acc = lv("acc", BooleanType); val x = lv("x", BooleanType) + val msg = assertRejects(merge(And(acc, x), acc, x), identityFinish(acc), + "ALL on null-bearing array should fall back", + zeroType = BooleanType, + argType = Some(ArrayType(BooleanType, containsNull = true))) + assert(msg.contains("ALL"), s"expected ALL-related error, got: $msg") + } + + test("g type mismatch with zero type rejects") { + val acc = lv("acc", LongType); val x = lv("x", IntegerType) + // body sums a non-cast Int element into a Long acc — g.dataType=Int doesn't match + // zeroType=Long, so this must fall back even though the shape is otherwise OK. + val msg = assertRejects(merge(Add(acc, x), acc, x), identityFinish(acc), + "g type mismatch should fall back", + zeroType = LongType) + assert(msg.contains("does not match"), s"expected type-mismatch error, got: $msg") + } + + // If / CaseWhen normalize: branches that are op-of-acc (or bare acc treated as + // op(acc, identity)) get lifted out so cond-driven count-if patterns run on the GPU. 
+
+  test("If(cond, acc + t, acc) decomposes to SUM (revans's pattern)") {
+    val acc = lv("acc"); val x = lv("x")
+    val body = If(EqualTo(x, Literal(7)), Add(acc, Literal(1)), acc)
+    assertDecomposes(body, acc, x, SumOp)
+  }
+
+  test("If(cond, acc, acc + t) decomposes to SUM (commuted branches)") {
+    val acc = lv("acc"); val x = lv("x")
+    val body = If(EqualTo(x, Literal(7)), acc, Add(acc, Literal(1)))
+    assertDecomposes(body, acc, x, SumOp)
+  }
+
+  test("If(cond, acc + t1, acc + t2) — both branches op-of-acc — decomposes") {
+    val acc = lv("acc"); val x = lv("x")
+    val body = If(GreaterThan(x, Literal(0)), Add(acc, x), Add(acc, Literal(0)))
+    assertDecomposes(body, acc, x, SumOp)
+  }
+
+  test("If with MAX (greatest(acc, x)) on one branch and bare acc on the other") {
+    val acc = lv("acc"); val x = lv("x")
+    val body = If(GreaterThan(x, Literal(0)), Greatest(Seq(acc, x)), acc)
+    assertDecomposes(body, acc, x, MaxOp)
+  }
+
+  test("If with And on boolean acc decomposes to ALL") {
+    val acc = lv("acc", BooleanType); val x = lv("x", BooleanType)
+    val body = If(EqualTo(x, Literal(true)), And(acc, x), acc)
+    assertDecomposes(body, acc, x, AllOp,
+      zeroType = BooleanType)
+  }
+
+  test("CaseWhen with multiple acc+t branches and acc else decomposes") {
+    val acc = lv("acc"); val x = lv("x")
+    val body = CaseWhen(
+      Seq(
+        (EqualTo(x, Literal(1)), Add(acc, Literal(10))),
+        (EqualTo(x, Literal(2)), Add(acc, Literal(20)))),
+      Some(acc))
+    assertDecomposes(body, acc, x, SumOp)
+  }
+
+  test("If condition references acc — rejected (g must not depend on acc)") {
+    val acc = lv("acc"); val x = lv("x")
+    val body = If(GreaterThan(acc, Literal(100)), Add(acc, Literal(1)), acc)
+    assertRejects(merge(body, acc, x), identityFinish(acc),
+      "cond referencing acc breaks per-element parallelism")
+  }
+
+  test("If branches use different ops — rejected") {
+    val acc = lv("acc"); val x = lv("x")
+    val body = If(GreaterThan(x, Literal(0)), Add(acc, Literal(1)), Multiply(acc, Literal(2)))
+    assertRejects(merge(body, acc, x), identityFinish(acc),
+      "branches mixing Add and Multiply have no single op to lift")
+  }
+
+  test("If branches put acc on different sides — rejected") {
+    val acc = lv("acc"); val x = lv("x")
+    val body = If(GreaterThan(x, Literal(0)), Add(acc, x), Add(x, acc))
+    assertRejects(merge(body, acc, x), identityFinish(acc),
+      "branches with acc on different sides can't share a single lifted form")
+  }
+
+  test("CaseWhen without else — rejected") {
+    val acc = lv("acc"); val x = lv("x")
+    val body = CaseWhen(
+      Seq((EqualTo(x, Literal(1)), Add(acc, Literal(10)))),
+      None)
+    assertRejects(merge(body, acc, x), identityFinish(acc),
+      "CaseWhen with no else has implicit null fallthrough we don't model")
+  }
+
+  test("Nested If is decomposed recursively") {
+    val acc = lv("acc"); val x = lv("x")
+    // if(c1, if(c2, acc + 1, acc + 2), acc)
+    val inner = If(GreaterThan(x, Literal(10)), Add(acc, Literal(1)), Add(acc, Literal(2)))
+    val outer = If(GreaterThan(x, Literal(0)), inner, acc)
+    assertDecomposes(outer, acc, x, SumOp)
+  }
+}
diff --git a/tools/generated_files/330/operatorsScore.csv b/tools/generated_files/330/operatorsScore.csv
index 0708fa821bc..b3794066a01 100644
--- a/tools/generated_files/330/operatorsScore.csv
+++ b/tools/generated_files/330/operatorsScore.csv
@@ -50,6 +50,7 @@ AggregateExpression,4
 Alias,4
 And,4
 ApproximatePercentile,4
+ArrayAggregate,4
 ArrayContains,4
 ArrayDistinct,4
 ArrayExcept,4
diff --git a/tools/generated_files/330/supportedExprs.csv b/tools/generated_files/330/supportedExprs.csv
index 77b62bcecd8..1aae119c31c 100644
--- a/tools/generated_files/330/supportedExprs.csv
+++ b/tools/generated_files/330/supportedExprs.csv
@@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N
 And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA
 And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA
And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/331/operatorsScore.csv b/tools/generated_files/331/operatorsScore.csv index a8f66bbf1e2..d2abe472782 100644 --- a/tools/generated_files/331/operatorsScore.csv +++ b/tools/generated_files/331/operatorsScore.csv @@ -50,6 +50,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/331/supportedExprs.csv b/tools/generated_files/331/supportedExprs.csv index c7913cbb3d9..a501a1714cc 100644 --- a/tools/generated_files/331/supportedExprs.csv +++ b/tools/generated_files/331/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA 
+ArrayAggregate,S,`aggregate`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/332/operatorsScore.csv b/tools/generated_files/332/operatorsScore.csv index a8f66bbf1e2..d2abe472782 100644 --- a/tools/generated_files/332/operatorsScore.csv +++ b/tools/generated_files/332/operatorsScore.csv @@ -50,6 +50,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/332/supportedExprs.csv b/tools/generated_files/332/supportedExprs.csv index c7913cbb3d9..a501a1714cc 100644 --- a/tools/generated_files/332/supportedExprs.csv +++ b/tools/generated_files/332/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS 
+ArrayAggregate,S,`aggregate`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/333/operatorsScore.csv b/tools/generated_files/333/operatorsScore.csv index a8f66bbf1e2..d2abe472782 100644 --- a/tools/generated_files/333/operatorsScore.csv +++ b/tools/generated_files/333/operatorsScore.csv @@ -50,6 +50,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/333/supportedExprs.csv b/tools/generated_files/333/supportedExprs.csv index c7913cbb3d9..a501a1714cc 100644 --- a/tools/generated_files/333/supportedExprs.csv +++ b/tools/generated_files/333/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS 
+ArrayAggregate,S,`aggregate`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/334/operatorsScore.csv b/tools/generated_files/334/operatorsScore.csv index a8f66bbf1e2..d2abe472782 100644 --- a/tools/generated_files/334/operatorsScore.csv +++ b/tools/generated_files/334/operatorsScore.csv @@ -50,6 +50,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/334/supportedExprs.csv b/tools/generated_files/334/supportedExprs.csv index c7913cbb3d9..a501a1714cc 100644 --- a/tools/generated_files/334/supportedExprs.csv +++ b/tools/generated_files/334/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS 
+ArrayAggregate,S,`aggregate`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/340/operatorsScore.csv b/tools/generated_files/340/operatorsScore.csv index 0eb2ed2c9d1..adc973373e9 100644 --- a/tools/generated_files/340/operatorsScore.csv +++ b/tools/generated_files/340/operatorsScore.csv @@ -51,6 +51,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/340/supportedExprs.csv b/tools/generated_files/340/supportedExprs.csv index 482270f0d0e..b72deede8ed 100644 --- a/tools/generated_files/340/supportedExprs.csv +++ b/tools/generated_files/340/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`; `reduce`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; 
`reduce`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/341/operatorsScore.csv b/tools/generated_files/341/operatorsScore.csv index 0eb2ed2c9d1..adc973373e9 100644 --- a/tools/generated_files/341/operatorsScore.csv +++ b/tools/generated_files/341/operatorsScore.csv @@ -51,6 +51,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/341/supportedExprs.csv b/tools/generated_files/341/supportedExprs.csv index 482270f0d0e..b72deede8ed 100644 --- a/tools/generated_files/341/supportedExprs.csv +++ b/tools/generated_files/341/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`; `reduce`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA 
ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/342/operatorsScore.csv b/tools/generated_files/342/operatorsScore.csv index 0eb2ed2c9d1..adc973373e9 100644 --- a/tools/generated_files/342/operatorsScore.csv +++ b/tools/generated_files/342/operatorsScore.csv @@ -51,6 +51,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/342/supportedExprs.csv b/tools/generated_files/342/supportedExprs.csv index 482270f0d0e..b72deede8ed 100644 --- a/tools/generated_files/342/supportedExprs.csv +++ b/tools/generated_files/342/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`; `reduce`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA 
ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/343/operatorsScore.csv b/tools/generated_files/343/operatorsScore.csv index 0eb2ed2c9d1..adc973373e9 100644 --- a/tools/generated_files/343/operatorsScore.csv +++ b/tools/generated_files/343/operatorsScore.csv @@ -51,6 +51,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/343/supportedExprs.csv b/tools/generated_files/343/supportedExprs.csv index 482270f0d0e..b72deede8ed 100644 --- a/tools/generated_files/343/supportedExprs.csv +++ b/tools/generated_files/343/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`; `reduce`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS 
ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/344/operatorsScore.csv b/tools/generated_files/344/operatorsScore.csv index 0eb2ed2c9d1..adc973373e9 100644 --- a/tools/generated_files/344/operatorsScore.csv +++ b/tools/generated_files/344/operatorsScore.csv @@ -51,6 +51,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/344/supportedExprs.csv b/tools/generated_files/344/supportedExprs.csv index 482270f0d0e..b72deede8ed 100644 --- a/tools/generated_files/344/supportedExprs.csv +++ b/tools/generated_files/344/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`; `reduce`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git 
a/tools/generated_files/350/operatorsScore.csv b/tools/generated_files/350/operatorsScore.csv index adb17e0c312..2d3f273c462 100644 --- a/tools/generated_files/350/operatorsScore.csv +++ b/tools/generated_files/350/operatorsScore.csv @@ -58,6 +58,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/350/supportedExprs.csv b/tools/generated_files/350/supportedExprs.csv index 6f70ef61b39..aad208ec635 100644 --- a/tools/generated_files/350/supportedExprs.csv +++ b/tools/generated_files/350/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`; `reduce`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/351/operatorsScore.csv b/tools/generated_files/351/operatorsScore.csv index adb17e0c312..2d3f273c462 
100644 --- a/tools/generated_files/351/operatorsScore.csv +++ b/tools/generated_files/351/operatorsScore.csv @@ -58,6 +58,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/351/supportedExprs.csv b/tools/generated_files/351/supportedExprs.csv index 6f70ef61b39..aad208ec635 100644 --- a/tools/generated_files/351/supportedExprs.csv +++ b/tools/generated_files/351/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`; `reduce`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/352/operatorsScore.csv b/tools/generated_files/352/operatorsScore.csv index 823d05ea694..06525f664cb 100644 --- a/tools/generated_files/352/operatorsScore.csv +++ b/tools/generated_files/352/operatorsScore.csv @@ -59,6 +59,7 @@ 
AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/352/supportedExprs.csv b/tools/generated_files/352/supportedExprs.csv index 6f70ef61b39..aad208ec635 100644 --- a/tools/generated_files/352/supportedExprs.csv +++ b/tools/generated_files/352/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`; `reduce`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/353/operatorsScore.csv b/tools/generated_files/353/operatorsScore.csv index 823d05ea694..06525f664cb 100644 --- a/tools/generated_files/353/operatorsScore.csv +++ b/tools/generated_files/353/operatorsScore.csv @@ -59,6 +59,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff 
--git a/tools/generated_files/353/supportedExprs.csv b/tools/generated_files/353/supportedExprs.csv index 6f70ef61b39..aad208ec635 100644 --- a/tools/generated_files/353/supportedExprs.csv +++ b/tools/generated_files/353/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`; `reduce`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/354/operatorsScore.csv b/tools/generated_files/354/operatorsScore.csv index 823d05ea694..06525f664cb 100644 --- a/tools/generated_files/354/operatorsScore.csv +++ b/tools/generated_files/354/operatorsScore.csv @@ -59,6 +59,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/354/supportedExprs.csv b/tools/generated_files/354/supportedExprs.csv index 
6f70ef61b39..aad208ec635 100644 --- a/tools/generated_files/354/supportedExprs.csv +++ b/tools/generated_files/354/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`; `reduce`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/355/operatorsScore.csv b/tools/generated_files/355/operatorsScore.csv index 823d05ea694..06525f664cb 100644 --- a/tools/generated_files/355/operatorsScore.csv +++ b/tools/generated_files/355/operatorsScore.csv @@ -59,6 +59,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/355/supportedExprs.csv b/tools/generated_files/355/supportedExprs.csv index 6f70ef61b39..aad208ec635 100644 --- a/tools/generated_files/355/supportedExprs.csv +++ 
b/tools/generated_files/355/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`; `reduce`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/356/operatorsScore.csv b/tools/generated_files/356/operatorsScore.csv index 823d05ea694..06525f664cb 100644 --- a/tools/generated_files/356/operatorsScore.csv +++ b/tools/generated_files/356/operatorsScore.csv @@ -59,6 +59,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/356/supportedExprs.csv b/tools/generated_files/356/supportedExprs.csv index 6f70ef61b39..aad208ec635 100644 --- a/tools/generated_files/356/supportedExprs.csv +++ b/tools/generated_files/356/supportedExprs.csv @@ -27,6 +27,11 @@ 
And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`; `reduce`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/357/operatorsScore.csv b/tools/generated_files/357/operatorsScore.csv index 823d05ea694..06525f664cb 100644 --- a/tools/generated_files/357/operatorsScore.csv +++ b/tools/generated_files/357/operatorsScore.csv @@ -59,6 +59,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/357/supportedExprs.csv b/tools/generated_files/357/supportedExprs.csv index 6f70ef61b39..aad208ec635 100644 --- a/tools/generated_files/357/supportedExprs.csv +++ b/tools/generated_files/357/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N 
And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`; `reduce`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/358/operatorsScore.csv b/tools/generated_files/358/operatorsScore.csv index 823d05ea694..06525f664cb 100644 --- a/tools/generated_files/358/operatorsScore.csv +++ b/tools/generated_files/358/operatorsScore.csv @@ -59,6 +59,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/358/supportedExprs.csv b/tools/generated_files/358/supportedExprs.csv index 6f70ef61b39..aad208ec635 100644 --- a/tools/generated_files/358/supportedExprs.csv +++ b/tools/generated_files/358/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA 
And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`; `reduce`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/400/operatorsScore.csv b/tools/generated_files/400/operatorsScore.csv index b232ffeb8ed..89a2c26d3ae 100644 --- a/tools/generated_files/400/operatorsScore.csv +++ b/tools/generated_files/400/operatorsScore.csv @@ -59,6 +59,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/400/supportedExprs.csv b/tools/generated_files/400/supportedExprs.csv index 8cbd9dfe053..49a7efa1dd9 100644 --- a/tools/generated_files/400/supportedExprs.csv +++ b/tools/generated_files/400/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA 
And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`; `reduce`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/401/operatorsScore.csv b/tools/generated_files/401/operatorsScore.csv index 2c7d4847e8a..c98bf4f3614 100644 --- a/tools/generated_files/401/operatorsScore.csv +++ b/tools/generated_files/401/operatorsScore.csv @@ -59,6 +59,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/401/supportedExprs.csv b/tools/generated_files/401/supportedExprs.csv index 701fc837634..b7f349c0da7 100644 --- a/tools/generated_files/401/supportedExprs.csv +++ b/tools/generated_files/401/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`; 
`reduce`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/402/operatorsScore.csv b/tools/generated_files/402/operatorsScore.csv index 2c7d4847e8a..c98bf4f3614 100644 --- a/tools/generated_files/402/operatorsScore.csv +++ b/tools/generated_files/402/operatorsScore.csv @@ -59,6 +59,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/402/supportedExprs.csv b/tools/generated_files/402/supportedExprs.csv index 701fc837634..b7f349c0da7 100644 --- a/tools/generated_files/402/supportedExprs.csv +++ b/tools/generated_files/402/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`; `reduce`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; 
`reduce`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/411/operatorsScore.csv b/tools/generated_files/411/operatorsScore.csv index ed0aaad355b..87b8a403879 100644 --- a/tools/generated_files/411/operatorsScore.csv +++ b/tools/generated_files/411/operatorsScore.csv @@ -60,6 +60,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/411/supportedExprs.csv b/tools/generated_files/411/supportedExprs.csv index 7451a7072c3..d8e1a3e607b 100644 --- a/tools/generated_files/411/supportedExprs.csv +++ b/tools/generated_files/411/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`; `reduce`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; 
`reduce`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`; `reduce`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA diff --git a/tools/generated_files/operatorsScore.csv b/tools/generated_files/operatorsScore.csv index 0708fa821bc..b3794066a01 100644 --- a/tools/generated_files/operatorsScore.csv +++ b/tools/generated_files/operatorsScore.csv @@ -50,6 +50,7 @@ AggregateExpression,4 Alias,4 And,4 ApproximatePercentile,4 +ArrayAggregate,4 ArrayContains,4 ArrayDistinct,4 ArrayExcept,4 diff --git a/tools/generated_files/supportedExprs.csv b/tools/generated_files/supportedExprs.csv index 77b62bcecd8..1aae119c31c 100644 --- a/tools/generated_files/supportedExprs.csv +++ b/tools/generated_files/supportedExprs.csv @@ -27,6 +27,11 @@ And,S,`and`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N And,S,`and`,None,AST,lhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,rhs,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA And,S,`and`,None,AST,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA +ArrayAggregate,S,`aggregate`,None,project,zero,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`,None,project,finish,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS +ArrayAggregate,S,`aggregate`,None,project,merge,S,S,S,S,S,S,S,S,PS,S,S,NS,NS,NS,NS,NS,NS,NS,NS,NS 
+ArrayAggregate,S,`aggregate`,None,project,argument,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA ArrayContains,S,`array_contains`,None,project,key,S,S,S,S,S,S,S,S,PS,S,NS,NS,NS,NS,NS,NS,NS,NS,NS,NS ArrayContains,S,`array_contains`,None,project,result,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA