-
Notifications
You must be signed in to change notification settings - Fork 95
feat: add lambda expression support #710
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -37,6 +37,7 @@ public class ProtoExpressionConverter { | |
| private final Type.Struct rootType; | ||
| private final ProtoTypeConverter protoTypeConverter; | ||
| private final ProtoRelConverter protoRelConverter; | ||
| private final List<Type.Struct> lambdaParameterStack = new ArrayList<>(); | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we use a stack here?
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I need to access
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's fair. Java's stack does extend The only thing I will say is I do think that the whole stack parameter stuff can be confusing for people without a little bit of PL experience, which is why I pushed @Slimsammylim to encapsulate some of that logic and put it in docstring comments. I could see a case for encapsulating this logic in a local private class, though it probably makes more sense to do this only if this logic comes up again. I expected to see something like this logic elsewhere, as you need to do it to do validation when building lambdas. Though I didn't find anything like this. Does the builder for lambdas in this PR actually validate that the lambda is semantically correct?
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree that it would be nice for clarity to encapsulate this quite specific behaviour in a class (single responsibility principle). It doesn't need to extend one of the full collection interfaces, just provide the methods you need:
Or make it generic if you will reuse it for other types. The caller doesn't need to worry about clever indexing logic or what collection implementation is used internally; just make simple method calls that clearly describe the intent. |
||
|
|
||
| public ProtoExpressionConverter( | ||
| ExtensionLookup lookup, | ||
|
|
@@ -75,6 +76,26 @@ public FieldReference from(io.substrait.proto.Expression.FieldReference referenc | |
| reference.getDirectReference().getStructField().getField(), | ||
| rootType, | ||
| reference.getOuterReference().getStepsOut()); | ||
| case LAMBDA_PARAMETER_REFERENCE: | ||
| { | ||
| io.substrait.proto.Expression.FieldReference.LambdaParameterReference lambdaParamRef = | ||
| reference.getLambdaParameterReference(); | ||
|
|
||
| int stepsOut = lambdaParamRef.getStepsOut(); | ||
| int lambdaIndex = lambdaParameterStack.size() - 1 - stepsOut; | ||
| if (lambdaIndex < 0 || lambdaIndex >= lambdaParameterStack.size()) { | ||
| throw new IllegalArgumentException( | ||
| String.format( | ||
| "Lambda parameter reference with stepsOut=%d is invalid (current depth: %d)", | ||
| stepsOut, lambdaParameterStack.size())); | ||
| } | ||
|
Comment on lines
+85
to
+91
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This logic could sit within a simple class to encapsulate the lambda parameter stack behaviour (as described in other comments) instead of adding extra complexity to this method. |
||
|
|
||
| Type.Struct lambdaParameters = lambdaParameterStack.get(lambdaIndex); | ||
| return FieldReference.newLambdaParameterReference( | ||
| reference.getDirectReference().getStructField().getField(), | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Something I'm curious about is why we use I'm unfamiliar with substrait-java, but does this current implementation support nested field access through lambda parameters? *This may be an unimportant issue especially if it's not common to access nested fields in lambdas |
||
| lambdaParameters, | ||
| stepsOut); | ||
| } | ||
| case ROOTTYPE_NOT_SET: | ||
| default: | ||
| throw new IllegalArgumentException("Unhandled type: " + reference.getRootTypeCase()); | ||
|
|
@@ -260,6 +281,27 @@ public Type visit(Type.Struct type) throws RuntimeException { | |
| } | ||
| } | ||
|
|
||
| case LAMBDA: | ||
| { | ||
| io.substrait.proto.Expression.Lambda protoLambda = expr.getLambda(); | ||
| Type.Struct parameters = | ||
| (Type.Struct) | ||
| protoTypeConverter.from( | ||
| io.substrait.proto.Type.newBuilder() | ||
| .setStruct(protoLambda.getParameters()) | ||
| .build()); | ||
|
|
||
| lambdaParameterStack.add(parameters); | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. then we can use the standard stack |
||
|
|
||
| Expression body; | ||
| try { | ||
| body = from(protoLambda.getBody()); | ||
| } finally { | ||
| lambdaParameterStack.remove(lambdaParameterStack.size() - 1); | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. and the standard stack |
||
| } | ||
|
|
||
| return Expression.Lambda.builder().parameters(parameters).body(body).build(); | ||
| } | ||
| // TODO enum. | ||
| case ENUM: | ||
| throw new UnsupportedOperationException("Unsupported type: " + expr.getRexTypeCase()); | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is setting nullability false correct here? I am pretty sure
funccan be nullable, e.g.func?<i32 -> i32>.Also, probably will be good to make a test captures this then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I followed the substrait-go implementation: the GetType function for lambda returns Nullability: types.NullabilityRequired
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm I am thinking about this right now but that may have been wrong in
substrait-go🤔There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a spec error. I raised an issue in upstream here.
Mind leaving a comment there referencing this issue as a TODO so we remember to fix it once it gets resolved?