-
Notifications
You must be signed in to change notification settings - Fork 1k
[Variant]: Implement DataType::ListView/LargeListView
support for cast_to_variant
kernel
#8241
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Looking at both this PR and the next one #8282, I think we can actually cover all five list types with just one list builder if we create a new extension trait: trait ListLikeArray: Array {
/// Get the values array
fn values(&self) -> &arrow::array::ArrayRef;
/// Get the start and end indices for a list element
fn element_range(&self, index: usize) -> Range<usize>;
} And then the builder takes a template type (Probably something similar could be done for the binary and string array variations as well, but I didn't look as closely there) |
Thanks for the advice, @scovich ! Let me refactor this PR |
782ad12
to
0ef8abb
Compare
…cast_to_variant` kernel
0ef8abb
to
65a2b30
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Any reason not to cover FixedSizeListArray
while we're at it? It should be just a few extra lines of code and a new unit test.
Also, you may want to update the PR description now that the PR is no longer stacked? |
Yes, it’s pretty straightforward, but I’d prefer to keep it separate since I originally opened two distinct issues. That way, each change can close its corresponding ticket cleanly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @liamzwbao -- I think there is just one more test needed and this one will be good to go
Thank you @scovich for the review
// Create a ListViewArray with some data | ||
let mut builder = ListViewBuilder::new(Int32Array::builder(0)); | ||
builder.append_value(&Int32Array::from(vec![Some(0), Some(1), Some(2)])); | ||
builder.append_value(&Int32Array::from(vec![Some(3), Some(4)])); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add please add a test case that has a null as one of the list elements (in addition to entire slot that is null)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically, variant array elements are non-nullable. So any (SQL) NULL must be converted to Variant::Null
before adding it to the array -- this is true regardless of shredding.
Note: It almost certainly makes sense to convert NULL values from an arrow array with nullable elements into Variant::Null
values when converting to variant. So the resulting code might be the same either way. But it might still be helpful to understand the distinction (code comments, unstated assumptions, etc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that converting an arrow ListArray with null elements should result in an Variant::List(...) where one of the elements is Variant::Null
I am not sure if you asking a question or just making a statement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly a statement -- to make sure we're doing the right thing for the right reason rather than by accident.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I asked some variant spec folks a while back, and in their (parquet storage) world, arrays with SQL NULL elements are simply not a thing, and so variant code should not even need to think about it. For example:
it doesn't really make sense to talk about
an array with (SQL) nullable elements
. The array must already be a Variant array, and the Variant binary spec has no way to represent SQL NULL as an array element, only Variantnull
.
Meanwhile, IMO it's perfectly fine for arrow to take a position that NULL values in an arrow list array will produce Variant::NULL
values if converted to variant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added cases where array element is NULL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again @scovich and @liamzwbao
Which issue does this PR close?
DataType::ListView/LargeListView
support forcast_to_variant
kernel #8236.Rationale for this change
What changes are included in this PR?
Implement
ListView/LargeListView
for cast_to_variantAre these changes tested?
Yes
Are there any user-facing changes?
New cast type supported