Conversation
Codecov ReportBase: 83.63% // Head: 83.78% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## main #1337 +/- ##
==========================================
+ Coverage 83.63% 83.78% +0.14%
==========================================
Files 373 373
Lines 40284 40392 +108
==========================================
+ Hits 33692 33841 +149
+ Misses 6592 6551 -41
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
|
I noticed a significant performance regression on reading utf8 data. I think it is related to now defaulting to a |
| | State::OptionalDictionary(_, _) | ||
| | State::OptionalDelta(_, _) | ||
| | State::FilteredOptionalDelta(_, _) => ( | ||
| Binary::<O>::with_capacity(capacity, 0), |
There was a problem hiding this comment.
This value capacity of 0 is very costly in most cases.
| | State::RequiredDictionary(_) | ||
| | State::Delta(_) | ||
| | State::FilteredDelta(_) => ( | ||
| Binary::<O>::with_capacity(capacity, 0), |
There was a problem hiding this comment.
This value capacity of 0 is very costly in most cases.
| let values = SizedBinaryIter::new(&dict.buffer, dict.num_values); | ||
|
|
||
| let mut data = Binary::<O>::with_capacity(dict.num_values); | ||
| let mut data = Binary::<O>::with_capacity(dict.num_values, 0); |
There was a problem hiding this comment.
This value capacity of 0 is very costly in most cases.
| impl<O: Offset> Binary<O> { | ||
| #[inline] | ||
| pub fn with_capacity(capacity: usize) -> Self { | ||
| pub fn with_capacity(capacity: usize, values_capacity: usize) -> Self { |
There was a problem hiding this comment.
Maybe we should have this signature:
pub fn with_capacity(capacity: usize, values_capacity: Option<usize>) -> Self {and then
let values_capacity = values_capacity.unwrap_or(capacity.min(100) * 24);There was a problem hiding this comment.
If the values_capacity is zero by default, maybe we could use a need_estimated to reserve the space.
We use this in databend
e8e04f7 to
ef34171
Compare
|
Sorry for the delay on this one - if anyone would like to take an extra pass go ahead, otherwise I will merge it in. |
Closes #1324