Streaming Q4 implementation #710
Open
TomAugspurger wants to merge 85 commits into rapidsai:main from TomAugspurger:tom/streaming-q4
85 commits
8671c48 TPCH-derived Q3 (wence-)
c7736d3 Q1 (wence-)
b4ef3fe Parallel grouping (wence-)
2e1fdf2 Dup the user's communicator when creating our MPI comm wrapper (wence-)
32dcff7 Context creation and options parsing into utils (wence-)
e24cfc3 Use refactored context/argparse in q03 (wence-)
50a050a And in q1 (wence-)
f55ddef Q9 (wence-)
25e60de Docstring for main (wence-)
195e71a Docstring (wence-)
012e9fb Make broadcast public (wence-)
83e73b1 Whack a load of stuff in (wence-)
ed8890a Fix some bugs (wence-)
cec0ef0 Use utils in q3 too (wence-)
a7c17da TODO (wence-)
96a4884 WIP: bloom filter (wence-)
9558a0a Bloom filter updates (wence-)
d9a6046 Shuffle join option and bloom filter in q3 (wence-)
2fd1255 More stuff (wence-)
fafcce9 Bloom filter ranges (wence-)
1dce429 Timing info? (wence-)
04e09f5 More timings (wence-)
f8e483c Now? (wence-)
c0cd406 Print time in logging output (wence-)
f74833c Avoid an alloc (wence-)
c2a0bc3 More (wence-)
5d66787 Propagate exceptions from parquet chunk read failures (wence-)
b42ca4e Try merging bloom filters on device (wence-)
cf3d990 Remove debug (wence-)
eacf67a Done (wence-)
aa6d9cc Thread safety in parquet write (wence-)
0e7f3fd Fixes (wence-)
7a14854 Adapt to upstream changes (wence-)
1df954a Fix timestamp types (wence-)
48d8fd0 cmake format (wence-)
89ef19c Loop in cmake (wence-)
39550ab event_loop range only in verbose mode (wence-)
78fdb14 Merge remote-tracking branch 'upstream/main' into wence/fea/q03 (TomAugspurger)
6c96b17 Avoid RAPIDSMPF_FUNC_RANGE macro in .cu file (TomAugspurger)
74f36a8 Remove GNUism in RAPIDSMPF_NVTX_FUNC_RANGE (wence-)
d07605a Merge remote-tracking branch 'upstream/main' into wence/fea/q03 (wence-)
4d5a6aa Finalize MPI with RAII (wence-)
483765b cmake-format (wence-)
09c0cc2 Fixes (wence-)
b81e2c2 More fixes (wence-)
c415379 Add docstring in cmake (wence-)
cc7e5e0 WIP: Streaming Q4 implementation (TomAugspurger)
632b797 Merge branch 'wence/fea/q03' into tom/streaming-q4 (TomAugspurger)
ff504f0 fixup! Merge branch 'wence/fea/q03' into tom/streaming-q4 (TomAugspurger)
07de58c Merge remote-tracking branch 'upstream/main' into tom/streaming-q4 (TomAugspurger)
2030807 Use groupby utilities (TomAugspurger)
16be63b fuse final groupby agg (TomAugspurger)
f51ef59 revert fuse (TomAugspurger)
a3a0413 Add binary (TomAugspurger)
b733c9f fixup (TomAugspurger)
7a2780a fixup (TomAugspurger)
d013062 Use a bloom filter (TomAugspurger)
8bcaed0 Merge remote-tracking branch 'upstream/main' into tom/streaming-q4 (TomAugspurger)
5152b12 Note on why we shuffle (TomAugspurger)
ea64499 Streams, events, joins (TomAugspurger)
49f78f9 Merge remote-tracking branch 'upstream/main' into tom/streaming-q4 (TomAugspurger)
a049351 Merge remote-tracking branch 'upstream/main' into tom/streaming-q4 (TomAugspurger)
8098cd2 lint (TomAugspurger)
a75c762 Merge remote-tracking branch 'upstream/main' into tom/streaming-q4 (TomAugspurger)
041c3e8 Merge remote-tracking branch 'upstream/main' into tom/streaming-q4 (TomAugspurger)
6ff31f9 Compiling again (TomAugspurger)
05782b3 remove duplicate log (TomAugspurger)
4ae399c remove unused event (TomAugspurger)
3bea0df fix while condition (TomAugspurger)
672af5d fixes (TomAugspurger)
9c8a21a revert MPI change (TomAugspurger)
7c16d67 docstring fixes (TomAugspurger)
2c29dc2 Merge remote-tracking branch 'upstream/main' into tom/streaming-q4 (TomAugspurger)
372851a to_device compat (TomAugspurger)
202b7f7 clarify hash-partitioning (TomAugspurger)
206337c add run-and-validate (TomAugspurger)
17ed006 reuse chunkwise_group_by (TomAugspurger)
a9613e2 fixed queries parsing (TomAugspurger)
7a135e2 Handle dates (TomAugspurger)
026c4ec reuse chunkwise_sort_by (TomAugspurger)
52b3934 simplify (TomAugspurger)
c01a377 simplify (TomAugspurger)
a2d46be static casts (TomAugspurger)
4ddc853 Use KeepKeys::NO (TomAugspurger)
ec8b45c fix loop (TomAugspurger)
Why do you take `left_chunk` by const ref, but `right_chunk` by rvalue reference (i.e. the caller must `move` it)?
I'm not really sure (I'm reading up on the semantics of the two now). We use this `streaming::TableChunk&&` type for the other functions (`inner_join_chunk`), so I suspect I was trying to match that. But when doing a broadcast left semi join, the `left_chunk` is reused many times, once per chunk, which I think means it can't be moved.
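
To make the trade-off in this thread concrete, here is a minimal sketch of the two parameter styles. It does not use the real rapidsmpf API: `Chunk`, `semi_join_chunk`, and `count_matches` are hypothetical stand-ins invented for illustration, and the "join" is a toy key filter. The point is only that a side reused across many calls is borrowed by `const&`, while a side consumed once can be taken by `&&`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for streaming::TableChunk: a move-only owner of keys.
struct Chunk {
    std::vector<int> keys;
    explicit Chunk(std::vector<int> k) : keys(std::move(k)) {}
    Chunk(Chunk&&) = default;
    Chunk& operator=(Chunk&&) = default;
    Chunk(Chunk const&) = delete;
    Chunk& operator=(Chunk const&) = delete;
};

// `reused` is only read, so it is borrowed by const reference and stays valid
// for the next call. `streamed` is consumed, so the caller must std::move it.
// Toy semantics (an assumption): keep the keys of `streamed` found in `reused`.
Chunk semi_join_chunk(Chunk const& reused, Chunk&& streamed) {
    Chunk consumed = std::move(streamed);  // take ownership of the streamed side
    std::vector<int> out;
    for (int key : consumed.keys) {
        for (int candidate : reused.keys) {
            if (key == candidate) {
                out.push_back(key);
                break;
            }
        }
    }
    return Chunk(std::move(out));
}

// Drives the join once per streamed chunk against the same reused chunk.
std::size_t count_matches() {
    Chunk reused(std::vector<int>{1, 3, 5});
    std::vector<Chunk> stream;
    stream.emplace_back(std::vector<int>{1, 2, 3});
    stream.emplace_back(std::vector<int>{4, 5, 6});

    std::size_t total = 0;
    for (auto& chunk : stream) {
        // `reused` is passed again on every iteration; each `chunk` is moved once.
        total += semi_join_chunk(reused, std::move(chunk)).keys.size();
    }
    return total;
}
```

If `reused` were also taken by `&&`, the second loop iteration would operate on a moved-from object, which is the concern raised in the reply above.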