Forward-merge release/26.04 into main #1975

Open
rapids-bot[bot] wants to merge 4 commits into main from release/26.04

rapids-bot bot commented Mar 31, 2026

Forward-merge triggered by push to release/26.04 that creates a PR to keep main up-to-date. If this PR is unable to be immediately merged due to conflicts, it will remain open for the team to manually merge. See forward-merger docs for more info.

I ran across a few typos and link formatting issues (`.rst` vs. `.md` syntax) while reading through the docs recently. This PR contains fixes for those and a few other typos that Claude found.

Authors:
  - James Bourbeau (https://github.com/jrbourbeau)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1973

rapids-bot bot commented Mar 31, 2026

FAILURE - Unable to forward-merge due to an error; a manual merge is necessary. Do not use the Resolve conflicts option in this PR. Instead, follow these instructions: https://docs.rapids.ai/maintainers/forward-merger/

IMPORTANT: When merging this PR, do not use the auto-merger (i.e. the /merge comment). Instead, an admin must manually merge by changing the merging strategy to Create a Merge Commit. Otherwise, history will be lost and the branches become incompatible.

Merge after #1880

This PR adds support for streaming, out-of-core k-means clustering, where the dataset resides on the host. The idea is simple:

Batched accumulation of centroid updates: data is processed in batches, and batch-wise means and cluster counts are accumulated until every batch has been processed, i.e., until the full pass over the dataset has completed.
This PR only adds a batch-size parameter that controls how much of the dataset is loaded at a time to compute cluster assignments and (weighted) centroid adjustments. The final centroid update, i.e., a single k-means iteration, completes only once the whole dataset pass has finished and the accumulated sums are averaged.
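
The accumulation scheme described above can be sketched in a few lines of pure Python. This is a minimal illustration and not the code from this PR: the function names (`assign`, `batched_kmeans_iteration`), the list-of-lists data layout, and the choice to leave empty clusters unchanged are all assumptions made for the example. It accumulates weighted sums and counts per cluster, which is one way to realize the batch-wise accumulation the description mentions.

```python
def assign(point, centroids):
    """Return the index of the nearest centroid (squared Euclidean distance)."""
    best, best_dist = 0, float("inf")
    for k, c in enumerate(centroids):
        d = sum((p - q) ** 2 for p, q in zip(point, c))
        if d < best_dist:
            best, best_dist = k, d
    return best


def batched_kmeans_iteration(data, centroids, batch_size, weights=None):
    """Run one k-means iteration, visiting `data` in chunks of `batch_size` rows.

    Hypothetical helper: `data` stands in for a host-resident dataset, and each
    slice stands in for a batch that would be copied to the device.
    """
    n_clusters, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(n_clusters)]  # weighted coordinate sums
    counts = [0.0] * n_clusters                      # weighted cluster counts

    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        for i, point in enumerate(batch):
            w = 1.0 if weights is None else weights[start + i]
            k = assign(point, centroids)  # assignments use the old centroids
            counts[k] += w
            for j in range(dim):
                sums[k][j] += w * point[j]

    # The centroid update happens only after the full pass over the dataset.
    new_centroids = []
    for k in range(n_clusters):
        if counts[k] > 0:
            new_centroids.append([s / counts[k] for s in sums[k]])
        else:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(list(centroids[k]))
    return new_centroids
```

Because assignments within an iteration always use the centroids from the start of that iteration, the result in this sketch does not depend on the batch size; the batch size only bounds how much data is handled at once.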

Authors:
  - Tarang Jain (https://github.com/tarang-jain)

Approvers:
  - Victor Lafargue (https://github.com/viclafargue)
  - Anupam (https://github.com/aamijar)
  - Micka (https://github.com/lowener)
  - Jinsol Park (https://github.com/jinsolp)
  - Ben Frederickson (https://github.com/benfred)

URL: #1886
rapids-bot bot requested review from a team as code owners March 31, 2026 23:46
divyegala and others added 2 commits April 1, 2026 00:08
This is to ensure the config does not accidentally get corrupted or start with garbage values. Furthermore, the CUDA [docs](https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__EXECUTION.html#group__CUDART__EXECUTION_1ge236ecdbbaf7cf47a806bba71c1d03c4) recommend setting `config.attrs = nullptr` even if `config.numAttrs = 0`.

The need for this update comes from rapidsai/cuml#7906, where we are intermittently observing CUDA context corruption from the JIT path in the cuML nightly wheel tests. While I am not sure whether this PR will resolve that issue, it is still a step in the right direction.

Authors:
  - Divye Gala (https://github.com/divyegala)

Approvers:
  - Dante Gama Dessavre (https://github.com/dantegd)

URL: #1974
- Add sync_stream for D2H copies of the graph
- Use a strided copy to copy the strided dataset
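
For readers unfamiliar with the term, a strided dataset is one whose rows are laid out with a fixed distance (the stride) between row starts that may exceed the row width, so a plain contiguous copy would also pick up the padding. The sketch below shows only the indexing involved, in pure Python; it is not the code from this PR, and the function name `strided_copy` and the flat-list layout are assumptions made for illustration.

```python
def strided_copy(src, n_rows, n_cols, stride):
    """Copy an n_rows x n_cols matrix out of a flat, row-padded buffer.

    Hypothetical helper: row r of the matrix starts at src[r * stride], and
    only the first n_cols elements of each row are real data.
    """
    if stride < n_cols:
        raise ValueError("stride must be at least n_cols")
    dst = [0.0] * (n_rows * n_cols)  # dense, contiguous destination
    for r in range(n_rows):
        row = src[r * stride:r * stride + n_cols]  # skip the padding
        dst[r * n_cols:(r + 1) * n_cols] = row
    return dst
```

When `stride == n_cols` this reduces to an ordinary contiguous copy.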

Authors:
  - Tarang Jain (https://github.com/tarang-jain)

Approvers:
  - Ben Karsin (https://github.com/bkarsin)
  - Jinsol Park (https://github.com/jinsolp)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #1966