@YangZhou1997
Member

Description

Please include a summary of the changes and the related issue.

Fixes # (issue)

Type of Change

  • Bug fix
  • New feature
  • Documentation update

How Has This Been Tested?

Include any tests here.

  • Unit tests
  • Integration tests
  • Manual testing

Checklist

  • My code follows the style guidelines, e.g. format.sh.
  • I have run build_and_install.sh to verify compilation.
  • I have removed redundant variables and comments.
  • I have updated the documentation.
  • I have added tests.

zhongjiechen and others added 5 commits January 3, 2026 22:58

* Fix readme.

* Remove hardcode macro.

* WIP. IBRC for p2p.

* USE_QPEX.

* Nits.

* Fix qp init bug.

* IBRC now is working on new p2p arch.

* Format code.

* Inlining small metadata.

* Remove unused macro.

* Rename macro.

* Rename efa folder to rdma.

* Refactor p2p

* Format code.

* Nits.

* Nits.

* Fix Makefile.

* Nits.

* Nits.

* Nits.

* fix p2p on efa

* deprecate p2p chunk size and entropy, add UCCL_IB_GID_INDEX

* change kNICContextNumber to 4 to allow 4 EFA nics per GPU

* renaming a bit

---------

Co-authored-by: YangZhou1997 <yangzhou.rpc@gmail.com>
Co-authored-by: Ziming Mao <ziming.mao@berkeley.edu>
@YangZhou1997 changed the title from "config rdma" to "[EP] config rdma" on Jan 6, 2026
    // Add self-ranks, sub other ranks
    if (thread_id < kNumRanks) {
        atomicAdd_system(barrier_signal_ptrs[rank] + thread_id, FINISHED_SUM_TAG);
        memory_fence();

Member

Is the memory_fence() needed here?
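
For readers following this thread, the add-self / subtract-peer signalling named in the hunk's comment can be illustrated on the host. This is a minimal sketch, not the PR's kernel. It assumes the truncated hunk continues with a matching subtraction on each peer's row and a spin-wait until the local slots drop to zero or below. It substitutes `std::thread` and sequentially consistent `std::atomic<int>` for GPU threads, `atomicAdd_system`, and `memory_fence()`. The names `SignalTable`, `barrier_arrive`, and `run_barrier`, the rank count, and the tag value are all hypothetical.

```cpp
#include <array>
#include <atomic>
#include <functional>
#include <thread>
#include <vector>

constexpr int kNumRanks = 4;            // illustrative; not the PR's value
constexpr int FINISHED_SUM_TAG = 1024;  // illustrative tag value

// signals[r][j] is rank r's slot for peer j (stand-in for barrier_signal_ptrs).
using SignalTable = std::array<std::array<std::atomic<int>, kNumRanks>, kNumRanks>;

// One rank's arrival: add the tag to every slot of its own row, subtract it
// from its slot in every peer's row, then wait until its own row is <= 0.
// Slot [rank][j] only returns to zero once peer j has also arrived.
void barrier_arrive(SignalTable& signals, int rank) {
    for (int j = 0; j < kNumRanks; ++j) {
        signals[rank][j].fetch_add(FINISHED_SUM_TAG);  // "add self-ranks"
        signals[j][rank].fetch_sub(FINISHED_SUM_TAG);  // "sub other ranks" (assumed)
    }
    for (int j = 0; j < kNumRanks; ++j) {
        while (signals[rank][j].load() > 0) {
            std::this_thread::yield();  // spin until peer j has signalled
        }
    }
}

// Runs one barrier round with one host thread per rank. Returns true when
// every slot is back at zero afterwards, so the table could be reused.
bool run_barrier() {
    SignalTable signals{};
    for (auto& row : signals) {
        for (auto& slot : row) slot.store(0);
    }

    std::vector<std::thread> ranks;
    for (int r = 0; r < kNumRanks; ++r) {
        ranks.emplace_back(barrier_arrive, std::ref(signals), r);
    }
    for (auto& t : ranks) t.join();

    for (auto& row : signals) {
        for (auto& slot : row) {
            if (slot.load() != 0) return false;
        }
    }
    return true;
}
```

Because every atomic operation in this sketch is sequentially consistent, it shows only the counting scheme. It does not settle whether the explicit fence after `atomicAdd_system` is required on the device, which depends on the ordering guarantees of the GPU atomics and on what the surrounding kernel code publishes.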

zhenhuang12 and others added 14 commits January 6, 2026 03:25
* add UdsExchanger for Ipc

* IpcEndpoint

* gpu_id->local_rank; rank->glocal_rank

* communicator test with multiple processes

* format

* add ipc cache

* add vllm launch scripts

* pass on vLLM setup

* add timeout

* fixing several issues

* nits

* nits

* nits

* refining readme

* nits

* add DG_JIT_CACHE_DIR

* nits

* debug vLLM launch

* dp=2, tp=8 works for ll mode

* continue debugging

* remove conditional

* remove print

* remove print

* newline

* add readme

* format

---------

Co-authored-by: YangZhou1997 <yangzhou.rpc@gmail.com>
Co-authored-by: MaoZiming <ziming.mao@berkeley.edu>

8 participants