Pull requests: NVIDIA/TensorRT-LLM

Pull requests list

[None][fix] AutoDeploy: dive deeper into token generation bugs (label: AutoDeploy)
#8108 opened Oct 1, 2025 by lucaslie Draft
1 task done
Draft: [None][fix] fix: updating patchelf version
#8105 opened Sep 30, 2025 by pcastonguay
1 task
[doc] Add Qwen3 Next Guide to Core README (label: Community want to contribute)
#8101 opened Sep 30, 2025 by faradawn
1 task
[None][fix] Avoid unnecessary concat in attn_output_gate case.
#8094 opened Sep 30, 2025 by yuxianq
1 task done
[None][fix] Disable DeepGEMM for Qwen3 MoE Attention layers
#8087 opened Sep 30, 2025 by achartier
1 task done
[None][feat] add RocketKV support (experimental)
#8086 opened Sep 30, 2025 by lfr-0531
1 task
[None][fix] Add Lock to protect mReqeustToSession
#8085 opened Sep 30, 2025 by chuangz0
1 task done
[None][fix] Enable FP8 ContextMLA on GB300
#8080 opened Sep 30, 2025 by longlee0622
1 task done
[None][autodeploy] small refactors on attention matching
#8079 opened Sep 30, 2025 by Fridah-nv
1 task done
test gb200
#8074 opened Sep 29, 2025 by yuanjingx87 Draft
1 task
[#7312][feat] Torch.compile for transformers mode
#8073 opened Sep 29, 2025 by h-guo18 Draft
1 task