Pull requests: NVIDIA/TensorRT-LLM
Draft: [None][fix] fix: updating patchelf version
#8105
opened Sep 30, 2025 by
pcastonguay
1 task
[TRTLLM-5966][feat] Helix: add full MLA support for Helix
#8104
opened Sep 30, 2025 by
MatthiasKohl
[doc] Add Qwen3 Next Guide to Core README
Community want to contribute
PRs initiated from Community
#8101
opened Sep 30, 2025 by
faradawn
1 task
[https://nvbugs/5521949][fix] Fix head_size handling in ModelConfig.get_bindings_model_config
#8100
opened Sep 30, 2025 by
amitz-nv
1 task
[#7588][feat] lock gpu clocks in test_perf.py to reliably detect perf regressions
#8099
opened Sep 30, 2025 by
MrGeva
1 task done
[https://nvbugs/5541494] [fix] Fix missing sm100f/103a kernels and add tests
#8098
opened Sep 30, 2025 by
VALLIS-NERIA
1 task
[None][feat] reuse cudagraph memory pool in normal forward flow
#8095
opened Sep 30, 2025 by
HuiGao-NV
1 task
[None][fix] Avoid unnecessary concat in attn_output_gate case.
#8094
opened Sep 30, 2025 by
yuxianq
1 task done
[#7588][fix] fixed the kv cache size parsing in test_perf.py AD backend
#8092
opened Sep 30, 2025 by
MrGeva
1 task done
[TRTLLM-8246][test] add multimodal kvcache+chunked_prefil cases in to QA test list
#8091
opened Sep 30, 2025 by
crazydemo
1 task done
[None][fix] Disable DeepGEMM for Qwen3 MoE Attention layers
#8087
opened Sep 30, 2025 by
achartier
1 task done
[None][feat] add RocketKV support (experimental)
#8086
opened Sep 30, 2025 by
lfr-0531
1 task
[None][fix] Add Lock to protect mReqeustToSession
#8085
opened Sep 30, 2025 by
chuangz0
1 task done
[None][fix] Enable FP8 ContextMLA on GB300
#8080
opened Sep 30, 2025 by
longlee0622
1 task done
[None][autodeploy] small refactors on attention matching
#8079
opened Sep 30, 2025 by
Fridah-nv
1 task done
[https://nvbugs/5549111][fix] Fix 2-model overlap scheduler accuracy on very long prompts
#8076
opened Sep 29, 2025 by
mikeiovine
1 task done
[None][fix] Fix Qwen3 FP8 per-tensor when requesting TRTLLM-GEN MoE backend
#8075
opened Sep 29, 2025 by
achartier
1 task done
[https://nvbugs/5549081][fix] Fix device id assignment for some vision models
#8070
opened Sep 29, 2025 by
chang-l
1 task done