Pull requests: NVIDIA/TensorRT-LLM
[None][fix] Replace assertions with warnings for unsupported logits/logprobs in speculative sampler
Community want to contribute
PRs initiated from Community
#12547
opened Mar 25, 2026 by
yifjiang
3 tasks
[None][feat] Add production-level Prometheus metrics (iteration stats, config info, token counters, phase histograms)
Community want to contribute
PRs initiated from Community
#12545
opened Mar 25, 2026 by
nvyutwu
5 tasks
[None][feat] Enable NVFP4 KV cache support in trtllm-gen attention
#12544
opened Mar 25, 2026 by
yihwang-nv
1 task
[TRTLLMINF-37][feat] Add CI agent failure analysis to L0_MergeRequest…
#12543
opened Mar 25, 2026 by
dpitman-nvda
1 task done
[https://nvbugs/6018172][fix] Add synchronization calls to warmup when host cache offloading is active
#12539
opened Mar 25, 2026 by
longlee0622
Draft
1 task
[TRTLLM-11318][feat] move VisualGen APIs to a separate dir
VisualGen
#12538
opened Mar 25, 2026 by
zhenhuaw-me
1 task done
[None][feat] Add Mamba2 MTP SSM cache CUDA kernel for tree-based speculative decoding
#12537
opened Mar 25, 2026 by
JadoTu
1 task done
[None][test] Enhance performance tests by adding GPU availability check in test_perf.py
#12535
opened Mar 25, 2026 by
yufeiwu-nv
1 task done
[None][doc] Add MoE developer guide for fused_moe module
#12534
opened Mar 25, 2026 by
xxi-nv
2 tasks done
[https://nvbugs/5989920][test] Unwaive DeepSeekV3 nvfp4 mtp3_fp8kv_chunked test
#12533
opened Mar 25, 2026 by
yizhang-nv
1 task done
[None][docs] Add docstrings to cpp_custom_ops, model_config, and llm_args
#12532
opened Mar 25, 2026 by
longcheng-nv
1 task done
[TRTLLM-10061][feat] Add support of linear attention state for C++ KV cache manager
#12531
opened Mar 25, 2026 by
VALLIS-NERIA
2 tasks done
[https://nvbugs/5879577][fix] Fix KeyError in DeepSeekV3Lite FP8 MTP weight loading
#12530
opened Mar 25, 2026 by
sunnyqgg
3 tasks done
[https://nvbugs/6007967][fix] fix disagg pp hang issue
#12528
opened Mar 25, 2026 by
bo-nv
1 task
[https://nvbugs/6007197][fix] Adjust RocketKV test threshold
#12527
opened Mar 25, 2026 by
heyuhhh
1 task done
[TRTLLM-11657][feat] Conversation affinity disagg router
#12526
opened Mar 25, 2026 by
reasonsolo
Draft
1 task
[None][feat] Disable shared paged index in flashinfer trtllm-gen fmha kernel and unify kv cache buffer calculation with thop.attention
#12525
opened Mar 25, 2026 by
yihwang-nv
1 task done
[None][doc] Fix duplicate words in comments
#12524
opened Mar 25, 2026 by
YihuiLu512
1 task done
[https://nvbugs/6011517][fix] Fix autotuner OOM for trtllmGen MoE runners at large context length
#12523
opened Mar 25, 2026 by
hyukn
1 task done
Adds a LMCache v1 KV connector example (llm_lmcache_connector.py) that
Community want to contribute
PRs initiated from Community
#12522
opened Mar 25, 2026 by
feixiangpeng
1 task
[None][fix] Fix _waiting_requests to use compute tokens with KV cache reuse
#12521
opened Mar 25, 2026 by
lancelly
[None][test] Add different input-output of eagle cases on Spark
#12520
opened Mar 25, 2026 by
JennyLiu-nv
1 task done
[None][feat] Add Blackwell MLA backend selection
#12519
opened Mar 25, 2026 by
bmarimuthu-nv
Draft
1 task
[#11992][fix] Support include_stop_token_in_output in gRPC request manager
Community want to contribute
PRs initiated from Community
#12517
opened Mar 24, 2026 by
CatherineSue
3 tasks done
[https://nvbugs/6015329][fix] Use model-level warmup cache key for visual gen pipelines
VisualGen
#12516
opened Mar 24, 2026 by
karljang
2 tasks done