Pull requests: vllm-project/vllm
Pull requests list
[For Testing][Do not merge] Re-enable hybrid-kv cache for llama4 by default (labels: llama)
#23859 opened Aug 28, 2025 by LucasWilkinson • Draft
5 tasks
[DO NOT MERGE] Experiments related to MoE kernels
#26860 opened Oct 15, 2025 by zhuohan123 • Draft
5 tasks
[WIP][torch.compile] Add Triton-distributed GEMM+AllReduce fusion compile pass
#27281 opened Oct 21, 2025 by jasonlizhengjian • Draft
5 tasks
Remove all operator overrides for batch invariance
#28149 opened Nov 5, 2025 by PaulZhang12 • Draft
5 tasks
[DBO] Compile non-dbo cudagraphs for shapes that are close to dbo_decode_token_threshold (labels: nvidia, v1)
#27771 opened Oct 29, 2025 by SageMoore
[wip] Fix prime rl test (labels: ci/build, ready)
#28062 opened Nov 4, 2025 by rzabarazesh
5 tasks
Fix .coveragerc to use absolute source path
#25516 opened Sep 23, 2025 by rzabarazesh • Draft
5 tasks
Refactor/understanding prepare inputs padded (labels: speculative-decoding, v1)
#25758 opened Sep 26, 2025 by tomasruizt • Draft
[Rocm] [quickreduce] optimize quickduce for mi 350_355 (labels: rocm)
#25239 opened Sep 19, 2025 by haoyangli-amd • Draft