Pull requests: vllm-project/vllm
[CI/Build][AMD] Skip marlin, machete, and hadacore tests since these require _C functions not defined for ROCm
#30109 opened Dec 5, 2025 by rasmith
Add config flag for VLLM_DISABLE_COMPILE_CACHE
#30108 opened Dec 5, 2025 by elizabetht
3 of 5 tasks
[Kernel] Guard all NVFP4 op registrations in torch bindings ready
#30103 opened Dec 5, 2025 by yeqcharlotte
5 tasks
Fix AWQ MoE marlin check issue in marlin_utils.py for AMD backend
#30102 opened Dec 5, 2025 by yuttian1
[Bugfix] Use PIECEWISE cudagraph with gpt-oss on Ampere gpt-oss nvidia
#30096 opened Dec 5, 2025 by bbrowning
lazy load vllm.utils.serial_utils import tensor2base64 to avoid break. multi-modality ready
#30094 opened Dec 4, 2025 by QiliangCui
5 tasks
fix#30092 Kimi-Linear model loading failure with missing indexer_rotary_emb
#30093 opened Dec 4, 2025 by baonudesifeizhai
5 tasks
Rename TensorRT Model Optimizer to Model Optimizer documentation
#30091 opened Dec 4, 2025 by Edwardf0t1
5 tasks
fix: Force float16 dtype for GGUF models to fix incorrect output
#30090 opened Dec 4, 2025 by kitaekatt
[Model] Add support for transformer-based Ultravox v0.7 projector
#30089 opened Dec 4, 2025 by petersalas
5 tasks
[Tools] Standard Scripts For Model Bash
#30087 opened Dec 4, 2025 by robertgshaw2-redhat
5 tasks
Revert "[CI] Add Async Eplb nightly CI tests" ci/build
#30086 opened Dec 4, 2025 by SageMoore
[CI/Build] Only install DeepEP kernels on CUDA version 12.8+ ci/build nvidia
#30079 opened Dec 4, 2025 by NickCao
2 of 5 tasks
[CI] Have pre-commit comment on a PR if pre-commit was not used ci/build
#30077 opened Dec 4, 2025 by hmellor
[docker] Only install flashinfer-jit-cache on CUDA 12.8+ ci/build nvidia
#30073 opened Dec 4, 2025 by NickCao
3 of 5 tasks
[Core] Whisper enable FULL_DECODE_ONLY CudaGraph nvidia v1
#30072 opened Dec 4, 2025 by NickLucche
[docker] Fix downloading sccache on aarch64 platform ci/build
#30070 opened Dec 4, 2025 by NickCao
3 of 5 tasks
[CPU][Perf] Add fast vectorized exp impl from Arm Optimized Routines
#30068 opened Dec 4, 2025 by Elm8116
2 tasks
[Core] Add per-request speculative decoding statistics to RequestOutput v1
#30067 opened Dec 4, 2025 by jellysnack
5 tasks