vllm-project / vllm Public

Pull requests: vllm-project/vllm

1,283 Open 15,828 Closed

Pull requests list

[CI/Build][AMD] Skip marlin, machete, and hadacore tests since these require _C functions not defined for ROCm

#30109 opened Dec 5, 2025 by rasmith

Add config flag for VLLM_DISABLE_COMPILE_CACHE

#30108 opened Dec 5, 2025 by elizabetht

3 of 5 tasks

Add more docs for regex

#30106 opened Dec 5, 2025 by xu-song

1 of 5 tasks

[Kernel] Guard all NVFP4 op registrations in torch bindings

Labels: ready

#30103 opened Dec 5, 2025 by yeqcharlotte

5 tasks

Fix AWQ MoE marlin check issue in marlin_utils.py for AMD backend

#30102 opened Dec 5, 2025 by yuttian1

[Feature] Batch invariant: Lora

#30097 opened Dec 5, 2025 by quanliu1991

5 tasks

[Bugfix] Use PIECEWISE cudagraph with gpt-oss on Ampere

Labels: gpt-oss, nvidia

#30096 opened Dec 5, 2025 by bbrowning

lazy load vllm.utils.serial_utils import tensor2base64 to avoid break.

Labels: multi-modality, ready

#30094 opened Dec 4, 2025 by QiliangCui

5 tasks

fix#30092 Kimi-Linear model loading failure with missing indexer_rotary_emb

#30093 opened Dec 4, 2025 by baonudesifeizhai

5 tasks

Rename TensorRT Model Optimizer to Model Optimizer

Labels: documentation

#30091 opened Dec 4, 2025 by Edwardf0t1

5 tasks

fix: Force float16 dtype for GGUF models to fix incorrect output

#30090 opened Dec 4, 2025 by kitaekatt

[Model] Add support for transformer-based Ultravox v0.7 projector

#30089 opened Dec 4, 2025 by petersalas

5 tasks

[Tools] Standard Scripts For Model Bash

#30087 opened Dec 4, 2025 by robertgshaw2-redhat

5 tasks

Revert "[CI] Add Async Eplb nightly CI tests"

Labels: ci/build

#30086 opened Dec 4, 2025 by SageMoore

[CI/Build] Only install DeepEP kernels on CUDA version 12.8+

Labels: ci/build, nvidia

#30079 opened Dec 4, 2025 by NickCao

2 of 5 tasks

Feature/alpha moe integration

#30078 opened Dec 4, 2025 by vyom-hai

3 of 5 tasks

[CI] Have pre-commit comment on a PR if pre-commit was not used

Labels: ci/build

#30077 opened Dec 4, 2025 by hmellor

[DO NOT MERGE] dummy PR

Labels: ci/build

#30074 opened Dec 4, 2025 by junpuf

5 tasks

[docker] Only install flashinfer-jit-cache on CUDA 12.8+

Labels: ci/build, nvidia

#30073 opened Dec 4, 2025 by NickCao

3 of 5 tasks

[Core] Whisper enable FULL_DECODE_ONLY CudaGraph

Labels: nvidia, v1

#30072 opened Dec 4, 2025 by NickLucche

[Quantization] Support Quark int4-fp8 w4a8 for MoE

#30071 opened Dec 4, 2025 by BowenBao

[docker] Fix downloading sccache on aarch64 platform

Labels: ci/build

#30070 opened Dec 4, 2025 by NickCao

3 of 5 tasks

[CPU][Perf] Add fast vectorized exp impl from Arm Optimized Routines

#30068 opened Dec 4, 2025 by Elm8116

2 tasks

[Core] Add per-request speculative decoding statistics to RequestOutput

Labels: v1

#30067 opened Dec 4, 2025 by jellysnack

5 tasks

Mistral tool parser

Labels: frontend, tool-calling

#30063 opened Dec 4, 2025 by graelo

3 of 5 tasks
