vllm-project / vllm Public

Pull requests: vllm-project/vllm

1,281 Open 15,692 Closed

Pull requests list

[ROCm][CI][Bugfix] Disable Flash/MemEfficient SDP on ROCm to avoid HF Transformers accuracy issues ci/build multi-modality

Related to multi-modality (#4194)

rocm

Related to AMD ROCm

#29909 opened Dec 2, 2025 by AndreasKaratzas

[BUGFIX] llama_4_scaling wrongly passed to DeepseekAttention deepseek

Related to DeepSeek models

llama

Related to Llama models

ready

ONLY add when PR is ready to merge/full CI is needed

#29908 opened Dec 2, 2025 by juliendenize

5 tasks

[CI/Build] Avoid duplicate empty inputs test for common multimodal generation tests multi-modality

Related to multi-modality (#4194)

#29907 opened Dec 2, 2025 by Isotr0py • Draft

3 of 5 tasks

[Bugfix] Free requests to avoid a KV Cache exhaustion during VLLM_NIXL_ABORT_REQUEST_TIMEOUT kv-connector v1

#29906 opened Dec 2, 2025 by hasB4K

Gigachat 3 tool parser and tests documentation

Improvements or additions to documentation

frontend tool-calling

#29905 opened Dec 2, 2025 by ajpqs

4 of 5 tasks

[Logs] Optimize startup logs 4 nvidia v1

#29903 opened Dec 2, 2025 by yewentao256

SigLIP example add chat_template documentation

Improvements or additions to documentation

needs-rebase

#29902 opened Dec 2, 2025 by piood

[Kernel][Quantization][MoE] add marlin kernel support for turing (sm75) ci/build

#29901 opened Dec 2, 2025 by jinzhen-lin

Add DGX Spark compatibility to Dockerfile ci/build

#29900 opened Dec 2, 2025 by ericcurtin

[Core] Fix standalone runs of test_reset_prefix_cache_e2e v1

#29899 opened Dec 2, 2025 by markmc

[Compile] Fix torch warning TensorFloat32 tensor cores for float32 matrix multiplication available but not enabled ready

ONLY add when PR is ready to merge/full CI is needed

#29897 opened Dec 2, 2025 by yewentao256

feat(model): Add BitsAndBytes quantization support for Qwen3-Omni-MoE qwen

Related to Qwen models

ready

ONLY add when PR is ready to merge/full CI is needed

#29896 opened Dec 2, 2025 by Navanit-git

5 tasks

[DOC] Add Arm to list of compute resouces providers documentation

Improvements or additions to documentation

#29894 opened Dec 2, 2025 by fadara01

2 tasks

[BugFix] Fix assert in build_for_cudagraph_capture nvidia ready

ONLY add when PR is ready to merge/full CI is needed

#29893 opened Dec 2, 2025 by LucasWilkinson

v0.12.0

[Bugfix] Fix FP8 MoE LoRA

#29890 opened Dec 2, 2025 by jeejeelee

5 tasks

[Bugfix] respect user-defined flash attention version in ViT attentions

#29889 opened Dec 2, 2025 by cjackal

3 tasks done

[ROCm][Perf] Enable shuffle kv cache layout and assembly paged attention kernel for AiterFlashAttentionBackend rocm

Related to AMD ROCm

#29887 opened Dec 2, 2025 by ganyi1996ppo

5 tasks

optimize topk_topp_sampling. v1

#29886 opened Dec 2, 2025 by RanTao123

4 tasks

DRAFT: Mistral large 3 Extended Blackwell Support ci/build nvidia performance

Performance-related issues

#29884 opened Dec 2, 2025 by hypdeb • Draft

[bugfix] fix MiniMaxM2ReasoningParser streaming output not separating reasoning_content.

#29882 opened Dec 2, 2025 by JaviS-Rei

Improve and supplement LoRA tuning documentation performance

Performance-related issues

#29880 opened Dec 2, 2025 by caozuoba

5 tasks

[BugFix] fix imgs_pos in hunyuan_vl

#29879 opened Dec 2, 2025 by wkcn • Draft

5 tasks

Scheduler: Prevent async loads from exploding the KV Cache v1

#29877 opened Dec 2, 2025 by orozery

Support Deepseekv32 chat deepseek

Related to DeepSeek models

frontend

#29876 opened Dec 2, 2025 by yjc9696 • Draft

5 tasks

add toolparser for deepseek v3.2 reusing qwen xml parser deepseek

Related to DeepSeek models

frontend qwen

Related to Qwen models

tool-calling

#29874 opened Dec 2, 2025 by wenmengzhou
