We hold a technical meeting every Friday, 10:00-12:00 (UTC+8). The meeting is currently by invitation only, and any vllm-ascend-related topic is welcome. If you have anything you want to discuss, feel free to submit your topic under this issue.
2026.4.10
Chair: @realliujiaxu
Topic:
Support DeepEp-Ascend for MoE - @zuje123
2026.4.3 (cancel)
2026.3.27
Chair: @paulyu12
Topic:
Add Yuanrong backend support to KV Pool - @yangsonglin13
The design looks non-invasive to vLLM Ascend. Please focus on reliability and ease of use.
Add score encoder cache manager - @hotTea123
This is a new embedding cache manager. The advice is to push it to the vLLM repo.
2026.3.20
Chair: @wangxiyuan
Topic:
Support SLA-Aware DVFS - @codeofdave
Support prefill and decode Dynamic CP - @wangxiaochao6
Model tutorials template review & where the template should be located in the repository - @evakang777
Support a new load format: rfork - @GoMarck
2026.3.13
Chair: JUN HAN
Topic:
vLLM and vLLM Ascend Update - @wangxiyuan @zzzzwwjj
vLLM Omni Update - @hsliuustc0106
Support Chunked Pipeline Parallelism (CPP) - @gjc0824
Support SLA-Aware DVFS - @codeofdave
Support prefill and decode Dynamic CP - @wangxiaochao6
2026.3.06 (cancel)
2026.2.27
Chair: JinQi Yu
Topic:
Design Review: Dual-Zone Update and Active Management Scheme for Prefix KV Cache - @xinrunxue
SHMEM Integration: Introduction to CANN-SHMEM and Feasibility Analysis of Operator Integration - @JoyceAby
vLLM Ascend update - @wangxiyuan
Open Discussion: is there any way (lightweight test report, performance dashboard) to know whether new models such as kimi k2.5/Qwen3.5/GLM5.0/... are well supported in vllm-ascend? - @7GrandPa
2026.1.23
Chair: @realliujiaxu
Topic:
How to modify the point-to-point adaptation logic of the vLLM-Ascend quantization loading logic - @Feng-xiaosuo
vLLM Ascend Update - Jun Han
Lightweight Graph Mode: npugraph_ex - @panchao-hub
vLLM Update - @wangxiyuan
2025
2025.12.26
Chair: @paulyu12
Topic:
vLLM Weekly update improvement - @wangxiyuan
model improvement sync - Jun Han
Flash Comm v1 for VL models - @realliujiaxu
KVComp long-sequence sparsification algorithm: GQA all-in-device-memory solution
2025.12.19
Chair: @jianzs
Topic:
EPLB: Ascend vs. Community - @shenchuxiaofugui
[interface] User interface for flashcomm2 and layer_shard features - @zzhx1
vLLM-Ascend support HMA and cross layer KVCache - @zzzzwwjj
2025.12.12 (cancel)
2025.12.05
Chair: @wangxiyuan
Topic:
vLLM Community weekly Update - @david6666666
https://docs.google.com/presentation/d/1_qBc9BhK4baQ9jeGraI1JpQiHGcjMaVf2KW7gjjkHFs/edit?slide=id.g399750550f0_1_0#slide=id.g399750550f0_1_0
vLLM Ascend progress - Jun Han
Async scheduler - @Ronald1995
FA refactor - @weijinqian0
[RFC]: Refactor Attention module #4629
Custom ops - @zzzzwwjj
2025.11.28 (cancel)
2025.11.21
Chair: Jun Han
Topic:
vLLM Community weekly Update - @MengqingCao
vllm-ascend support sequence parallel for DS v3.2 model - @AlvisGong
vllm-ascend VL modeling files removing - @shen-shanshan
MindSpore model support - @wangtiance
Add any topic below
2025.11.14
Chair: JinQi Yu
Topic:
vLLM Community weekly Update - @david6666666
vllm-ascend support dump data in eager mode - @Tjh-UKN
KV Pool redundancy elimination in vllm-ascend - @baxingpiaochong
vllm-ascend support xlite graph wrapper - @lulina
2025.11.07
Chair: @realliujiaxu
Topic:
vLLM Community weekly Update - @wangxiyuan
vllm-ascend mla operator support all gather the result of W8A8_matmul(hiddenstate and wdq+wqdkv) - @chenlongxiao
torch binding issues when adding custom ops into vllm-ascend - @ChenxiQ
magicmtp - QingSen Han
2025.10.31
Chair: @paulyu12
Topic:
vLLM Community weekly Update - @MengqingCao
vLLM important PRs from last week
Q4: triton for Ascend
token level re-inference - Li ShiLin
Exception handling and recovery: kvcache, weights, activation value HBM UCE failures, network packet loss... Intrusive to ModelRunner.
vLLM Ascend Model Support List guide revisit - Shen Xinjie
Apply to Document. For each model: introduction, main applied scenario for the model, feature support, link to model weight, link to deploy tutorial.
vLLM Ascend Q4 RoadMap - Han Jun
Q3: mooncake pooling, chunk prefill, aclgraph full graph, long seq, multi-stream parallel, dynamic eplb, w4a8, w4a4, kvcache resharding, kvcache int8 quantization
Q4: system-level schedule optimization; hostbound between decoding process: async scheduling...; RAS, DFX; multimodal; triton for Ascend; inductor; auto adaptor for short-long seq; model support: tier 2 P/D quantization
2025.10.24
Chair: @ApsarasX
Topic:
vLLM Community weekly Update - @david6666666 https://docs.google.com/presentation/d/1_qBc9BhK4baQ9jeGraI1JpQiHGcjMaVf2KW7gjjkHFs/edit?slide=id.p#slide=id.p
vLLM Ascend Model Support List guide - @evakang777, Shen Xinjie
The customer hopes that the model support list in the vllm-ascend community can provide some specifications and features. Together with PAE, we have sorted out the table headers to be added to the model support list based on customer requirements. The vllm-ascend model support list is expected to show this customer-facing information.
Qwen3-Next support for the ACL Graph feature - @xueliangyang-oeuler
vllm-ascend PD Separation and TP Asymmetry Solution - @liziyu179, Zhou Xuerong
For the PD separation layerwise push + TP asymmetry solution, adding the DP feature on top has shown instability in stress-testing scenarios. We have proposed a new solution centered on D-node scheduling.