[Community] Weekly Meeting Agenda #3642

@wangxiyuan

We hold a technical meeting every Friday, 10:00 am - 12:00 pm (UTC+8). The meeting is currently by invitation only, and any vllm-ascend-related topic is welcome. If you have anything you want to discuss, feel free to submit your topic under this issue.

2026.4.10

Chair: @realliujiaxu
Topic:

  1. Support DeepEp-Ascend for Moe - @zuje123

2026.4.3 (cancel)

2026.3.27

Chair: @paulyu12
Topic:

  1. Add Yuanrong backend support to KV Pool - @yangsonglin13
    • The design looks non-invasive to vLLM Ascend.
    • Please focus on reliability and ease of use.
  2. Add score encoder cache manager - @hotTea123
    • It's a new embedding cache manager.
    • Advised to push it to the vLLM repo.

2026.3.20

Chair: @wangxiyuan
Topic:

  1. Support SLA-Aware DVFS - @codeofdave
  2. Support prefill and decode Dynamic CP - @wangxiaochao6
  3. Model tutorials template review & where the template should be located in the repository - @evakang777
  4. Support a new load format: rfork - @GoMarck

2026.3.13

Chair: Jun Han
Topic:

  1. vLLM and vLLM Ascend Update - @wangxiyuan @zzzzwwjj
  2. vLLM Omni Update - @hsliuustc0106
  3. Support Chunked Pipeline Parallelism (CPP) - @gjc0824
  4. Support SLA-Aware DVFS - @codeofdave
  5. Support prefill and decode Dynamic CP - @wangxiaochao6

2026.3.06 (cancel)

2026.2.27

Chair: JinQi Yu
Topic:

  1. Design Review: Dual-Zone Update and Active Management Scheme for Prefix KV Cache - @xinrunxue
  2. SHMEM Integration: Introduction to CANN-SHMEM and Feasibility Analysis of Operator Integration - @JoyceAby
  3. vLLM Ascend update - @wangxiyuan
  4. Open Discussion: Is there any way (e.g., a lightweight test report or performance dashboard) to know whether new models such as kimi k2.5/Qwen3.5/GLM5.0/... are well supported in vllm-ascend? - @7GrandPa

2026.1.23

Chair: @realliujiaxu
Topic:

  1. How to modify the point-to-point adaptation logic in vLLM-Ascend quantization loading - @Feng-xiaosuo
  2. vLLM Ascend Update - Jun Han
  3. Lightweight Graph Mode: npugraph_ex - @panchao-hub
  4. vLLM Update - @wangxiyuan

2025

2025.12.26

Chair: @paulyu12
Topic:

  1. vLLM weekly update improvement - @wangxiyuan
  2. Model improvement sync - Jun Han
  3. Flash Comm v1 for VL models - @realliujiaxu
  4. KVComp long-sequence sparsification algorithm: GQA full-device-memory scheme

2025.12.19

Chair: @jianzs
Topic:

  1. EPLB: Ascend vs. Community - @shenchuxiaofugui
  2. [interface] User interface for flashcomm2 and layer_shard features - @zzhx1
  3. vLLM-Ascend support HMA and cross layer KVCache - @zzzzwwjj

2025.12.12 (cancel)

2025.12.05

Chair: @wangxiyuan
Topic:

  1. vLLM Community weekly Update - @david6666666
    https://docs.google.com/presentation/d/1_qBc9BhK4baQ9jeGraI1JpQiHGcjMaVf2KW7gjjkHFs/edit?slide=id.g399750550f0_1_0#slide=id.g399750550f0_1_0
  2. vLLM Ascend progress - Jun Han
  3. Async scheduler - @Ronald1995
  4. FA refactor - @weijinqian0
    [RFC]: Refactor Attention module #4629
  5. Custom ops - @zzzzwwjj

2025.11.28 (cancel)

2025.11.21

Chair: Jun Han
Topic:

  1. vLLM Community weekly Update - @MengqingCao
  2. vllm-ascend support sequence parallel for DS v3.2 model - @AlvisGong
  3. vllm-ascend VL modeling files removing - @shen-shanshan
  4. MindSpore model support - @wangtiance
  5. Add any topic below

2025.11.14

Chair: JinQi Yu
Topic:

  1. vLLM Community weekly Update - @david6666666
  2. vllm-ascend support dump data in eager mode - @Tjh-UKN
  3. vllm-ascend KV Pool redundancy elimination - @baxingpiaochong
  4. vllm-ascend support xlite graph wrapper - @lulina

2025.11.07

Chair: @realliujiaxu
Topic:

  1. vLLM Community weekly Update - @wangxiyuan
  2. vllm-ascend MLA operator: support all-gather of the W8A8_matmul result (hiddenstate and wdq+wqdkv) - @chenlongxiao
  3. torch binding issues when adding custom ops into vllm-ascend - @ChenxiQ
  4. magicmtp - QingSen Han

2025.10.31

Chair: @paulyu12
Topic:

  1. vLLM Community weekly Update - @MengqingCao
    • Important vLLM PRs from the last week
    • Q4: triton for Ascend
  2. Token-level re-inference - Li ShiLin
    • Exception handling and recovery: HBM UCE failures in kvcache, weights, and activation values; network packet loss...
    • Intrusive to ModelRunner.
  3. vLLM Ascend Model Support List guide revisit - Shen Xinjie
    • Apply to the documentation. For each model:
      • introduction
      • main applied scenario for the model
      • feature support
      • link to model weight
      • link to deploy tutorial
  4. vLLM Ascend Q4 RoadMap - Han Jun
    • Q3
      • mooncake pooling
      • chunk prefill
      • aclgraph full graph
      • long seq
      • multi-stream parallel
      • dynamic eplb
      • w4a8 w4a4
      • kvcache resharding
      • kvcache int8 quantization
    • Q4
      • system-level schedule optimization
      • hostbound between decoding process: async scheduling...
      • RAS, DFX
      • multimodal
      • triton for Ascend
      • inductor
      • auto adaptor for short-long seq
      • model support: tier 2
      • P/D quantization

2025.10.24

Chair: @ApsarasX
Topic:

  1. vLLM Community weekly Update - @david6666666
    https://docs.google.com/presentation/d/1_qBc9BhK4baQ9jeGraI1JpQiHGcjMaVf2KW7gjjkHFs/edit?slide=id.p#slide=id.p

  2. vLLM Ascend Model Support List guide - @evakang777, Shen Xinjie
    Customers hope that the model support list in the vllm-ascend community can provide specifications and feature details. Together with PAE, we have sorted out the table headers to add to the model support list based on customer requirements. The goal is for the vllm-ascend model support list to display this customer-facing information.

  3. Qwen3-Next support the ACL Graph feature. - @xueliangyang-oeuler

  4. vllm-ascend PD Separation and TP Asymmetry Solution - @liziyu179, Zhou Xuerong

    The PD separation layerwise push + TP asymmetry solution has shown instability in stress-testing scenarios when DP is layered on top of it. We have proposed a new solution centered on D-node scheduling.
