Pull requests: mlc-ai/mlc-llm

Pull requests list

Add GLM-4.5-Air MoE support
#3388 opened Nov 29, 2025 by otarkhan updated Nov 29, 2025
Qwen3VL
#3385 opened Nov 26, 2025 by yashagar-cmu Draft updated Nov 26, 2025
Add MRoPE operator stack and seed Qwen2.5‑VL integration
#3377 opened Nov 11, 2025 by MagellaX updated Nov 21, 2025
[Model] Qwen-2-VL Support
#3125 opened Feb 10, 2025 by nihalgeorge01 Draft updated Nov 19, 2025
[Python][Android] Fix FFI compatibility issues for Android build
#3368 opened Oct 24, 2025 by akaashrp updated Oct 24, 2025
[Testing] Add unit tests for Gemma3 model
#3362 opened Oct 14, 2025 by gururajkosuru updated Oct 14, 2025
checkpoint
#3357 opened Oct 4, 2025 by atebites-hub updated Oct 4, 2025
Add sequence padding to BeginForward
#3314 opened Aug 25, 2025 by joshua-j-hong updated Sep 21, 2025
[Model] Updated model preset with more models
#3313 opened Aug 25, 2025 by harrywhoo updated Sep 8, 2025
NUMA-aware tensor parallelism for CPU inference
#3320 opened Aug 30, 2025 by MagellaX updated Sep 3, 2025
Add API Key Authentication For openai_entrypoints
#3297 opened Aug 2, 2025 by rankaiyx updated Aug 27, 2025
Add ArceeForCausalLM support
#3294 opened Jul 27, 2025 by bartowski1182 updated Aug 15, 2025
Fix supported platforms
#3298 opened Aug 3, 2025 by zxcat updated Aug 3, 2025
Fix: Resolve pylint import errors and other warnings
#3265 opened Jun 27, 2025 by Mirza-Samad-Ahmed-Baig updated Jun 27, 2025
Add Comprehensive QAT Training Framework for MLC-LLM
#3258 opened Jun 23, 2025 by alohachen updated Jun 23, 2025
7 of 9 tasks
[Serving] PagedKVCache Quantization
#2663 opened Jul 16, 2024 by davidpissarra updated May 20, 2025
Perf: load weights, create KV cache, initialize tokenizer in parallel
#3215 opened Apr 27, 2025 by Bekaboo updated Apr 27, 2025
[Serving] Support tool function calls under strict format constraints
#3190 opened Mar 26, 2025 by Irfnfnkemed updated Apr 24, 2025
[Refactor] PagedKVCache spec for MLC-LLM
#3203 opened Apr 14, 2025 by annanyapr updated Apr 14, 2025
Refactored random.h to have PhiloxRandomGenerator
#3181 opened Mar 18, 2025 by annanyapr updated Apr 6, 2025
[CPP_CLI] MLC Cli App over JSONEngine interface
#3114 opened Jan 30, 2025 by srkreddy1238 updated Jan 31, 2025
[Bench] Add support for multiple backend
#3037 opened Nov 20, 2024 by cyx-6 Draft updated Nov 21, 2024
[SERVE][CPP][Android] add native executable program to benchmark models
#2987 opened Oct 18, 2024 by pfk-beta updated Oct 18, 2024
[Model] Add use_qk_norm option for Cohere model
#2877 opened Sep 2, 2024 by tlopex updated Oct 9, 2024