Pull requests: SemiAnalysisAI/InferenceX
Pull requests list
[AMD/ROCM] ATOM support for new models: Kimi-K2.5 FP4, GLM-5 FP8, and MiniMax-M2.5
#963 opened Mar 27, 2026 by seungrokj
[DRAFT] [AMD] Update Minimax M2.5 MI325 image and adjust search space (labels: AMD, sweep-enabled)
#953 opened Mar 27, 2026 by benenzhu
[WIP] B200 Minimax FP8 vllm upgrade (labels: NVIDIA, sweep-enabled)
#947 opened Mar 26, 2026 by kedarpotdar-nv
[WIP] Add Qwen3.5 h200 MTP (labels: NVIDIA, sweep-enabled)
#921 opened Mar 20, 2026 by hshrivastava-droid
Separate eval-only workflow and change to 8k1k (labels: sweep-enabled)
#911 opened Mar 15, 2026 by Oseltamivir
fix: multi-turn benchmark hangs after all clients finish
#908 opened Mar 13, 2026 by lishicheng1996-nv
3 of 4 tasks
Add Kimi-K2.5 INT4 vLLM v0.16.0 benchmark for MI300X (labels: AMD, sweep-enabled)
#860 opened Mar 3, 2026 by functionstackx
Add MiniMax M2.5 MXFP4 benchmark for MI355x vLLM v0.17.1 (TP=2,4) (labels: AMD, sweep-enabled)
#827 opened Mar 1, 2026 by functionstackx
[NV] Qwen3.5 B200 SGLang FP4 configs (labels: NVIDIA, sweep-enabled)
#820 opened Feb 27, 2026 by kedarpotdar-nv
Performance Improvements for MI300X with GEMM and FP8 Enhancements
#811 opened Feb 26, 2026 by chunfangamd