Supported Models and Datasets
The tables below list the models integrated with ms-swift. The columns are:
Model ID: The model's ID on ModelScope
Model Type: Type of the model
Default Template: Default chat template
Requires: Additional dependencies required to use the model ("-" if none)
Support Megatron: Whether the model is supported with Megatron (✔) or not (✘)
Tags: Tags associated with the model ("-" if none)
HF Model ID: The model's ID on Hugging Face
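
The Requires column uses pip-style constraints such as transformers>=4.37, transformers>=4.54,<4.56, or a bare package name such as decord. As a rough illustration of how such a cell could be checked against the local environment, here is a minimal standard-library sketch. The helper names (parse_requires, satisfies, check_requires) are hypothetical and not part of ms-swift, and the version comparison is deliberately simplified: it looks only at the leading numeric parts, so a pre-release like 4.55.0.dev0 is treated as 4.55.0. For exact PEP 440 semantics, a dedicated library such as packaging is probably a better fit.

```python
import re
from importlib import metadata

# Comparison operators that appear in the Requires column.
_OPS = {
    ">=": lambda a, b: a >= b,
    "<=": lambda a, b: a <= b,
    "==": lambda a, b: a == b,
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
}

# One comma-separated piece: optional package name, optional "<op><version>".
_PIECE = re.compile(
    r"^(?P<name>[A-Za-z][\w\-]*)?\s*(?:(?P<op>>=|<=|==|>|<)\s*(?P<ver>\d[\w.]*))?$"
)


def version_tuple(text):
    """Keep only the leading numeric parts: '4.55.0.dev0' -> (4, 55, 0)."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def parse_requires(cell):
    """Turn a Requires cell into {package: [(operator, version), ...]}."""
    reqs = {}
    current = None
    for piece in cell.split(","):
        piece = piece.strip()
        if piece in ("", "-"):  # "-" means no extra dependencies
            continue
        match = _PIECE.match(piece)
        if match is None:
            raise ValueError(f"unrecognised requirement: {piece!r}")
        if match.group("name"):
            current = match.group("name")
            reqs.setdefault(current, [])
        if current is None:
            raise ValueError(f"constraint without a package: {piece!r}")
        if match.group("op"):
            # A piece like "<4.56" extends the previous package's constraints.
            reqs[current].append((match.group("op"), match.group("ver")))
    return reqs


def satisfies(installed, constraints):
    """Check an installed version string against every (operator, version)."""
    have = version_tuple(installed)
    for op, wanted in constraints:
        want = version_tuple(wanted)
        width = max(len(have), len(want))
        a = have + (0,) * (width - len(have))  # pad so 4.37 == 4.37.0
        b = want + (0,) * (width - len(want))
        if not _OPS[op](a, b):
            return False
    return True


def check_requires(cell):
    """Map each required package to True/False for the current environment."""
    result = {}
    for name, constraints in parse_requires(cell).items():
        try:
            result[name] = satisfies(metadata.version(name), constraints)
        except metadata.PackageNotFoundError:
            result[name] = False
    return result


if __name__ == "__main__":
    print(check_requires("transformers>=4.45, qwen_vl_utils>=0.0.6, decord"))
```

Running the script prints one entry per package in the cell, with False for anything that is missing or outside the stated range.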
Large Language Models
| Model ID | Model Type | Default Template | Requires | Support Megatron | Tags | HF Model ID |
|---|---|---|---|---|---|---|
qwen | qwen | - | ✘ | - | ||
qwen | qwen | - | ✘ | - | ||
qwen | qwen | - | ✘ | - | ||
qwen | qwen | - | ✘ | - | ||
qwen | qwen | - | ✘ | - | ||
qwen | qwen | - | ✘ | - | ||
qwen | qwen | - | ✘ | - | ||
qwen | qwen | - | ✘ | - | ||
qwen | qwen | - | ✘ | - | ||
qwen | qwen | - | ✘ | - | ||
qwen | qwen | - | ✘ | - | ||
qwen | qwen | - | ✘ | - | ||
qwen | qwen | - | ✘ | - | ||
qwen | qwen | - | ✘ | - | ||
qwen | qwen | - | ✘ | - | ||
qwen | qwen | - | ✘ | - | ||
qwen | qwen | - | ✘ | financial | ||
qwen | qwen | - | ✘ | financial | - | |
qwen | qwen | - | ✘ | financial | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | coding | ||
qwen2 | qwen | transformers>=4.37 | ✔ | coding | ||
qwen2 | qwen | transformers>=4.37 | ✘ | coding | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✘ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | math | ||
qwen2 | qwen | transformers>=4.37 | ✔ | math | ||
qwen2 | qwen | transformers>=4.37 | ✔ | math | ||
qwen2 | qwen | transformers>=4.37 | ✔ | math | ||
qwen2 | qwen | transformers>=4.37 | ✔ | math | ||
qwen2 | qwen | transformers>=4.37 | ✔ | math | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2 | qwen | transformers>=4.37 | ✔ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | - | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✘ | coding | ||
qwen2_5 | qwen2_5 | transformers>=4.37 | ✔ | - | ||
qwen2_5_math | qwen2_5_math | transformers>=4.37 | ✔ | math | ||
qwen2_5_math | qwen2_5_math | transformers>=4.37 | ✔ | math | ||
qwen2_5_math | qwen2_5_math | transformers>=4.37 | ✔ | math | ||
qwen2_5_math | qwen2_5_math | transformers>=4.37 | ✔ | math | ||
qwen2_5_math | qwen2_5_math | transformers>=4.37 | ✔ | math | ||
qwen2_5_math | qwen2_5_math | transformers>=4.37 | ✔ | math | ||
qwen2_moe | qwen | transformers>=4.40 | ✔ | - | ||
qwen2_moe | qwen | transformers>=4.40 | ✔ | - | ||
qwen2_moe | qwen | transformers>=4.40 | ✘ | - | ||
qwen2_moe | qwen | transformers>=4.40 | ✔ | - | ||
qwen2_moe | qwen | transformers>=4.40 | ✔ | - | ||
qwen2_moe | qwen | transformers>=4.40 | ✘ | - | ||
qwq_preview | qwq_preview | transformers>=4.37 | ✔ | - | ||
qwq | qwq | transformers>=4.37 | ✔ | - | ||
qwq | qwq | transformers>=4.37 | ✘ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✔ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✔ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✔ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✔ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✔ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✔ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✔ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✔ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✔ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✔ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✔ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✘ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✘ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✘ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✘ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✘ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✘ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✘ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✘ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✘ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✘ | - | ||
qwen3 | qwen3 | transformers>=4.51 | ✘ | - | - | |
qwen3_guard | qwen3_guard | transformers>=4.51 | ✘ | - | ||
qwen3_guard | qwen3_guard | transformers>=4.51 | ✘ | - | ||
qwen3_guard | qwen3_guard | transformers>=4.51 | ✘ | - | ||
qwen3_thinking | qwen3_thinking | transformers>=4.51 | ✔ | - | ||
qwen3_thinking | qwen3_thinking | transformers>=4.51 | ✘ | - | ||
qwen3_nothinking | qwen3_nothinking | transformers>=4.51 | ✔ | - | ||
qwen3_nothinking | qwen3_nothinking | transformers>=4.51 | ✘ | - | ||
qwen3_nothinking | qwen3_nothinking | transformers>=4.51 | ✔ | - | ||
qwen3_nothinking | qwen3_nothinking | transformers>=4.51 | ✘ | - | ||
qwen3_nothinking | qwen3_nothinking | transformers>=4.51 | ✘ | - | - | |
qwen3_nothinking | qwen3_nothinking | transformers>=4.51 | ✔ | - | ||
qwen3_nothinking | qwen3_nothinking | transformers>=4.51 | ✘ | - | ||
qwen3_coder | qwen3_coder | transformers>=4.51 | ✔ | coding | ||
qwen3_coder | qwen3_coder | transformers>=4.51 | ✘ | coding | ||
qwen3_coder | qwen3_coder | transformers>=4.51 | ✔ | coding | ||
qwen3_coder | qwen3_coder | transformers>=4.51 | ✘ | coding | ||
qwen3_coder | qwen3_coder | transformers>=4.51 | ✘ | coding | - | |
qwen3_moe | qwen3 | transformers>=4.51 | ✔ | - | ||
qwen3_moe | qwen3 | transformers>=4.51 | ✔ | - | ||
qwen3_moe | qwen3 | transformers>=4.51 | ✔ | - | ||
qwen3_moe | qwen3 | transformers>=4.51 | ✘ | - | ||
qwen3_moe | qwen3 | transformers>=4.51 | ✘ | - | ||
qwen3_moe | qwen3 | transformers>=4.51 | ✘ | - | ||
qwen3_moe | qwen3 | transformers>=4.51 | ✘ | - | ||
qwen3_moe | qwen3 | transformers>=4.51 | ✔ | - | ||
qwen3_moe_thinking | qwen3_thinking | transformers>=4.51 | ✔ | - | ||
qwen3_moe_thinking | qwen3_thinking | transformers>=4.51 | ✘ | - | ||
qwen3_moe_thinking | qwen3_thinking | transformers>=4.51 | ✔ | - | ||
qwen3_moe_thinking | qwen3_thinking | transformers>=4.51 | ✘ | - | ||
qwen3_moe_thinking | qwen3_thinking | transformers>=4.51 | ✘ | - | - | |
qwen3_next | qwen3_nothinking | transformers>=4.57 | ✔ | - | - | |
qwen3_next | qwen3_nothinking | transformers>=4.57 | ✘ | - | - | |
qwen3_next_thinking | qwen3_thinking | transformers>=4.57 | ✔ | - | - | |
qwen3_next_thinking | qwen3_thinking | transformers>=4.57 | ✘ | - | - | |
qwen3_emb | qwen3_emb | - | ✘ | - | ||
qwen3_emb | qwen3_emb | - | ✘ | - | ||
qwen3_emb | qwen3_emb | - | ✘ | - | ||
qwen2_gte | dummy | - | ✘ | - | ||
qwen2_gte | dummy | - | ✘ | - | ||
codefuse_qwen | codefuse | - | ✘ | coding | ||
modelscope_agent | modelscope_agent | - | ✘ | - | - | |
modelscope_agent | modelscope_agent | - | ✘ | - | - | |
marco_o1 | marco_o1 | transformers>=4.37 | ✔ | - | ||
llama | llama | - | ✔ | - | ||
llama | llama | - | ✔ | - | ||
llama | llama | - | ✔ | - | ||
llama | llama | - | ✔ | - | ||
llama | llama | - | ✔ | - | ||
llama | llama | - | ✔ | - | ||
llama | llama | - | ✔ | - | ||
llama | llama | - | ✔ | - | ||
llama | llama | - | ✔ | - | ||
llama | llama | - | ✔ | - | ||
llama | llama | - | ✔ | - | ||
llama | llama | - | ✔ | - | ||
llama | llama | - | ✔ | - | ||
llama | llama | - | ✔ | - | ||
llama | llama | - | ✔ | - | ||
llama | llama | - | ✔ | - | ||
llama | llama | - | ✔ | - | ||
llama | llama | - | ✔ | - | ||
llama | llama | transformers>=4.38, aqlm, torch>=2.2.0 | ✘ | - | ||
llama3 | llama3 | - | ✔ | - | ||
llama3 | llama3 | - | ✔ | - | ||
llama3 | llama3 | - | ✔ | - | ||
llama3 | llama3 | - | ✔ | - | ||
llama3 | llama3 | - | ✘ | - | ||
llama3 | llama3 | - | ✘ | - | ||
llama3 | llama3 | - | ✘ | - | ||
llama3 | llama3 | - | ✘ | - | ||
llama3 | llama3 | - | ✘ | - | ||
llama3 | llama3 | - | ✘ | - | ||
llama3 | llama3 | - | ✔ | - | ||
llama3 | llama3 | - | ✔ | - | ||
llama3_1 | llama3_2 | transformers>=4.43 | ✔ | - | ||
llama3_1 | llama3_2 | transformers>=4.43 | ✔ | - | ||
llama3_1 | llama3_2 | transformers>=4.43 | ✔ | - | ||
llama3_1 | llama3_2 | transformers>=4.43 | ✔ | - | ||
llama3_1 | llama3_2 | transformers>=4.43 | ✔ | - | ||
llama3_1 | llama3_2 | transformers>=4.43 | ✔ | - | ||
llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | ||
llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | ||
llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | ||
llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | ||
llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | ||
llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | ||
llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | ||
llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | ||
llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | ||
llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | ||
llama3_1 | llama3_2 | transformers>=4.43 | ✘ | - | ||
llama3_1 | llama3_2 | transformers>=4.43 | ✔ | - | ||
llama3_2 | llama3_2 | transformers>=4.43 | ✔ | - | ||
llama3_2 | llama3_2 | transformers>=4.43 | ✔ | - | ||
llama3_2 | llama3_2 | transformers>=4.43 | ✔ | - | ||
llama3_2 | llama3_2 | transformers>=4.43 | ✔ | - | ||
llama3_2 | llama3_2 | transformers>=4.43 | ✔ | - | ||
llama3_2 | llama3_2 | transformers>=4.43 | ✘ | - | ||
reflection | reflection | transformers>=4.43 | ✔ | - | ||
megrez | megrez | - | ✔ | - | ||
yi | chatml | - | ✔ | - | ||
yi | chatml | - | ✔ | - | ||
yi | chatml | - | ✔ | - | ||
yi | chatml | - | ✔ | - | ||
yi | chatml | - | ✔ | - | ||
yi | chatml | - | ✔ | - | ||
yi | chatml | - | ✔ | - | ||
yi | chatml | - | ✔ | - | ||
yi | chatml | - | ✔ | - | ||
yi | chatml | - | ✔ | - | ||
yi | chatml | - | ✔ | - | ||
yi | chatml | - | ✔ | - | ||
yi | chatml | - | ✔ | - | ||
yi | chatml | - | ✔ | - | ||
yi | chatml | - | ✔ | - | ||
yi | chatml | - | ✔ | - | ||
yi | chatml | - | ✔ | - | ||
yi | chatml | - | ✔ | - | ||
yi | chatml | - | ✔ | - | ||
yi | chatml | - | ✔ | - | ||
yi | chatml | - | ✘ | - | ||
yi | chatml | - | ✘ | - | ||
yi | chatml | - | ✘ | - | ||
yi | chatml | - | ✘ | - | ||
yi | chatml | - | ✘ | - | ||
yi | chatml | - | ✘ | - | ||
yi_coder | yi_coder | - | ✔ | coding | ||
yi_coder | yi_coder | - | ✔ | coding | ||
yi_coder | yi_coder | - | ✔ | coding | ||
yi_coder | yi_coder | - | ✔ | coding | ||
sus | sus | - | ✔ | - | ||
gpt_oss | gpt_oss | transformers>=4.55 | ✔ | - | ||
gpt_oss | gpt_oss | transformers>=4.55 | ✔ | - | ||
seed_oss | seed_oss | transformers>=4.56 | ✘ | - | ||
seed_oss | seed_oss | transformers>=4.56 | ✘ | - | ||
seed_oss | seed_oss | transformers>=4.56 | ✘ | - | ||
codefuse_codellama | codefuse_codellama | - | ✔ | coding | ||
mengzi3 | mengzi | - | ✔ | - | ||
ziya | ziya | - | ✔ | - | ||
ziya | ziya | - | ✔ | - | ||
numina | numina | - | ✔ | math | ||
atom | atom | - | ✘ | - | ||
atom | atom | - | ✘ | - | ||
chatglm2 | chatglm2 | transformers<4.42 | ✘ | - | ||
chatglm2 | chatglm2 | transformers<4.42 | ✘ | - | ||
chatglm2 | chatglm2 | transformers<4.34 | ✘ | coding | ||
chatglm3 | glm4 | transformers<4.42 | ✘ | - | ||
chatglm3 | glm4 | transformers<4.42 | ✘ | - | ||
chatglm3 | glm4 | transformers<4.42 | ✘ | - | ||
chatglm3 | glm4 | transformers<4.42 | ✘ | - | ||
glm4 | glm4 | transformers>=4.42 | ✘ | - | ||
glm4 | glm4 | transformers>=4.42 | ✘ | - | ||
glm4 | glm4 | transformers>=4.42 | ✘ | - | ||
glm4 | glm4 | transformers>=4.42 | ✘ | - | ||
glm4_0414 | glm4_0414 | transformers>=4.51 | ✘ | - | ||
glm4_0414 | glm4_0414 | transformers>=4.51 | ✘ | - | ||
glm4_0414 | glm4_0414 | transformers>=4.51 | ✘ | - | ||
glm4_0414 | glm4_0414 | transformers>=4.51 | ✘ | - | ||
glm4_0414 | glm4_0414 | transformers>=4.51 | ✘ | - | ||
glm4_5 | glm4_5 | transformers>=4.54 | ✔ | - | ||
glm4_5 | glm4_5 | transformers>=4.54 | ✔ | - | ||
glm4_5 | glm4_5 | transformers>=4.54 | ✘ | - | ||
glm4_5 | glm4_5 | transformers>=4.54 | ✔ | - | ||
glm4_5 | glm4_5 | transformers>=4.54 | ✔ | - | ||
glm4_5 | glm4_5 | transformers>=4.54 | ✘ | - | ||
glm4_5 | glm4_5 | transformers>=4.54 | ✔ | - | ||
glm4_z1_rumination | glm4_z1_rumination | transformers>4.51 | ✘ | - | ||
glm_edge | glm4 | transformers>=4.46 | ✘ | - | ||
glm_edge | glm4 | transformers>=4.46 | ✘ | - | ||
codefuse_codegeex2 | codefuse | transformers<4.34 | ✘ | coding | ||
codegeex4 | codegeex4 | transformers<4.42 | ✘ | coding | ||
longwriter_llama3_1 | longwriter_llama | transformers>=4.43 | ✔ | - | ||
internlm | internlm | - | ✘ | - | ||
internlm | internlm | - | ✘ | - | ||
internlm | internlm | - | ✘ | - | - | |
internlm | internlm | - | ✘ | - | ||
internlm | internlm | - | ✘ | - | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | - | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | - | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | - | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | - | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | - | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | - | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | - | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | - | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | - | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | - | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | - | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | math | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | math | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | math | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | math | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | - | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | - | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | - | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | - | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | - | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | - | ||
internlm2 | internlm2 | transformers>=4.38 | ✘ | - | ||
internlm3 | internlm2 | transformers>=4.48 | ✔ | - | ||
deepseek | deepseek | - | ✔ | - | ||
deepseek | deepseek | - | ✔ | - | ||
deepseek | deepseek | - | ✔ | - | ||
deepseek | deepseek | - | ✔ | - | ||
deepseek | deepseek | - | ✔ | math | ||
deepseek | deepseek | - | ✔ | math | ||
deepseek | deepseek | - | ✔ | math | ||
deepseek | deepseek | - | ✔ | coding | ||
deepseek | deepseek | - | ✔ | coding | ||
deepseek | deepseek | - | ✔ | coding | ||
deepseek | deepseek | - | ✔ | coding | ||
deepseek | deepseek | - | ✔ | coding | ||
deepseek | deepseek | - | ✔ | coding | ||
deepseek_moe | deepseek | - | ✔ | - | ||
deepseek_moe | deepseek | - | ✔ | - | ||
deepseek_v2 | deepseek | transformers>=4.39.3 | ✔ | - | ||
deepseek_v2 | deepseek | transformers>=4.39.3 | ✔ | - | ||
deepseek_v2 | deepseek | transformers>=4.39.3 | ✔ | - | ||
deepseek_v2 | deepseek | transformers>=4.39.3 | ✔ | - | ||
deepseek_v2 | deepseek | transformers>=4.39.3 | ✔ | - | ||
deepseek_v2 | deepseek | transformers>=4.39.3 | ✔ | - | ||
deepseek_v2 | deepseek | transformers>=4.39.3 | ✔ | - | ||
deepseek_v2 | deepseek | transformers>=4.39.3 | ✔ | - | ||
deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✔ | - | ||
deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✔ | - | ||
deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✔ | - | ||
deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✔ | - | ||
deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✔ | - | ||
deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✘ | - | ||
deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✘ | - | ||
deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✔ | - | ||
deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✔ | - | ||
deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✔ | - | ||
deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✔ | - | ||
deepseek_v2_5 | deepseek_v2_5 | transformers>=4.39.3 | ✔ | - | ||
deepseek_r1 | deepseek_r1 | transformers>=4.39.3 | ✔ | - | ||
deepseek_r1 | deepseek_r1 | transformers>=4.39.3 | ✔ | - | ||
deepseek_r1 | deepseek_r1 | transformers>=4.39.3 | ✔ | - | ||
deepseek_r1 | deepseek_r1 | transformers>=4.39.3 | ✘ | - | ||
deepseek_r1 | deepseek_r1 | transformers>=4.39.3 | ✘ | - | ||
deepseek_r1 | deepseek_r1 | transformers>=4.39.3 | ✔ | - | ||
deepseek_r1 | deepseek_r1 | transformers>=4.39.3 | ✔ | - | ||
deepseek_r1 | deepseek_r1 | transformers>=4.39.3 | ✔ | - | ||
deepseek_r1_distill | deepseek_r1 | transformers>=4.37 | ✔ | - | ||
deepseek_r1_distill | deepseek_r1 | transformers>=4.37 | ✔ | - | ||
deepseek_r1_distill | deepseek_r1 | transformers>=4.37 | ✔ | - | ||
deepseek_r1_distill | deepseek_r1 | transformers>=4.37 | ✔ | - | ||
deepseek_r1_distill | deepseek_r1 | transformers>=4.37 | ✔ | - | ||
deepseek_r1_distill | deepseek_r1 | - | ✔ | - | ||
deepseek_r1_distill | deepseek_r1 | - | ✔ | - | ||
deepseek_r1_distill | deepseek_r1 | - | ✔ | - | ||
deepseek_v3_1 | deepseek_v3_1 | transformers>=4.39.3 | ✔ | - | ||
deepseek_v3_1 | deepseek_v3_1 | transformers>=4.39.3 | ✔ | - | ||
deepseek_v3_1 | deepseek_v3_1 | transformers>=4.39.3 | ✔ | - | ||
deepseek_v3_2 | deepseek_v3_1 | - | ✘ | - | ||
deepseek_v3_2 | deepseek_v3_1 | - | ✘ | - | ||
deepseek_v3_2 | deepseek_v3_1 | - | ✘ | - | ||
deepseek_v3_2 | deepseek_v3_1 | - | ✘ | - | ||
deepseek_v3_2 | deepseek_v3_1 | - | ✘ | - | ||
openbuddy_llama | openbuddy | - | ✔ | - | ||
openbuddy_llama | openbuddy | - | ✔ | - | ||
openbuddy_llama | openbuddy | - | ✔ | - | ||
openbuddy_llama | openbuddy | - | ✔ | - | ||
openbuddy_llama3 | openbuddy2 | - | ✔ | - | ||
openbuddy_llama3 | openbuddy2 | - | ✔ | - | ||
openbuddy_llama3 | openbuddy2 | - | ✔ | - | ||
openbuddy_llama3 | openbuddy2 | transformers>=4.43 | ✔ | - | ||
openbuddy_llama3 | openbuddy2 | transformers>=4.43 | ✔ | - | ||
openbuddy_llama3 | openbuddy2 | transformers>=4.45 | ✔ | - | ||
openbuddy_mistral | openbuddy | transformers>=4.34 | ✘ | - | ||
openbuddy_mistral | openbuddy | transformers>=4.34 | ✘ | - | ||
openbuddy_mixtral | openbuddy | transformers>=4.36 | ✘ | - | ||
baichuan | baichuan | transformers<4.34 | ✘ | - | ||
baichuan | baichuan | transformers<4.34 | ✘ | - | ||
baichuan | baichuan | transformers<4.34 | ✘ | - | ||
baichuan2 | baichuan | - | ✘ | - | ||
baichuan2 | baichuan | - | ✘ | - | ||
baichuan2 | baichuan | - | ✘ | - | ||
baichuan2 | baichuan | - | ✘ | - | ||
baichuan2 | baichuan | bitsandbytes<0.41.2, accelerate<0.26 | ✘ | - | ||
baichuan2 | baichuan | bitsandbytes<0.41.2, accelerate<0.26 | ✘ | - | ||
baichuan_m1 | baichuan_m1 | transformers>=4.48 | ✘ | - | ||
minicpm | minicpm | transformers>=4.36.0 | ✘ | - | ||
minicpm | minicpm | transformers>=4.36.0 | ✘ | - | ||
minicpm | minicpm | transformers>=4.36.0 | ✘ | - | ||
minicpm_chatml | chatml | transformers>=4.36 | ✘ | - | ||
minicpm_chatml | chatml | transformers>=4.36 | ✘ | - | ||
minicpm_chatml | chatml | transformers>=4.36 | ✘ | - | ||
minicpm3 | chatml | transformers>=4.36 | ✘ | - | ||
minicpm_moe | minicpm | transformers>=4.36 | ✘ | - | ||
telechat | telechat | - | ✘ | - | ||
telechat | telechat | - | ✘ | - | ||
telechat | telechat | - | ✘ | - | ||
telechat | telechat | - | ✘ | - | ||
telechat | telechat | - | ✘ | - | - | |
telechat | telechat | - | ✘ | - | ||
telechat | telechat | - | ✘ | - | ||
telechat2 | telechat2 | - | ✘ | - | ||
telechat2 | telechat2 | - | ✘ | - | ||
telechat2 | telechat2 | - | ✘ | - | ||
telechat2 | telechat2 | - | ✘ | - | ||
mistral | llama | transformers>=4.34 | ✘ | - | ||
mistral | llama | transformers>=4.34 | ✘ | - | ||
mistral | llama | transformers>=4.34 | ✘ | - | ||
mistral | llama | transformers>=4.34 | ✘ | - | ||
mistral | llama | transformers>=4.34 | ✘ | - | ||
mistral | llama | transformers>=4.34 | ✘ | - | ||
devstral | devstral | transformers>=4.43, mistral-common>=1.5.5 | ✘ | - | ||
zephyr | zephyr | transformers>=4.34 | ✘ | - | ||
mixtral | llama | transformers>=4.36 | ✘ | - | ||
mixtral | llama | transformers>=4.36 | ✘ | - | ||
mixtral | llama | transformers>=4.36 | ✘ | - | ||
mixtral | llama | transformers>=4.38, aqlm, torch>=2.2.0 | ✘ | - | ||
mistral_nemo | mistral_nemo | transformers>=4.43 | ✘ | - | ||
mistral_nemo | mistral_nemo | transformers>=4.43 | ✘ | - | ||
mistral_nemo | mistral_nemo | transformers>=4.43 | ✘ | - | ||
mistral_nemo | mistral_nemo | transformers>=4.43 | ✘ | - | ||
mistral_nemo | mistral_nemo | transformers>=4.46 | ✘ | - | ||
mistral_2501 | mistral_2501 | - | ✘ | - | ||
mistral_2501 | mistral_2501 | - | ✘ | - | ||
wizardlm2 | wizardlm2 | transformers>=4.34 | ✘ | - | ||
wizardlm2_moe | wizardlm2_moe | transformers>=4.36 | ✘ | - | ||
phi2 | default | - | ✘ | - | ||
phi3_small | phi3 | transformers>=4.36 | ✘ | - | ||
phi3_small | phi3 | transformers>=4.36 | ✘ | - | ||
phi3 | phi3 | transformers>=4.36 | ✘ | - | ||
phi3 | phi3 | transformers>=4.36 | ✘ | - | ||
phi3 | phi3 | transformers>=4.36 | ✘ | - | ||
phi3 | phi3 | transformers>=4.36 | ✘ | - | ||
phi3 | phi3 | transformers>=4.36 | ✘ | - | ||
phi3 | phi3 | transformers>=4.36 | ✘ | - | ||
phi3_moe | phi3 | transformers>=4.36 | ✘ | - | ||
phi4 | phi4 | transformers>=4.36 | ✘ | - | ||
minimax | minimax | - | ✘ | - | ||
minimax_m1 | minimax_m1 | - | ✘ | - | ||
minimax_m1 | minimax_m1 | - | ✘ | - | ||
minimax_m2 | minimax_m2 | - | ✘ | - | ||
gemma | gemma | transformers>=4.38 | ✘ | - | ||
gemma | gemma | transformers>=4.38 | ✘ | - | ||
gemma | gemma | transformers>=4.38 | ✘ | - | ||
gemma | gemma | transformers>=4.38 | ✘ | - | ||
gemma2 | gemma | transformers>=4.42 | ✘ | - | ||
gemma2 | gemma | transformers>=4.42 | ✘ | - | ||
gemma2 | gemma | transformers>=4.42 | ✘ | - | ||
gemma2 | gemma | transformers>=4.42 | ✘ | - | ||
gemma2 | gemma | transformers>=4.42 | ✘ | - | ||
gemma2 | gemma | transformers>=4.42 | ✘ | - | ||
gemma3_text | gemma3_text | transformers>=4.49 | ✘ | - | ||
gemma3_text | gemma3_text | transformers>=4.49 | ✘ | - | ||
gemma3_text | gemma3_text | transformers>=4.49 | ✘ | - | ||
gemma3_text | gemma3_text | transformers>=4.49 | ✘ | - | ||
skywork | skywork | - | ✘ | - | ||
skywork | skywork | - | ✘ | - | - | |
skywork_o1 | skywork_o1 | transformers>=4.43 | ✔ | - | ||
ling | ling | - | ✘ | - | ||
ling | ling | - | ✘ | - | ||
ling | ling | - | ✘ | - | ||
ling | ling | - | ✘ | - | ||
ling2 | ling2 | - | ✘ | - | ||
ling2 | ling2 | - | ✘ | - | ||
ring2 | ring2 | - | ✘ | - | ||
yuan2 | yuan | - | ✘ | - | ||
yuan2 | yuan | - | ✘ | - | ||
yuan2 | yuan | - | ✘ | - | ||
yuan2 | yuan | - | ✘ | - | ||
yuan2 | yuan | - | ✘ | - | ||
orion | orion | - | ✘ | - | ||
orion | orion | - | ✘ | - | ||
xverse | xverse | - | ✘ | - | ||
xverse | xverse | - | ✘ | - | ||
xverse | xverse | - | ✘ | - | ||
xverse | xverse | - | ✘ | - | ||
xverse | xverse | - | ✘ | - | ||
xverse | xverse | - | ✘ | - | ||
xverse | xverse | - | ✘ | - | ||
xverse | xverse | - | ✘ | - | ||
xverse_moe | xverse | - | ✘ | - | ||
seggpt | default | - | ✘ | - | ||
bluelm | bluelm | - | ✘ | - | ||
bluelm | bluelm | - | ✘ | - | ||
bluelm | bluelm | - | ✘ | - | ||
bluelm | bluelm | - | ✘ | - | ||
c4ai | c4ai | transformers>=4.39 | ✘ | - | ||
c4ai | c4ai | transformers>=4.39 | ✘ | - | ||
dbrx | dbrx | transformers>=4.36 | ✘ | - | ||
dbrx | dbrx | transformers>=4.36 | ✘ | - | ||
grok | default | - | ✘ | - | ||
mamba | default | transformers>=4.39.0 | ✘ | - | ||
mamba | default | transformers>=4.39.0 | ✘ | - | ||
mamba | default | transformers>=4.39.0 | ✘ | - | ||
mamba | default | transformers>=4.39.0 | ✘ | - | ||
mamba | default | transformers>=4.39.0 | ✘ | - | ||
mamba | default | transformers>=4.39.0 | ✘ | - | ||
polylm | default | - | ✘ | - | ||
aya | aya | transformers>=4.44.0 | ✘ | - | ||
aya | aya | transformers>=4.44.0 | ✘ | - | ||
moonlight | moonlight | transformers<4.49 | ✔ | - | ||
moonlight | moonlight | transformers<4.49 | ✔ | - | ||
kimi_k2 | kimi_k2 | - | ✔ | - | ||
kimi_k2 | kimi_k2 | - | ✔ | - | ||
kimi_k2 | kimi_k2 | - | ✔ | - | ||
kimi_k2 | kimi_k2 | - | ✔ | - | ||
mimo | qwen | transformers>=4.37 | ✔ | - | ||
mimo | qwen | transformers>=4.37 | ✔ | - | ||
mimo | qwen | transformers>=4.37 | ✔ | - | ||
mimo | qwen | transformers>=4.37 | ✔ | - | ||
mimo_rl | mimo_rl | transformers>=4.37 | ✔ | - | ||
dots1 | dots1 | transformers>=4.53 | ✔ | - | ||
dots1 | dots1 | transformers>=4.53 | ✔ | - | ||
hunyuan_moe | hunyuan_moe | - | ✘ | - | ||
hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | ||
hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | ||
hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | ||
hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | ||
hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | ||
hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | ||
hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | ||
hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | ||
hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | ||
hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | ||
hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | ||
hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | ||
hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | ||
hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | ||
hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | ||
hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | ||
hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | ||
hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | ||
hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | ||
hunyuan | hunyuan | transformers>=4.55.0.dev0 | ✘ | - | ||
ernie | ernie | - | ✔ | - | ||
ernie | ernie | - | ✔ | - | ||
ernie | ernie | - | ✔ | - | ||
ernie | ernie | - | ✔ | - | ||
ernie | ernie | - | ✔ | - | ||
ernie | ernie | - | ✔ | - | ||
gemma_emb | dummy | - | ✘ | - | ||
ernie_thinking | ernie_thinking | - | ✔ | - | ||
longchat | longchat | transformers>=4.54,<4.56 | ✘ | - | ||
longchat | longchat | transformers>=4.54,<4.56 | ✘ | - | ||
modern_bert | dummy | transformers>=4.48 | ✘ | bert | ||
modern_bert | dummy | transformers>=4.48 | ✘ | bert | ||
modern_bert_gte | dummy | transformers>=4.48 | ✘ | bert, embedding | ||
bert | dummy | - | ✘ | bert | - | |
internlm2_reward | internlm2_reward | transformers>=4.38 | ✘ | - | ||
internlm2_reward | internlm2_reward | transformers>=4.38 | ✘ | - | ||
internlm2_reward | internlm2_reward | transformers>=4.38 | ✘ | - | ||
qwen2_reward | qwen | transformers>=4.37 | ✘ | - | ||
qwen2_5_prm | qwen2_5_math_prm | transformers>=4.37 | ✘ | - | ||
qwen2_5_prm | qwen2_5_math_prm | transformers>=4.37 | ✘ | - | ||
qwen2_5_prm | qwen2_5_math_prm | transformers>=4.37 | ✘ | - | ||
qwen2_5_math_reward | qwen2_5_math | transformers>=4.37 | ✘ | - | ||
llama3_2_reward | llama3_2 | transformers>=4.43 | ✘ | - | ||
llama3_2_reward | llama3_2 | transformers>=4.43 | ✘ | - | ||
llama3_2_reward | llama3_2 | transformers>=4.43 | ✘ | - | ||
llama3_2_reward | llama3_2 | transformers>=4.43 | ✘ | - | ||
gemma_reward | gemma | transformers>=4.42 | ✘ | - | ||
gemma_reward | gemma | transformers>=4.42 | ✘ | - | ||
bge_reranker | bge_reranker | - | ✘ | - | ||
bge_reranker | bge_reranker | - | ✘ | - | ||
bge_reranker | bge_reranker | - | ✘ | - | ||
modern_bert_gte_reranker | bert | transformers>=4.48 | ✘ | bert, reranker | ||
qwen3_reranker | qwen3_reranker | - | ✘ | - | ||
qwen3_reranker | qwen3_reranker | - | ✘ | - | ||
qwen3_reranker | qwen3_reranker | - | ✘ | - |
Multimodal Large Models
Model ID | Model Type | Default Template | Requires | Support Megatron | Tags | HF Model ID |
|---|---|---|---|---|---|---|
qwen_vl | qwen_vl | - | ✘ | vision | ||
qwen_vl | qwen_vl | - | ✘ | vision | ||
qwen_vl | qwen_vl | - | ✘ | vision | ||
qwen_audio | qwen_audio | - | ✘ | audio | ||
qwen_audio | qwen_audio | - | ✘ | audio | ||
qwen2_vl | qwen2_vl | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✔ | vision, video | ||
qwen2_vl | qwen2_vl | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✔ | vision, video | ||
qwen2_vl | qwen2_vl | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✔ | vision, video | ||
qwen2_vl | qwen2_vl | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✔ | vision, video | ||
qwen2_vl | qwen2_vl | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✔ | vision, video | ||
qwen2_vl | qwen2_vl | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✔ | vision, video | ||
qwen2_vl | qwen2_vl | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✘ | vision, video | ||
qwen2_vl | qwen2_vl | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✘ | vision, video | ||
qwen2_vl | qwen2_vl | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✘ | vision, video | ||
qwen2_vl | qwen2_vl | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✘ | vision, video | ||
qwen2_vl | qwen2_vl | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✘ | vision, video | ||
qwen2_vl | qwen2_vl | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✘ | vision, video | ||
qwen2_vl | qwen2_vl | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✘ | vision, video | ||
qwen2_vl | qwen2_vl | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✘ | vision, video | ||
qwen2_vl | qwen2_vl | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✘ | vision, video | ||
qwen2_vl | qwen2_vl | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✔ | vision, video | ||
qwen2_vl | qwen2_vl | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✔ | vision, video | ||
qwen2_vl | qwen2_vl | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✔ | vision, video | ||
qwen2_vl | qwen2_vl | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✔ | vision, video | ||
qwen2_vl | qwen2_vl | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✔ | vision, video | ||
qwen2_vl | qwen2_vl | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✔ | vision, video | ||
qwen2_5_vl | qwen2_5_vl | transformers>=4.49, qwen_vl_utils>=0.0.6, decord | ✔ | vision, video | ||
qwen2_5_vl | qwen2_5_vl | transformers>=4.49, qwen_vl_utils>=0.0.6, decord | ✔ | vision, video | ||
qwen2_5_vl | qwen2_5_vl | transformers>=4.49, qwen_vl_utils>=0.0.6, decord | ✔ | vision, video | ||
qwen2_5_vl | qwen2_5_vl | transformers>=4.49, qwen_vl_utils>=0.0.6, decord | ✔ | vision, video | ||
qwen2_5_vl | qwen2_5_vl | transformers>=4.49, qwen_vl_utils>=0.0.6, decord | ✘ | vision, video | ||
qwen2_5_vl | qwen2_5_vl | transformers>=4.49, qwen_vl_utils>=0.0.6, decord | ✘ | vision, video | ||
qwen2_5_vl | qwen2_5_vl | transformers>=4.49, qwen_vl_utils>=0.0.6, decord | ✘ | vision, video | ||
qwen2_5_vl | qwen2_5_vl | transformers>=4.49, qwen_vl_utils>=0.0.6, decord | ✘ | vision, video | ||
qwen2_5_omni | qwen2_5_omni | transformers>=4.50, soundfile, qwen_omni_utils, decord | ✔ | vision, video, audio | ||
qwen2_5_omni | qwen2_5_omni | transformers>=4.50, soundfile, qwen_omni_utils, decord | ✔ | vision, video, audio | ||
qwen3_omni | qwen3_omni | transformers>=4.57.dev0, soundfile, decord, qwen_omni_utils | ✔ | vision, video, audio | ||
qwen3_omni | qwen3_omni | transformers>=4.57.dev0, soundfile, decord, qwen_omni_utils | ✔ | vision, video, audio | ||
qwen3_omni | qwen3_omni | transformers>=4.57.dev0, soundfile, decord, qwen_omni_utils | ✔ | vision, video, audio | ||
qwen2_audio | qwen2_audio | transformers>=4.45,<4.49, librosa | ✘ | audio | ||
qwen2_audio | qwen2_audio | transformers>=4.45,<4.49, librosa | ✘ | audio | ||
qwen3_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✔ | vision, video | ||
qwen3_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✔ | vision, video | ||
qwen3_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✘ | vision, video | ||
qwen3_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✘ | vision, video | ||
qwen3_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✔ | vision, video | ||
qwen3_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✔ | vision, video | ||
qwen3_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✘ | vision, video | ||
qwen3_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✘ | vision, video | ||
qwen3_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✔ | vision, video | ||
qwen3_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✔ | vision, video | ||
qwen3_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✘ | vision, video | ||
qwen3_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✘ | vision, video | ||
qwen3_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✔ | vision, video | ||
qwen3_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✔ | vision, video | ||
qwen3_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✘ | vision, video | ||
qwen3_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✘ | vision, video | ||
qwen3_moe_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✔ | vision, video | ||
qwen3_moe_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✔ | vision, video | ||
qwen3_moe_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✘ | vision, video | ||
qwen3_moe_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✘ | vision, video | ||
qwen3_moe_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✔ | vision, video | ||
qwen3_moe_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✔ | vision, video | ||
qwen3_moe_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✘ | vision, video | ||
qwen3_moe_vl | qwen3_vl | transformers>=4.57, qwen_vl_utils>=0.0.14, decord | ✘ | vision, video | ||
qvq | qvq | transformers>=4.45, qwen_vl_utils>=0.0.6, decord | ✘ | vision, video | ||
qwen2_gme | qwen2_gme | - | ✘ | vision | ||
qwen2_gme | qwen2_gme | - | ✘ | vision | ||
ovis1_6 | ovis1_6 | transformers>=4.42 | ✘ | vision | ||
ovis1_6 | ovis1_6 | transformers>=4.42 | ✘ | vision | ||
ovis1_6 | ovis1_6 | transformers>=4.42 | ✘ | vision | ||
ovis1_6_llama3 | ovis1_6_llama3 | - | ✘ | vision | ||
ovis2 | ovis2 | transformers>=4.46.2, moviepy<2 | ✘ | vision | ||
ovis2 | ovis2 | transformers>=4.46.2, moviepy<2 | ✘ | vision | ||
ovis2 | ovis2 | transformers>=4.46.2, moviepy<2 | ✘ | vision | ||
ovis2 | ovis2 | transformers>=4.46.2, moviepy<2 | ✘ | vision | ||
ovis2 | ovis2 | transformers>=4.46.2, moviepy<2 | ✘ | vision | ||
ovis2 | ovis2 | transformers>=4.46.2, moviepy<2 | ✘ | vision | ||
ovis2_5 | ovis2_5 | transformers>=4.46.2, moviepy<2 | ✔ | vision | ||
ovis2_5 | ovis2_5 | transformers>=4.46.2, moviepy<2 | ✔ | vision | ||
mimo_vl | mimo_vl | transformers>=4.49, qwen_vl_utils>=0.0.6, decord | ✘ | vision, video | ||
mimo_vl | mimo_vl | transformers>=4.49, qwen_vl_utils>=0.0.6, decord | ✘ | vision, video | ||
midashenglm | midashenglm | transformers>=4.52, soundfile | ✘ | audio | ||
glm4v | glm4v | transformers>=4.42,<4.45 | ✘ | - | ||
glm4v | glm4v | transformers>=4.42 | ✘ | - | ||
glm4_1v | glm4_1v | transformers>=4.53 | ✘ | - | ||
glm4_1v | glm4_1v | transformers>=4.53 | ✘ | - | ||
glm4_1v | glm4_1v | transformers>=4.57 | ✘ | - | ||
glm4_5v | glm4_5v | transformers>=4.56 | ✔ | - | ||
glm4_5v | glm4_5v | transformers>=4.56 | ✘ | - | ||
glm_edge_v | glm_edge_v | transformers>=4.46 | ✘ | vision | ||
glm_edge_v | glm_edge_v | transformers>=4.46 | ✘ | vision | ||
cogvlm | cogvlm | transformers<4.42 | ✘ | - | ||
cogagent_vqa | cogagent_vqa | transformers<4.42 | ✘ | - | ||
cogagent_chat | cogagent_chat | transformers<4.42, timm | ✘ | - | ||
cogvlm2 | cogvlm2 | transformers<4.42 | ✘ | - | ||
cogvlm2 | cogvlm2 | transformers<4.42 | ✘ | - | ||
cogvlm2_video | cogvlm2_video | decord, pytorchvideo, transformers>=4.42 | ✘ | video | ||
internvl | internvl | transformers>=4.35, timm | ✘ | vision | ||
internvl | internvl | transformers>=4.35, timm | ✘ | vision | ||
internvl | internvl | transformers>=4.35, timm | ✘ | vision | ||
internvl_phi3 | internvl_phi3 | transformers>=4.35,<4.42, timm | ✘ | vision | ||
internvl2 | internvl2 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2 | internvl2 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2 | internvl2 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2 | internvl2 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2 | internvl2 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2 | internvl2 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2 | internvl2 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2 | internvl2 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2 | internvl2 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2 | internvl2 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2 | internvl2 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2 | internvl2 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2 | internvl2 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2 | internvl2 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2 | internvl2 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2 | internvl2 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2 | internvl2 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2 | internvl2 | transformers>=4.36, timm | ✘ | vision, video | ||
OpenGVLab/InternVL2-Pretrain-Models:InternVL2-Llama3-76B-Pretrain | internvl2 | internvl2 | transformers>=4.36, timm | ✘ | vision, video | OpenGVLab/InternVL2-Pretrain-Models:InternVL2-Llama3-76B-Pretrain |
internvl2_phi3 | internvl2_phi3 | transformers>=4.36,<4.42, timm | ✘ | vision, video | ||
internvl2_5 | internvl2_5 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2_5 | internvl2_5 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2_5 | internvl2_5 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2_5 | internvl2_5 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2_5 | internvl2_5 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2_5 | internvl2_5 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2_5 | internvl2_5 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2_5 | internvl2_5 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2_5 | internvl2_5 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2_5 | internvl2_5 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2_5 | internvl2_5 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2_5 | internvl2_5 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2_5 | internvl2_5 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2_5 | internvl2_5 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2_5 | internvl2_5 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2_5 | internvl2_5 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2_5 | internvl2_5 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2_5 | internvl2_5 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl2_5 | internvl2_5 | transformers>=4.36, timm | ✘ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✘ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✘ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✘ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✘ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✘ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✘ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✘ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3 | internvl2_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl_hf | internvl_hf | transformers>=4.52.1, timm | ✔ | vision, video | ||
internvl_hf | internvl_hf | transformers>=4.52.1, timm | ✔ | vision, video | ||
internvl_hf | internvl_hf | transformers>=4.52.1, timm | ✔ | vision, video | ||
internvl_hf | internvl_hf | transformers>=4.52.1, timm | ✔ | vision, video | ||
internvl_hf | internvl_hf | transformers>=4.52.1, timm | ✔ | vision, video | ||
internvl_hf | internvl_hf | transformers>=4.52.1, timm | ✔ | vision, video | ||
internvl_hf | internvl_hf | transformers>=4.52.1, timm | ✔ | vision, video | ||
internvl_hf | internvl_hf | transformers>=4.52.1, timm | ✔ | vision, video | ||
internvl_hf | internvl_hf | transformers>=4.52.1, timm | ✔ | vision, video | ||
internvl_hf | internvl_hf | transformers>=4.52.1, timm | ✔ | vision, video | ||
internvl_hf | internvl_hf | transformers>=4.52.1, timm | ✔ | vision, video | ||
internvl_hf | internvl_hf | transformers>=4.52.1, timm | ✔ | vision, video | ||
internvl_hf | internvl_hf | transformers>=4.52.1, timm | ✔ | vision, video | ||
internvl_hf | internvl_hf | transformers>=4.52.1, timm | ✔ | vision, video | ||
internvl_hf | internvl_hf | transformers>=4.52.1, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5 | internvl3_5 | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl3_5_gpt | internvl3_5_gpt | transformers>=4.37.2, timm | ✔ | vision, video | ||
internvl_gpt_hf | internvl_hf | transformers>=4.55.0, timm | ✔ | vision, video | ||
interns1 | interns1 | transformers>=4.55.2,<4.56 | ✘ | vision, video | ||
interns1 | interns1 | transformers>=4.55.2,<4.56 | ✘ | vision, video | ||
interns1 | interns1 | transformers>=4.55.2,<4.56 | ✘ | vision, video | ||
interns1 | interns1 | transformers>=4.55.2,<4.56 | ✘ | vision, video | ||
xcomposer2 | ixcomposer2 | - | ✘ | vision | ||
xcomposer2_4khd | ixcomposer2 | - | ✘ | vision | ||
xcomposer2_5 | xcomposer2_5 | decord | ✘ | vision | ||
xcomposer2_5 | xcomposer2_5 | decord | ✘ | vision | ||
xcomposer2_5_ol_audio | qwen2_audio | transformers>=4.45 | ✘ | audio | ||
llama3_2_vision | llama3_2_vision | transformers>=4.45 | ✘ | vision | ||
llama3_2_vision | llama3_2_vision | transformers>=4.45 | ✘ | vision | ||
llama3_2_vision | llama3_2_vision | transformers>=4.45 | ✘ | vision | ||
llama3_2_vision | llama3_2_vision | transformers>=4.45 | ✘ | vision | ||
llama4 | llama4 | transformers>=4.51 | ✔ | vision | ||
llama4 | llama4 | transformers>=4.51 | ✔ | vision | ||
llama4 | llama4 | transformers>=4.51 | ✔ | vision | ||
llama4 | llama4 | transformers>=4.51 | ✘ | vision | ||
llama4 | llama4 | transformers>=4.51 | ✔ | vision | ||
llama3_1_omni | llama3_1_omni | openai-whisper | ✘ | audio | ||
llava1_5_hf | llava1_5_hf | transformers>=4.36 | ✘ | vision | ||
llava1_5_hf | llava1_5_hf | transformers>=4.36 | ✘ | vision | ||
llava1_6_mistral_hf | llava1_6_mistral_hf | transformers>=4.39 | ✘ | vision | ||
llava1_6_vicuna_hf | llava1_6_vicuna_hf | transformers>=4.39 | ✘ | vision | ||
llava1_6_vicuna_hf | llava1_6_vicuna_hf | transformers>=4.39 | ✘ | vision | ||
llava1_6_yi_hf | llava1_6_yi_hf | transformers>=4.39 | ✘ | vision | ||
llama3_llava_next_hf | llama3_llava_next_hf | transformers>=4.39 | ✘ | vision | ||
llava_next_qwen_hf | llava_next_qwen_hf | transformers>=4.39 | ✘ | vision | ||
llava_next_qwen_hf | llava_next_qwen_hf | transformers>=4.39 | ✘ | vision | ||
llava_next_video_hf | llava_next_video_hf | transformers>=4.42, av | ✘ | video | ||
llava_next_video_hf | llava_next_video_hf | transformers>=4.42, av | ✘ | video | ||
llava_next_video_hf | llava_next_video_hf | transformers>=4.42, av | ✘ | video | ||
llava_next_video_yi_hf | llava_next_video_hf | transformers>=4.42, av | ✘ | video | ||
llava_onevision_hf | llava_onevision_hf | transformers>=4.45 | ✘ | vision, video | ||
llava_onevision_hf | llava_onevision_hf | transformers>=4.45 | ✘ | vision, video | ||
llava_onevision_hf | llava_onevision_hf | transformers>=4.45 | ✘ | vision, video | ||
yi_vl | yi_vl | transformers>=4.34 | ✘ | vision | ||
yi_vl | yi_vl | transformers>=4.34 | ✘ | vision | ||
ernie_vl | ernie_vl | transformers>=4.52, moviepy | ✘ | - | ||
ernie_vl | ernie_vl | transformers>=4.52, moviepy | ✘ | - | ||
ernie_vl | ernie_vl | transformers>=4.52, moviepy | ✘ | - | ||
ernie_vl | ernie_vl | transformers>=4.52, moviepy | ✘ | - | ||
ernie_vl_thinking | ernie_vl_thinking | transformers>=4.52, moviepy | ✘ | - | ||
llava_llama3_1_hf | llava_llama3_1_hf | transformers>=4.41 | ✘ | vision | - | |
llava_llama3_hf | llava_llama3_hf | transformers>=4.36 | ✘ | vision | ||
llava1_6_mistral | llava1_6_mistral | transformers>=4.34 | ✘ | vision | ||
llava1_6_yi | llava1_6_yi | transformers>=4.34 | ✘ | vision | ||
llava_next_qwen | llava_next_qwen | transformers>=4.42, av | ✘ | vision | ||
llava_next_qwen | llava_next_qwen | transformers>=4.42, av | ✘ | vision | ||
llama3_llava_next | llama3_llava_next | transformers>=4.42, av | ✘ | vision | ||
llava_onevision1_5 | llava_onevision1_5 | transformers>=4.53.0, qwen_vl_utils | ✘ | vision | ||
llava_onevision1_5 | llava_onevision1_5 | transformers>=4.53.0, qwen_vl_utils | ✘ | vision | ||
llava_onevision1_5 | llava_onevision1_5 | transformers>=4.53.0, qwen_vl_utils | ✘ | vision | ||
llava_onevision1_5 | llava_onevision1_5 | transformers>=4.53.0, qwen_vl_utils | ✘ | vision | ||
deepseek_vl | deepseek_vl | - | ✘ | vision | ||
deepseek_vl | deepseek_vl | - | ✘ | vision | ||
deepseek_vl2 | deepseek_vl2 | transformers<4.42 | ✘ | vision | ||
deepseek_vl2 | deepseek_vl2 | transformers<4.42 | ✘ | vision | ||
deepseek_vl2 | deepseek_vl2 | transformers<4.42 | ✘ | vision | ||
deepseek_janus | deepseek_janus | - | ✘ | vision | ||
deepseek_janus_pro | deepseek_janus_pro | - | ✘ | vision | ||
deepseek_janus_pro | deepseek_janus_pro | - | ✘ | vision | ||
deepseek_ocr | deepseek_ocr | transformers==4.46.3, easydict | ✘ | vision | ||
minicpmv | minicpmv | timm, transformers<4.42 | ✘ | vision | ||
minicpmv | minicpmv | timm, transformers<4.42 | ✘ | vision | ||
minicpmv2_5 | minicpmv2_5 | timm, transformers>=4.36 | ✘ | vision | ||
minicpmv2_6 | minicpmv2_6 | timm, transformers>=4.36, decord | ✘ | vision, video | ||
minicpmo2_6 | minicpmo2_6 | timm, transformers>=4.36, decord, soundfile | ✘ | vision, video, omni, audio | ||
minicpmv4 | minicpmv4 | timm, transformers>=4.36, decord | ✘ | vision, video | ||
minicpmv4_5 | minicpmv4_5 | timm, transformers>=4.36, decord | ✘ | vision, video | ||
minimax_vl | minimax_vl | - | ✘ | vision | ||
mplug_owl2 | mplug_owl2 | transformers<4.35, icecream | ✘ | vision | ||
mplug_owl2_1 | mplug_owl2 | transformers<4.35, icecream | ✘ | vision | ||
mplug_owl3 | mplug_owl3 | transformers>=4.36, icecream, decord | ✘ | vision, video | ||
mplug_owl3 | mplug_owl3 | transformers>=4.36, icecream, decord | ✘ | vision, video | ||
mplug_owl3 | mplug_owl3 | transformers>=4.36, icecream, decord | ✘ | vision, video | ||
mplug_owl3_241101 | mplug_owl3_241101 | transformers>=4.36, icecream | ✘ | vision, video | ||
doc_owl2 | doc_owl2 | transformers>=4.36, icecream | ✘ | vision | ||
emu3_gen | emu3_gen | - | ✘ | t2i | ||
emu3_chat | emu3_chat | transformers>=4.44.0 | ✘ | vision | ||
got_ocr2 | got_ocr2 | - | ✘ | vision | ||
got_ocr2_hf | got_ocr2_hf | - | ✘ | vision | ||
step_audio | step_audio | funasr, sox, conformer, openai-whisper, librosa | ✘ | audio | ||
step_audio2_mini | step_audio2_mini | transformers==4.53.3, torchaudio, librosa | ✘ | audio | ||
kimi_vl | kimi_vl | transformers<4.49 | ✔ | - | ||
kimi_vl | kimi_vl | transformers<4.49 | ✔ | - | ||
kimi_vl | kimi_vl | transformers<4.49 | ✔ | - | ||
keye_vl | keye_vl | keye_vl_utils | ✘ | vision | ||
keye_vl_1_5 | keye_vl_1_5 | keye_vl_utils>=1.5.2 | ✘ | vision | ||
dots_ocr | dots_ocr | transformers>=4.51.0 | ✘ | - | ||
sail_vl2 | sail_vl2 | transformers<=4.51.3 | ✘ | vision | ||
sail_vl2 | sail_vl2 | transformers<=4.51.3 | ✘ | vision | ||
sail_vl2 | sail_vl2 | transformers<=4.51.3 | ✘ | vision | ||
sail_vl2 | sail_vl2 | transformers<=4.51.3 | ✘ | vision | ||
phi3_vision | phi3_vision | transformers>=4.36 | ✘ | vision | ||
phi3_vision | phi3_vision | transformers>=4.36 | ✘ | vision | ||
phi4_multimodal | phi4_multimodal | transformers>=4.36,<4.49, backoff, soundfile | ✘ | vision, audio | ||
florence | florence | - | ✘ | vision | ||
florence | florence | - | ✘ | vision | ||
florence | florence | - | ✘ | vision | ||
florence | florence | - | ✘ | vision | ||
idefics3 | idefics3 | transformers>=4.45 | ✘ | vision | ||
paligemma | paligemma | transformers>=4.41 | ✘ | vision | ||
paligemma | paligemma | transformers>=4.41 | ✘ | vision | ||
paligemma | paligemma | transformers>=4.41 | ✘ | vision | ||
paligemma | paligemma | transformers>=4.41 | ✘ | vision | ||
paligemma | paligemma | transformers>=4.41 | ✘ | vision | ||
paligemma | paligemma | transformers>=4.41 | ✘ | vision | ||
paligemma | paligemma | transformers>=4.41 | ✘ | vision | ||
paligemma | paligemma | transformers>=4.41 | ✘ | vision | ||
paligemma | paligemma | transformers>=4.41 | ✘ | vision | ||
paligemma | paligemma | transformers>=4.41 | ✘ | vision | ||
paligemma | paligemma | transformers>=4.41 | ✘ | vision | ||
paligemma | paligemma | transformers>=4.41 | ✘ | vision | ||
paligemma | paligemma | transformers>=4.41 | ✘ | vision | ||
paligemma | paligemma | transformers>=4.41 | ✘ | vision | ||
paligemma | paligemma | transformers>=4.41 | ✘ | vision | ||
paligemma | paligemma | transformers>=4.41 | ✘ | vision | ||
molmo | molmo | transformers>=4.45 | ✘ | vision | ||
molmo | molmo | transformers>=4.45 | ✘ | vision | ||
molmo | molmo | transformers>=4.45 | ✘ | vision | ||
molmoe | molmo | transformers>=4.45 | ✘ | vision | ||
pixtral | pixtral | transformers>=4.45 | ✘ | vision | ||
megrez_omni | megrez_omni | - | ✘ | vision, audio | ||
valley | valley | transformers>=4.42, av | ✘ | vision | - | |
gemma3_vision | gemma3_vision | transformers>=4.49 | ✘ | - | ||
gemma3_vision | gemma3_vision | transformers>=4.49 | ✘ | - | ||
gemma3_vision | gemma3_vision | transformers>=4.49 | ✘ | - | ||
gemma3_vision | gemma3_vision | transformers>=4.49 | ✘ | - | ||
gemma3_vision | gemma3_vision | transformers>=4.49 | ✘ | - | ||
gemma3_vision | gemma3_vision | transformers>=4.49 | ✘ | - | ||
gemma3n | gemma3n | transformers>=4.53.1 | ✘ | - | ||
gemma3n | gemma3n | transformers>=4.53.1 | ✘ | - | ||
gemma3n | gemma3n | transformers>=4.53.1 | ✘ | - | ||
gemma3n | gemma3n | transformers>=4.53.1 | ✘ | - | ||
mistral_2503 | mistral_2503 | transformers>=4.49 | ✘ | - | ||
mistral_2503 | mistral_2503 | transformers>=4.49 | ✘ | - | ||
mistral_2506 | mistral_2506 | transformers>=4.49 | ✘ | - | ||
paddle_ocr | paddle_ocr | - | ✘ | - | ||
jina_reranker_m0 | jina_reranker_m0 | - | ✘ | reranker, vision |
Datasets
The table below introduces information about the datasets integrated with ms-swift:
Dataset ID: ModelScope dataset ID
HF Dataset ID: Hugging Face dataset ID
Subset Name: Name of the subset
Dataset Size: Number of samples in the dataset
Statistic: Token-count statistics for the dataset, which help in choosing the max_length hyperparameter. The dataset is tokenized with the qwen2.5 tokenizer, and token counts vary across tokenizers. To obtain token statistics for another model's tokenizer, use the script.
Tags: Tags associated with the dataset
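As a rough illustration of how a value in the Statistic column could be derived, the sketch below computes the mean, standard deviation, minimum, and maximum of per-sample token counts and formats them like the table entries. This is not the ms-swift script itself: the function name `token_statistic`, the whitespace tokenizer stand-in, and the use of the population standard deviation are all assumptions made for the example. In practice the tokenizer callable would wrap the qwen2.5 tokenizer (or another model's tokenizer).

```python
import statistics
from typing import Callable, Iterable, List


def token_statistic(texts: Iterable[str],
                    tokenize: Callable[[str], List[str]]) -> str:
    """Summarize per-sample token counts as 'mean±std, min=..., max=...'.

    Hypothetical helper for illustration; the real statistics script may differ.
    """
    lengths = [len(tokenize(text)) for text in texts]
    if not lengths:
        raise ValueError("at least one sample is required")
    mean = statistics.fmean(lengths)
    # Assumption: population standard deviation (divide by N, not N - 1).
    std = statistics.pstdev(lengths)
    return f"{mean:.1f}±{std:.1f}, min={min(lengths)}, max={max(lengths)}"


def whitespace_tokenize(text: str) -> List[str]:
    # Stand-in tokenizer so the example runs without downloading a model.
    return text.split()


samples = ["a b c", "a b c d e", "a"]
print(token_statistic(samples, whitespace_tokenize))  # 3.0±1.6, min=1, max=5
```

A statistic such as `max=5064` suggests that a max_length below that value would truncate or drop the longest samples, while the mean and spread indicate how much headroom most samples actually need.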
| Dataset ID | Subset Name | Dataset Size | Statistic (token) | Tags | HF Dataset ID |
|---|---|---|---|---|---|
| AI-MO/NuminaMath-1.5 | default | 896215 | 116.1±80.8, min=31, max=5064 | grpo, math | AI-MO/NuminaMath-1.5 |
| AI-MO/NuminaMath-CoT | default | 859494 | 113.1±60.2, min=35, max=2120 | grpo, math | AI-MO/NuminaMath-CoT |
| AI-MO/NuminaMath-TIR | default | 72441 | 100.9±52.2, min=36, max=1683 | grpo, math, 🔥 | AI-MO/NuminaMath-TIR |
| AI-ModelScope/COIG-CQIA | chinese_traditional coig_pc exam finance douban human_value logi_qa ruozhiba segmentfault wiki wikihow xhs zhihu | 44694 | 331.2±693.8, min=34, max=19288 | general, 🔥 | - |
| AI-ModelScope/CodeAlpaca-20k | default | 20022 | 99.3±57.6, min=30, max=857 | code, en | HuggingFaceH4/CodeAlpaca_20K |
| AI-ModelScope/DISC-Law-SFT | default | 166758 | 1799.0±474.9, min=769, max=3151 | chat, law, 🔥 | ShengbinYue/DISC-Law-SFT |
| AI-ModelScope/DISC-Med-SFT | default | 464885 | 426.5±178.7, min=110, max=1383 | chat, medical, 🔥 | Flmc/DISC-Med-SFT |
| AI-ModelScope/Duet-v0.5 | default | 5000 | 1157.4±189.3, min=657, max=2344 | CoT, en | G-reen/Duet-v0.5 |
| AI-ModelScope/GuanacoDataset | default | 31563 | 250.3±70.6, min=95, max=987 | chat, zh | JosephusCheung/GuanacoDataset |
| AI-ModelScope/LLaVA-Instruct-150K | default | 623302 | 630.7±143.0, min=301, max=1166 | chat, multi-modal, vision | - |
| AI-ModelScope/LLaVA-Pretrain | default | huge dataset | - | chat, multi-modal, quality | liuhaotian/LLaVA-Pretrain |
| AI-ModelScope/LaTeX_OCR | default human_handwrite human_handwrite_print synthetic_handwrite small | 162149 | 117.6±44.9, min=41, max=312 | chat, ocr, multi-modal, vision | linxy/LaTeX_OCR |
| AI-ModelScope/LongAlpaca-12k | default | 11998 | 9941.8±3417.1, min=4695, max=25826 | long-sequence, QA | Yukang/LongAlpaca-12k |
| AI-ModelScope/M3IT | coco vqa-v2 shapes shapes-rephrased coco-goi-rephrased snli-ve snli-ve-rephrased okvqa a-okvqa viquae textcap docvqa science-qa imagenet imagenet-open-ended imagenet-rephrased coco-goi clevr clevr-rephrased nlvr coco-itm coco-itm-rephrased vsr vsr-rephrased mocheg mocheg-rephrased coco-text fm-iqa activitynet-qa msrvtt ss coco-cn refcoco refcoco-rephrased multi30k image-paragraph-captioning visual-dialog visual-dialog-rephrased iqa vcr visual-mrc ivqa msrvtt-qa msvd-qa gqa text-vqa ocr-vqa st-vqa flickr8k-cn | huge dataset | - | chat, multi-modal, vision | - |
| AI-ModelScope/MATH-lighteval | default | 7500 | 104.4±92.8, min=36, max=1683 | grpo, math | DigitalLearningGmbH/MATH-lighteval |
| AI-ModelScope/Magpie-Qwen2-Pro-200K-Chinese | default | 200000 | 448.4±223.5, min=87, max=4098 | chat, sft, 🔥, zh | Magpie-Align/Magpie-Qwen2-Pro-200K-Chinese |
| AI-ModelScope/Magpie-Qwen2-Pro-200K-English | default | 200000 | 609.9±277.1, min=257, max=4098 | chat, sft, 🔥, en | Magpie-Align/Magpie-Qwen2-Pro-200K-English |
| AI-ModelScope/Magpie-Qwen2-Pro-300K-Filtered | default | 300000 | 556.6±288.6, min=175, max=4098 | chat, sft, 🔥 | Magpie-Align/Magpie-Qwen2-Pro-300K-Filtered |
| AI-ModelScope/MathInstruct | default | 262040 | 253.3±177.4, min=42, max=2193 | math, cot, en, quality | TIGER-Lab/MathInstruct |
| AI-ModelScope/MovieChat-1K-test | default | 162 | 39.7±2.0, min=32, max=43 | chat, multi-modal, video | Enxin/MovieChat-1K-test |
| AI-ModelScope/Open-Platypus | default | 24926 | 389.0±256.4, min=55, max=3153 | chat, math, quality | garage-bAInd/Open-Platypus |
| AI-ModelScope/OpenO1-SFT | default | 125894 | 1080.7±622.9, min=145, max=11637 | chat, general, o1 | O1-OPEN/OpenO1-SFT |
| AI-ModelScope/OpenOrca | default 3_5M | huge dataset | - | chat, multilingual, general | - |
| AI-ModelScope/OpenOrca-Chinese | default | huge dataset | - | QA, zh, general, quality | yys/OpenOrca-Chinese |
| AI-ModelScope/SFT-Nectar | default | 131201 | 441.9±307.0, min=45, max=3136 | cot, en, quality | AstraMindAI/SFT-Nectar |
| AI-ModelScope/ShareGPT-4o | image_caption | 57289 | 599.8±140.4, min=214, max=1932 | vqa, multi-modal | OpenGVLab/ShareGPT-4o |
| AI-ModelScope/ShareGPT4V | ShareGPT4V ShareGPT4V-PT | huge dataset | - | chat, multi-modal, vision | - |
| AI-ModelScope/SkyPile-150B | default | huge dataset | - | pretrain, quality, zh | Skywork/SkyPile-150B |
| AI-ModelScope/WizardLM_evol_instruct_V2_196k | default | 109184 | 483.3±338.4, min=27, max=3735 | chat, en | WizardLM/WizardLM_evol_instruct_V2_196k |
| AI-ModelScope/alpaca-cleaned | default | 51760 | 170.1±122.9, min=29, max=1028 | chat, general, bench, quality | yahma/alpaca-cleaned |
| AI-ModelScope/alpaca-gpt4-data-en | default | 52002 | 167.6±123.9, min=29, max=607 | chat, general, 🔥 | vicgalle/alpaca-gpt4 |
| AI-ModelScope/alpaca-gpt4-data-zh | default | 48818 | 157.2±93.2, min=27, max=544 | chat, general, 🔥 | llm-wizard/alpaca-gpt4-data-zh |
| AI-ModelScope/blossom-math-v2 | default | 10000 | 175.4±59.1, min=35, max=563 | chat, math, 🔥 | Azure99/blossom-math-v2 |
| AI-ModelScope/captcha-images | default | 8000 | 47.0±0.0, min=47, max=47 | chat, multi-modal, vision | - |
| AI-ModelScope/chartqa_digit_r1v_format | default | 11399 | 48.3±5.1, min=37, max=82 | grpo | zyang39/chartqa_digit_r1v_format |
| AI-ModelScope/clevr_cogen_a_train | default | 70000 | 67.0±0.0, min=67, max=67 | qa, math, vision, grpo | leonardPKU/clevr_cogen_a_train |
| AI-ModelScope/coco | default | huge dataset | - | multi-modal, en, vqa, quality | detection-datasets/coco |
| AI-ModelScope/databricks-dolly-15k | default | 15011 | 199.0±268.8, min=26, max=5987 | multi-task, en, quality | databricks/databricks-dolly-15k |
| AI-ModelScope/deepctrl-sft-data | default en | huge dataset | - | chat, general, sft, multi-round | - |
| AI-ModelScope/egoschema | default cls | 101 | 191.6±80.7, min=96, max=435 | chat, multi-modal, video | lmms-lab/egoschema |
| AI-ModelScope/firefly-train-1.1M | default | 1649399 | 204.3±365.3, min=28, max=9306 | chat, general | YeungNLP/firefly-train-1.1M |
| AI-ModelScope/function-calling-chatml | default | 112958 | 465.3±320.1, min=36, max=6106 | agent, en, sft, 🔥 | Locutusque/function-calling-chatml |
| AI-ModelScope/generated_chat_0.4M | default | 396004 | 272.7±51.1, min=78, max=579 | chat, character-dialogue | BelleGroup/generated_chat_0.4M |
| AI-ModelScope/guanaco_belle_merge_v1.0 | default | 693987 | 133.8±93.5, min=30, max=1872 | QA, zh | Chinese-Vicuna/guanaco_belle_merge_v1.0 |
| AI-ModelScope/hh-rlhf | helpful-base helpful-online helpful-rejection-sampled | huge dataset | - | rlhf, dpo | - |
| AI-ModelScope/hh_rlhf_cn | hh_rlhf harmless_base_cn harmless_base_en helpful_base_cn helpful_base_en | 362909 | 142.3±107.5, min=25, max=1571 | rlhf, dpo, 🔥 | - |
| AI-ModelScope/lawyer_llama_data | default | 21476 | 224.4±83.9, min=69, max=832 | chat, law | Skepsun/lawyer_llama_data |
| AI-ModelScope/leetcode-solutions-python | default | 2359 | 723.8±233.5, min=259, max=2117 | chat, coding, 🔥 | - |
| AI-ModelScope/lmsys-chat-1m | default | 166211 | 545.8±3272.8, min=22, max=219116 | chat, em | lmsys/lmsys-chat-1m |
| AI-ModelScope/math-trn-format | default | 11500 | 102.2±88.9, min=36, max=1683 | math | - |
| AI-ModelScope/ms_agent_for_agentfabric | default addition | 30000 | 615.7±198.7, min=251, max=2055 | chat, agent, multi-round, 🔥 | - |
| AI-ModelScope/orpo-dpo-mix-40k | default | 43666 | 938.1±694.2, min=36, max=8483 | dpo, orpo, en, quality | mlabonne/orpo-dpo-mix-40k |
| AI-ModelScope/pile | default | huge dataset | - | pretrain | EleutherAI/pile |
| AI-ModelScope/ruozhiba | post-annual title-good title-norm | 85658 | 40.0±18.3, min=22, max=559 | pretrain, 🔥 | - |
| AI-ModelScope/school_math_0.25M | default | 248481 | 158.8±73.4, min=39, max=980 | chat, math, quality | BelleGroup/school_math_0.25M |
| AI-ModelScope/sharegpt_gpt4 | default V3_format zh_38K_format | 103329 | 3476.6±5959.0, min=33, max=115132 | chat, multilingual, general, multi-round, gpt4, 🔥 | - |
| AI-ModelScope/sql-create-context | default | 78577 | 82.7±31.5, min=36, max=282 | chat, sql, 🔥 | b-mc2/sql-create-context |
| AI-ModelScope/stack-exchange-paired | default | huge dataset | - | hfrl, dpo, pairwise | lvwerra/stack-exchange-paired |
| AI-ModelScope/starcoderdata | default | huge dataset | - | pretrain, quality | bigcode/starcoderdata |
| AI-ModelScope/synthetic_text_to_sql | default | 100000 | 221.8±69.9, min=64, max=616 | nl2sql, en | gretelai/synthetic_text_to_sql |
| AI-ModelScope/texttosqlv2_25000_v2 | default | 25000 | 277.3±328.3, min=40, max=1971 | chat, sql | Clinton/texttosqlv2_25000_v2 |
| AI-ModelScope/the-stack | default | huge dataset | - | pretrain, quality | bigcode/the-stack |
| AI-ModelScope/tigerbot-law-plugin | default | 55895 | 104.9±51.0, min=43, max=1087 | text-generation, law, pretrained | TigerResearch/tigerbot-law-plugin |
| AI-ModelScope/train_0.5M_CN | default | 519255 | 128.4±87.4, min=31, max=936 | common, zh, quality | BelleGroup/train_0.5M_CN |
| AI-ModelScope/train_1M_CN | default | huge dataset | - | common, zh, quality | BelleGroup/train_1M_CN |
| AI-ModelScope/train_2M_CN | default | huge dataset | - | common, zh, quality | BelleGroup/train_2M_CN |
| AI-ModelScope/tulu-v2-sft-mixture | default | 326154 | 523.3±439.3, min=68, max=2549 | chat, multilingual, general, multi-round | allenai/tulu-v2-sft-mixture |
| AI-ModelScope/ultrafeedback-binarized-preferences-cleaned-kto | default | 230720 | 471.5±274.3, min=27, max=2232 | rlhf, kto | - |
| AI-ModelScope/webnovel_cn | default | 50000 | 1455.2±12489.4, min=524, max=490480 | chat, novel | zxbsmk/webnovel_cn |
| AI-ModelScope/wikipedia-cn-20230720-filtered | default | huge dataset | - | pretrain, quality | pleisto/wikipedia-cn-20230720-filtered |
| AI-ModelScope/zhihu_rlhf_3k | default | 3460 | 594.5±365.9, min=31, max=1716 | rlhf, dpo, zh | liyucheng/zhihu_rlhf_3k |
| DAMO_NLP/jd | default cls | 45012 | 66.9±87.0, min=41, max=1699 | text-generation, classification, 🔥 | - |
| FreedomIntelligence/medical-o1-reasoning-SFT | en zh | 50143 | 98.0±53.6, min=36, max=1508 | medical, o1, 🔥 | FreedomIntelligence/medical-o1-reasoning-SFT |
| - | default | huge dataset | - | pretrain, quality | HuggingFaceFW/fineweb |
| - | auto_math_text khanacademy openstax stanford stories web_samples_v1 web_samples_v2 wikihow | huge dataset | - | multi-domain, en, qa | HuggingFaceTB/cosmopedia |
| HumanLLMs/Human-Like-DPO-Dataset | default | 10884 | 47.5±7.9, min=32, max=85 | rlhf, dpo | HumanLLMs/Human-Like-DPO-Dataset |
| LLM-Research/xlam-function-calling-60k | default grpo | 120000 | 453.7±219.5, min=164, max=2779 | agent, grpo, 🔥 | Salesforce/xlam-function-calling-60k |
| MTEB/scidocs-reranking | default | 39193 | 41.9±5.8, min=31, max=107 | rerank, 🔥 | mteb/scidocs-reranking |
| MTEB/stackoverflowdupquestions-reranking | default | 26485 | 39.9±4.6, min=31, max=77 | rerank, 🔥 | mteb/stackoverflowdupquestions-reranking |
| OmniData/Zhihu-KOL | default | huge dataset | - | zhihu, qa | wangrui6/Zhihu-KOL |
| OmniData/Zhihu-KOL-More-Than-100-Upvotes | default | 271261 | 1003.4±1826.1, min=28, max=52541 | zhihu, qa | bzb2023/Zhihu-KOL-More-Than-100-Upvotes |
| PowerInfer/LONGCOT-Refine-500K | default | 521921 | 296.5±158.4, min=39, max=4634 | chat, sft, 🔥, cot | PowerInfer/LONGCOT-Refine-500K |
| PowerInfer/QWQ-LONGCOT-500K | default | 498082 | 310.7±303.1, min=35, max=22941 | chat, sft, 🔥, cot | PowerInfer/QWQ-LONGCOT-500K |
| ServiceNow-AI/R1-Distill-SFT | v0 v1 | 1850809 | 164.2±438.0, min=30, max=32469 | chat, sft, cot, r1 | ServiceNow-AI/R1-Distill-SFT |
| TIGER-Lab/MATH-plus | train | 893929 | 301.4±196.7, min=50, max=1162 | qa, math, en, quality | TIGER-Lab/MATH-plus |
| Tongyi-DataEngine/SA1B-Dense-Caption | default | huge dataset | - | zh, multi-modal, vqa | - |
| Tongyi-DataEngine/SA1B-Paired-Captions-Images | default | 7736284 | 106.4±18.5, min=48, max=193 | zh, multi-modal, vqa | - |
| YorickHe/CoT | default | 74771 | 141.6±45.5, min=58, max=410 | chat, general | - |
| YorickHe/CoT_zh | default | 74771 | 129.1±53.2, min=51, max=401 | chat, general | - |
| ZhipuAI/LongWriter-6k | default | 6000 | 5009.0±2932.8, min=117, max=30354 | long, chat, sft, 🔥 | zai-org/LongWriter-6k |
| - | default | huge dataset | - | pretrain, quality | allenai/c4 |
| bespokelabs/Bespoke-Stratos-17k | default | 16710 | 480.7±236.1, min=266, max=3556 | chat, sft, cot, r1 | bespokelabs/Bespoke-Stratos-17k |
| - | default | huge dataset | - | pretrain, quality | cerebras/SlimPajama-627B |
| clip-benchmark/wds_voc2007_multilabel | default | 2501 | 112.0±0.0, min=112, max=112 | multilabel, multi-modal | clip-benchmark/wds_voc2007_multilabel |
| codefuse-ai/CodeExercise-Python-27k | default | 27224 | 337.3±154.2, min=90, max=2826 | chat, coding, 🔥 | - |
| codefuse-ai/Evol-instruction-66k | default | 66862 | 440.1±208.4, min=46, max=2661 | chat, coding, 🔥 | - |
| damo/MSAgent-Bench | default mini | 638149 | 859.2±460.1, min=38, max=3479 | chat, agent, multi-round | - |
| damo/nlp_polylm_multialpaca_sft | ar de es fr id ja ko pt ru th vi | 131867 | 101.6±42.5, min=30, max=1029 | chat, general, multilingual | - |
| damo/zh_cls_fudan-news | default | 4959 | 3234.4±2547.5, min=91, max=19548 | chat, classification | - |
| damo/zh_ner-JAVE | default | 1266 | 118.3±45.5, min=44, max=223 | chat, ner | - |
| hjh0119/shareAI-Llama3-DPO-zh-en-emoji | default | 2449 | 334.0±162.8, min=36, max=1801 | rlhf, dpo | shareAI/DPO-zh-en-emoji |
| huangjintao/AgentInstruct_copy | alfworld db kg mind2web os webshop | 1866 | 1144.3±635.5, min=206, max=6412 | chat, agent, multi-round | - |
| iic/100PoisonMpts | default | 906 | 150.6±80.8, min=39, max=656 | poison-management, zh | - |
| iic/DocQA-RL-1.6K | default | 1591 | 8307.3±7748.9, min=202, max=32563 | docqa, rl, long-sequence | Tongyi-Zhiwen/DocQA-RL-1.6K |
| iic/MSAgent-MultiRole | default | 543 | 413.0±79.7, min=70, max=936 | chat, agent, multi-round, role-play, multi-agent | - |
| iic/MSAgent-Pro | default | 21910 | 1978.1±747.9, min=339, max=8064 | chat, agent, multi-round, 🔥 | - |
| iic/ms_agent | default | 30000 | 645.8±218.0, min=199, max=2070 | chat, agent, multi-round, 🔥 | - |
| iic/ms_bench | default | 316820 | 353.4±424.5, min=29, max=2924 | chat, general, multi-round, 🔥 | - |
| liucong/Chinese-DeepSeek-R1-Distill-data-110k-SFT | default | 110000 | 72.1±60.9, min=29, max=2315 | chat, sft, cot, r1, 🔥 | Congliu/Chinese-DeepSeek-R1-Distill-data-110k-SFT |
| - | default | huge dataset | - | multi-modal, en, vqa, quality | lmms-lab/GQA |
| - | 0_30_s_academic_v0_1 0_30_s_youtube_v0_1 1_2_m_academic_v0_1 1_2_m_youtube_v0_1 2_3_m_academic_v0_1 2_3_m_youtube_v0_1 30_60_s_academic_v0_1 30_60_s_youtube_v0_1 | 1335486 | 273.7±78.8, min=107, max=638 | chat, multi-modal, video | lmms-lab/LLaVA-Video-178K |
| lmms-lab/multimodal-open-r1-8k-verified | default | 7689 | 74.0±24.8, min=41, max=214 | grpo, vision, 🔥 | lmms-lab/multimodal-open-r1-8k-verified |
| lvjianjin/AdvertiseGen | default | 97484 | 130.9±21.9, min=73, max=232 | text-generation, 🔥 | shibing624/AdvertiseGen |
| mapjack/openwebtext_dataset | default | huge dataset | - | pretrain, zh, quality | - |
| modelscope/DuReader_robust-QG | default | 17899 | 242.0±143.1, min=75, max=1416 | text-generation, 🔥 | - |
| modelscope/MathR | default clean | 6089 | 188.7±75.3, min=64, max=3341 | qa, math | - |
| modelscope/MathR-32B-Distill | data | 25921 | 209.4±63.1, min=121, max=3407 | qa, math | - |
| modelscope/chinese-poetry-collection | default | 1710 | 58.1±8.1, min=31, max=71 | text-generation, poetry | - |
| modelscope/clue | cmnli | 391783 | 81.6±16.0, min=54, max=157 | text-generation, classification | clue |
| modelscope/coco_2014_caption | train validation | 454617 | 389.6±68.4, min=70, max=587 | chat, multi-modal, vision, 🔥 | - |
| modelscope/gsm8k | main | 7473 | 88.6±21.6, min=41, max=241 | qa, math | - |
| open-r1/DAPO-Math-17k-Processed | all | 17398 | 122.3±65.2, min=41, max=1517 | math, rlvr | open-r1/DAPO-Math-17k-Processed |
| open-r1/verifiable-coding-problems-python | default | 35735 | 559.0±255.2, min=74, max=6191 | grpo, code | open-r1/verifiable-coding-problems-python |
| open-r1/verifiable-coding-problems-python-10k | default | 1800 | 581.6±233.4, min=136, max=2022 | grpo, code | open-r1/verifiable-coding-problems-python-10k |
| open-r1/verifiable-coding-problems-python-10k_decontaminated | default | 1574 | 575.7±234.3, min=136, max=2022 | grpo, code | open-r1/verifiable-coding-problems-python-10k_decontaminated |
| open-r1/verifiable-coding-problems-python_decontaminated | default | 27839 | 561.9±252.2, min=74, max=6191 | grpo, code | open-r1/verifiable-coding-problems-python_decontaminated |
| open-thoughts/OpenThoughts-114k | default | 113957 | 413.2±186.9, min=265, max=13868 | chat, sft, cot, r1 | open-thoughts/OpenThoughts-114k |
| swift/self-cognition | default qwen3 empty_think | 108 | 58.9±20.3, min=32, max=131 | chat, self-cognition, 🔥 | modelscope/self-cognition |
| sentence-transformers/stsb | default positive generate reg | 5748 | 21.0±0.0, min=21, max=21 | similarity, 🔥 | sentence-transformers/stsb |
| shenweizhou/alpha-umi-toolbench-processed-v2 | backbone caller planner summarizer | huge dataset | - | chat, agent, 🔥 | - |
| simpleai/HC3 | finance finance_cls medicine medicine_cls | 11021 | 296.0±153.3, min=65, max=2267 | text-generation, classification, 🔥 | Hello-SimpleAI/HC3 |
| simpleai/HC3-Chinese | baike baike_cls open_qa open_qa_cls nlpcc_dbqa nlpcc_dbqa_cls finance finance_cls medicine medicine_cls law law_cls psychology psychology_cls | 39781 | 179.9±70.2, min=90, max=1070 | text-generation, classification, 🔥 | Hello-SimpleAI/HC3-Chinese |
| speech_asr/speech_asr_aishell1_trainsets | train validation test | 141600 | 40.8±3.3, min=33, max=53 | chat, multi-modal, audio | - |
| swift/A-OKVQA | default | 18201 | 43.5±7.9, min=27, max=94 | multi-modal, en, vqa, quality | HuggingFaceM4/A-OKVQA |
| swift/ChartQA | default | 28299 | 36.8±6.5, min=26, max=74 | en, vqa, quality | HuggingFaceM4/ChartQA |
| swift/Chinese-Qwen3-235B-2507-Distill-data-110k-SFT | default | 110000 | 72.1±60.9, min=29, max=2315 | 🔥, distill, sft | - |
| swift/Chinese-Qwen3-235B-Thinking-2507-Distill-data-110k-SFT | default | 110000 | 72.1±60.9, min=29, max=2315 | 🔥, distill, sft, cot, r1, thinking | - |
| swift/GRIT | caption grounding vqa | huge dataset | - | multi-modal, en, caption-grounding, vqa, quality | zzliang/GRIT |
| swift/GenQA | default | huge dataset | - | qa, quality, multi-task | tomg-group-umd/GenQA |
| swift/Infinity-Instruct | 3M 7M 0625 Gen 7M_domains | huge dataset | - | qa, quality, multi-task | BAAI/Infinity-Instruct |
| swift/Mantis-Instruct | birds-to-words chartqa coinstruct contrastive_caption docvqa dreamsim dvqa iconqa imagecode llava_665k_multi lrv_multi multi_vqa nextqa nlvr2 spot-the-diff star visual_story_telling | 988115 | 619.9±156.6, min=243, max=1926 | chat, multi-modal, vision | - |
| swift/MideficsDataset | default | 3800 | 201.3±70.2, min=60, max=454 | medical, en, vqa | WinterSchool/MideficsDataset |
| swift/Multimodal-Mind2Web | default | 1009 | 293855.4±331149.5, min=11301, max=3577519 | agent, multi-modal | osunlp/Multimodal-Mind2Web |
| swift/OCR-VQA | default | 186753 | 32.3±5.8, min=27, max=80 | multi-modal, en, ocr-vqa | howard-hou/OCR-VQA |
| swift/OK-VQA_train | default | 9009 | 31.7±3.4, min=25, max=56 | multi-modal, en, vqa, quality | Multimodal-Fatima/OK-VQA_train |
| swift/OpenHermes-2.5 | default | huge dataset | - | cot, en, quality | teknium/OpenHermes-2.5 |
| swift/RLAIF-V-Dataset | default | 83132 | 99.6±54.8, min=30, max=362 | rlhf, dpo, multi-modal, en | openbmb/RLAIF-V-Dataset |
| swift/RedPajama-Data-1T | default | huge dataset | - | pretrain, quality | togethercomputer/RedPajama-Data-1T |
| swift/RedPajama-Data-V2 | default | huge dataset | - | pretrain, quality | togethercomputer/RedPajama-Data-V2 |
| swift/ScienceQA | default | 16967 | 101.7±55.8, min=32, max=620 | multi-modal, science, vqa, quality | derek-thomas/ScienceQA |
| swift/SlimOrca | default | 517982 | 405.5±442.1, min=47, max=8312 | quality, en | Open-Orca/SlimOrca |
| swift/TextCaps | default emb rerank | huge dataset | - | multi-modal, en, caption, quality | HuggingFaceM4/TextCaps |
| swift/ToolBench | default | 124345 | 2251.7±1039.8, min=641, max=9451 | chat, agent, multi-round | - |
| swift/VQAv2 | default | huge dataset | - | en, vqa, quality | HuggingFaceM4/VQAv2 |
| swift/VideoChatGPT | Generic Temporal Consistency | 3206 | 87.4±48.3, min=31, max=398 | chat, multi-modal, video, 🔥 | lmms-lab/VideoChatGPT |
| swift/WebInstructSub | default | huge dataset | - | qa, en, math, quality, multi-domain, science | TIGER-Lab/WebInstructSub |
| swift/aya_collection | aya_dataset | 202364 | 474.6±1539.1, min=25, max=71312 | multi-lingual, qa | CohereForAI/aya_collection |
| swift/chinese-c4 | default | huge dataset | - | pretrain, zh, quality | shjwudp/chinese-c4 |
| swift/cinepile | default | huge dataset | - | vqa, en, youtube, video | tomg-group-umd/cinepile |
| swift/classical_chinese_translate | default | 6655 | 349.3±77.1, min=61, max=815 | chat, play-ground | - |
| swift/cosmopedia-100k | default | 100000 | 1037.0±254.8, min=339, max=2818 | multi-domain, en, qa | HuggingFaceTB/cosmopedia-100k |
| swift/dolma | v1_7 | huge dataset | - | pretrain, quality | allenai/dolma |
| swift/dolphin | flan1m-alpaca-uncensored flan5m-alpaca-uncensored | huge dataset | - | en | cognitivecomputations/dolphin |
| swift/github-code | default | huge dataset | - | pretrain, quality | codeparrot/github-code |
| swift/gpt4v-dataset | default | huge dataset | - | en, caption, multi-modal, quality | laion/gpt4v-dataset |
| swift/llava-data | llava_instruct | 624255 | 369.7±143.0, min=40, max=905 | sft, multi-modal, quality | TIGER-Lab/llava-data |
| swift/llava-instruct-mix-vsft | default | 13640 | 178.8±119.8, min=34, max=951 | multi-modal, en, vqa, quality | HuggingFaceH4/llava-instruct-mix-vsft |
| swift/llava-med-zh-instruct-60k | default | 56649 | 207.9±67.7, min=42, max=594 | zh, medical, vqa, multi-modal | BUAADreamer/llava-med-zh-instruct-60k |
| swift/lnqa | default | huge dataset | - | multi-modal, en, ocr-vqa, quality | vikhyatk/lnqa |
| swift/longwriter-6k-filtered | default | 666 | 4108.9±2636.9, min=1190, max=17050 | long, chat, sft, 🔥 | - |
| swift/medical_zh | en zh | 2068589 | 256.4±87.3, min=39, max=1167 | chat, medical | - |
| swift/moondream2-coyo-5M-captions | default | huge dataset | - | caption, pretrain, quality | isidentical/moondream2-coyo-5M-captions |
| swift/no_robots | default | 9485 | 300.0±246.2, min=40, max=6739 | multi-task, quality, human-annotated | HuggingFaceH4/no_robots |
| swift/orca_dpo_pairs | default | 12859 | 364.9±248.2, min=36, max=2010 | rlhf, quality | Intel/orca_dpo_pairs |
| swift/path-vqa | default | 19654 | 34.2±6.8, min=28, max=85 | multi-modal, vqa, medical | flaviagiammarino/path-vqa |
| swift/pile-val-backup | default | 214661 | 1831.4±11087.5, min=21, max=516620 | text-generation, awq | mit-han-lab/pile-val-backup |
| swift/pixelprose | default | huge dataset | - | caption, multi-modal, vision | tomg-group-umd/pixelprose |
| swift/refcoco | caption grounding | 92430 | 45.4±3.0, min=37, max=63 | multi-modal, en, grounding | jxu124/refcoco |
| swift/refcocog | caption grounding | 89598 | 50.3±4.6, min=39, max=91 | multi-modal, en, grounding | jxu124/refcocog |
| swift/sharegpt | common-zh unknow-zh common-en | 194063 | 820.5±366.1, min=25, max=2221 | chat, general, multi-round | - |
| swift/swift-sft-mixture | sharegpt firefly codefuse metamathqa | huge dataset | - | chat, sft, general, 🔥 | - |
| swift/tagengo-gpt4 | default | 76437 | 468.1±276.8, min=28, max=1726 | chat, multi-lingual, quality | lightblue/tagengo-gpt4 |
| swift/train_3.5M_CN | default | huge dataset | - | common, zh, quality | BelleGroup/train_3.5M_CN |
| swift/ultrachat_200k | default | 207843 | 1188.0±571.1, min=170, max=4068 | chat, en, quality | HuggingFaceH4/ultrachat_200k |
| swift/wikipedia | default | huge dataset | - | pretrain, quality | wikipedia |
| tany0699/garbage265 | default | 132673 | 39.0±0.0, min=39, max=39 | cls, 🔥, multi-modal | - |
| tastelikefeet/competition_math | default | 12000 | 101.9±87.3, min=36, max=1683 | qa, math | - |
| - | default | huge dataset | - | pretrain, quality | tiiuae/falcon-refinedweb |
| wyj123456/GPT4all | default | 806199 | 97.3±20.9, min=62, max=414 | chat, general | - |
| wyj123456/code_alpaca_en | default | 20022 | 99.3±57.6, min=30, max=857 | chat, coding | sahil2801/CodeAlpaca-20k |
| wyj123456/finance_en | default | 68912 | 264.5±207.1, min=30, max=2268 | chat, financial | ssbuild/alpaca_finance_en |
| wyj123456/instinwild | default subset | 103695 | 125.1±43.7, min=35, max=801 | chat, general | - |
| wyj123456/instruct | default | 888970 | 271.0±333.6, min=34, max=3967 | chat, general | - |
| zouxuhong/Countdown-Tasks-3to4 | default | 490364 | 126.6±2.0, min=122, max=130 | math | - |
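
To filter this table programmatically (for example, to find datasets whose longest sample fits a given context length), the rows can be parsed with a few lines of Python. The sketch below is not part of ms-swift. It assumes each row has six pipe-separated cells in the order dataset ID, subset names, dataset size, token statistics, tags, HF dataset ID, and that the statistics cell has the form `mean±std, min=…, max=…`. The names `TokenStats`, `parse_token_stats`, and `parse_row` are hypothetical helpers introduced only for illustration.

```python
import re
from typing import NamedTuple, Optional


class TokenStats(NamedTuple):
    """Token-length statistics taken from one table cell."""
    mean: float
    std: float
    min: int
    max: int


# Matches cells such as "117.6±44.9, min=41, max=312".
_STATS_RE = re.compile(r"^\s*([\d.]+)±([\d.]+),\s*min=(\d+),\s*max=(\d+)\s*$")


def parse_token_stats(cell: str) -> Optional[TokenStats]:
    """Return parsed statistics, or None when the cell is "-" or malformed."""
    match = _STATS_RE.match(cell)
    if match is None:
        return None
    mean, std, lo, hi = match.groups()
    return TokenStats(float(mean), float(std), int(lo), int(hi))


def parse_row(line: str) -> dict:
    """Split one table row into named fields (assumed six-column layout)."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    if len(cells) != 6:
        raise ValueError(f"expected 6 cells, got {len(cells)}")
    dataset_id, subsets, size, stats, tags, hf_id = cells
    return {
        # "-" marks a missing ID in the table.
        "dataset_id": None if dataset_id == "-" else dataset_id,
        "subsets": subsets.split(),
        # Rows marked "huge dataset" carry no numeric size.
        "size": int(size) if size.isdigit() else None,
        "token_stats": parse_token_stats(stats),
        "tags": [t.strip() for t in tags.split(",")],
        "hf_id": None if hf_id == "-" else hf_id,
    }


if __name__ == "__main__":
    row = ("| AI-ModelScope/MATH-lighteval | default | 7500 "
           "| 104.4±92.8, min=36, max=1683 | grpo, math "
           "| DigitalLearningGmbH/MATH-lighteval |")
    parsed = parse_row(row)
    stats = parsed["token_stats"]
    # Keep the dataset only if its longest sample fits a 2048-token budget.
    print(parsed["dataset_id"], stats is not None and stats.max <= 2048)
```

Rows marked `huge dataset` yield `size=None` and `token_stats=None`, so callers should treat missing statistics as unknown rather than as zero.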