
Add Qwen3-VL dense models #670

Merged
cg123 merged 1 commit into arcee-ai:main from pengdurice:peng-add-qwen3-vl-dense
Mar 15, 2026


Contributor

pengdurice commented Mar 11, 2026

Adds support for Qwen3-VL dense models. Locally tested with Qwen3-VL-4B-Instruct, Qwen3-VL-8B-Thinking, and Qwen3-VL-32B-Instruct.


Note

Medium Risk
Adds a new architecture JSON that affects how MergeKit identifies and maps Qwen3-VL dense model weights; incorrect keys/weight names could break loading or merges for these models. Change is isolated to data/config with no broader code logic modifications.
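One way to catch this kind of mismatch before attempting a merge is to compare the weight names a definition expects against the tensor names a checkpoint actually ships. The following is a minimal sketch, assuming the checkpoint provides a sharded safetensors index whose `weight_map` maps tensor names to shard files. The function name `diff_weight_names` and all sample tensor names are hypothetical and are not taken from this PR or from MergeKit.

```python
def diff_weight_names(expected, index):
    """Compare expected weight names with those listed in a safetensors index.

    `index` is the parsed content of a model.safetensors.index.json file.
    Returns (missing, unexpected) as sorted lists.
    """
    actual = set(index["weight_map"])
    expected = set(expected)
    missing = sorted(expected - actual)      # expected but absent from the checkpoint
    unexpected = sorted(actual - expected)   # present in the checkpoint but unmapped
    return missing, unexpected


# Hypothetical index content and expected names, made up for illustration.
sample_index = {
    "metadata": {"total_size": 0},
    "weight_map": {
        "decoder.layers.0.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
        "vision.blocks.0.attn.qkv.weight": "model-00002-of-00002.safetensors",
    },
}
sample_expected = [
    "decoder.layers.0.mlp.up_proj.weight",
    "decoder.layers.0.mlp.down_proj.weight",
]

missing, unexpected = diff_weight_names(sample_expected, sample_index)
print("missing:", missing)
print("unexpected:", unexpected)
```

An empty result for both lists would suggest the definition and the checkpoint agree on naming; it does not by itself confirm that a merge will succeed.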

Overview
Adds a new modular architecture definition qwen3_vl_dense.json to support dense Qwen3-VL models (Qwen3VLForConditionalGeneration).

The definition introduces module weight mappings for the text decoder (dense MLP projections), the multi-modal projector (deepstack merger layers), and the vision tower blocks, along with updated config keys (vision_config.depth and text_config.vocab_size) and the required tagalong files.
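As a rough illustration of how a definition like this can be driven by nested config keys, the sketch below resolves a dotted key such as `vision_config.depth` and expands a per-layer weight-name template once per layer. The helper names, the `${layer_index}` placeholder, the sample weight name, and the config values are all assumptions made for illustration; they are not copied from qwen3_vl_dense.json or from MergeKit's code.

```python
from string import Template


def get_nested(config, dotted_key):
    """Resolve a dotted key such as 'vision_config.depth' in a nested dict."""
    value = config
    for part in dotted_key.split("."):
        value = value[part]
    return value


def expand_layer_weights(template, num_layers):
    """Fill a per-layer weight-name template for every layer index."""
    return [
        Template(template).substitute(layer_index=i) for i in range(num_layers)
    ]


# Hypothetical config fragment; the values are made up for the example.
config = {
    "text_config": {"vocab_size": 1000},
    "vision_config": {"depth": 3},
}

# Hypothetical weight-name template, not the one used in the actual definition.
vision_template = "vision.blocks.${layer_index}.attn.qkv.weight"

vision_depth = get_nested(config, "vision_config.depth")
vision_weights = expand_layer_weights(vision_template, vision_depth)
print(vision_weights)
```

Reading the layer count from the model's own config rather than hard-coding it is what would let a single definition cover several model sizes, such as the three checkpoints listed in the PR description.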

Written by Cursor Bugbot for commit f1f257f.

Signed-off-by: pengdurice <pengduhit@gmail.com>
Collaborator

cg123 commented Mar 15, 2026

Thanks for the contribution!

cg123 merged commit 7111360 into arcee-ai:main Mar 15, 2026
6 of 8 checks passed
github-actions bot locked and limited conversation to collaborators Mar 15, 2026
