
Feat: add model support for penguinvl #1257

Open
taintaintainu wants to merge 2 commits into EvolvingLMMs-Lab:main from taintaintainu:feat/add-model-penguinvl

Conversation

@taintaintainu
Contributor

Summary

  • Add a new simple-model integration for Penguin-VL exposed as --model penguinvl in lmms-eval.
  • Register penguinvl in the model registry and add an example launch script for multi-benchmark evaluation.

In scope

  • Add lmms_eval/models/simple/penguinvl.py, register the model ID, provide examples/models/penguin_vl.sh, and add penguinvl prompt overrides for mmmu_pro_standard and mmmu_pro_vision.
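Model registration in this kind of harness typically follows a decorator-based registry that maps the CLI model ID (here, `penguinvl`) to its implementing class. A minimal, self-contained sketch of that pattern is below; note that `MODEL_REGISTRY`, `register_model`, and `get_model` are illustrative placeholder names, not the actual lmms-eval API.

```python
# Sketch of a decorator-based model registry (illustrative only; these
# names are placeholders, not the actual lmms-eval internals).
MODEL_REGISTRY = {}

def register_model(name):
    """Map a CLI model ID (e.g. --model penguinvl) to its class."""
    def decorator(cls):
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator

@register_model("penguinvl")
class PenguinVL:
    # Default checkpoint taken from the validation runs in this PR.
    def __init__(self, pretrained="tencent/Penguin-VL-8B", **kwargs):
        self.pretrained = pretrained

def get_model(name):
    """Resolve a registered model ID back to its class."""
    return MODEL_REGISTRY[name]
```

With this pattern, `--model penguinvl` resolves to the class via the registry lookup, and `--model_args` key/value pairs are forwarded to the constructor.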

Out of scope

  • No new benchmark/task is introduced, and no metric/aggregation logic or dataset definitions are changed outside the Penguin-VL-specific prompt configuration.

Validation

  • accelerate launch --num_processes=8 --main_process_port=12346 -m lmms_eval --model penguinvl --model_args=pretrained=tencent/Penguin-VL-8B,attn_implementation=flash_attention_2,dtype=bfloat16 --tasks "ai2d,mmmu_pro_standard,ocrbench" --batch_size 1 --log_samples --log_samples_suffix penguinvl --verbosity DEBUG --output_path ./logs/
    Sample size: N = 3088 + 1730 + 1000. Key metrics: ai2d exact_match = 0.8491; mmmu_pro_standard mmmu_acc = 0.32139; ocrbench_accuracy = 0.8430. Result: pass.
  • accelerate launch --num_processes=8 --main_process_port=12346 -m lmms_eval --model penguinvl --model_args=pretrained=tencent/Penguin-VL-8B,attn_implementation=flash_attention_2,dtype=bfloat16 --tasks "videomme,longvideobench_val_v" --batch_size 1 --log_samples --log_samples_suffix penguinvl --verbosity DEBUG --output_path ./logs/
    Sample size: N = 2700 + 1337. Key metrics: videomme_perception_score = 66.30; longvideobench_val_v lvb_acc = 0.64996. Result: pass.
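The example launch script added under examples/models/penguin_vl.sh presumably wraps an invocation like the first validation run above; a sketch along those lines (flags and task list copied from that run, though the actual script contents may differ) is:

```shell
#!/usr/bin/env bash
# Sketch of a multi-benchmark launch script for Penguin-VL, mirroring the
# first validation command in this PR; the shipped examples/models/penguin_vl.sh
# may differ in its exact task list and paths.
accelerate launch --num_processes=8 --main_process_port=12346 -m lmms_eval \
    --model penguinvl \
    --model_args=pretrained=tencent/Penguin-VL-8B,attn_implementation=flash_attention_2,dtype=bfloat16 \
    --tasks "ai2d,mmmu_pro_standard,ocrbench" \
    --batch_size 1 \
    --log_samples \
    --log_samples_suffix penguinvl \
    --verbosity DEBUG \
    --output_path ./logs/
```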

Risk / Compatibility

  • Runtime compatibility depends on the upstream Penguin-VL Hugging Face implementation; this integration was evaluated with transformers==4.51.3 and attn_implementation=flash_attention_2.

Type of Change

  • [ ] Bug fix (non-breaking change)
  • [ ] New feature
  • [ ] New benchmark/task
  • [x] New model integration
  • [ ] Breaking change
  • [ ] Documentation update
  • [ ] Refactoring (no functional changes)