TPRL is a reinforcement-learning-based visual token pruning framework that accelerates inference in Large Vision-Language Models (LVLMs).
TPRL formulates visual token pruning as a Markov Decision Process (MDP) and proceeds in three stages:
- Learning from Demonstrations (LfD): Generate demonstration trajectories using heuristics and pretrain the policy network.
- PPO Fine-tuning: Fine-tune the policy with Proximal Policy Optimization to jointly optimize task performance and computational efficiency.
- Inference: One-shot pruning that retains the most important visual tokens.
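A minimal sketch of the one-shot pruning step in the inference stage, assuming the policy network emits a per-token importance score (the function name, signature, and fixed keep ratio below are illustrative, not the repository's API):

```python
def one_shot_prune(tokens, keep_scores, keep_ratio=0.25):
    """Keep the highest-scoring visual tokens in a single pass.

    tokens:      list of token embeddings (one vector per visual token)
    keep_scores: per-token importance scores from the pruning policy
    keep_ratio:  fraction of visual tokens to retain
    """
    n_keep = max(1, int(len(tokens) * keep_ratio))
    # Rank tokens by score, then restore the original (spatial) order
    top = sorted(range(len(tokens)), key=lambda i: keep_scores[i], reverse=True)[:n_keep]
    keep_idx = sorted(top)
    return [tokens[i] for i in keep_idx], keep_idx

# Toy usage: 8 tokens, keep the top quarter
tokens = [[float(i)] for i in range(8)]
scores = [0.1, 0.9, 0.3, 0.8, 0.2, 0.4, 0.7, 0.5]
kept, idx = one_shot_prune(tokens, scores, keep_ratio=0.25)
print(idx)  # [1, 3]
```

Because the selection happens once (rather than layer by layer), the pruned token sequence can be passed straight to the LLM with no further overhead.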
## Architecture

```
visual input → ViT → Projector → [TPRL pruner] → LLM → output
```

## Installation

```bash
# Clone the repository
git clone https://github.com/MagicVicCoder/TPRL.git
cd TPRL

# Install requirements
pip install -r requirements.txt
```

## Training

```bash
# Stage 1: Learning from Demonstrations (LfD)
python train_lfd.py

# Stage 2: PPO fine-tuning
# Set the LfD checkpoint path in config.py first
python train_ppo.py
```

## Inference

```bash
python main.py
```

## Project Structure

```
TPRL/
├── model/
│   ├── autoencoder.py     # Token compression (optional)
│   ├── rl_networks.py     # Policy and value networks
│   ├── llava_mllm.py      # LLaVA model wrapper
│   └── qwen_mllm.py       # Qwen model wrapper
├── pruner/
│   ├── rl_pruner.py       # RL-based pruner
│   ├── random_pruner.py   # Baseline random pruner
│   └── mlp_pruner.py      # MLP-based pruner
├── train_lfd.py           # LfD training script
├── train_ppo.py           # PPO training script
├── config.py              # Configuration
└── main.py                # Evaluation / inference script
```

## MDP Formulation

- State: (visual tokens, text query)
- Action: keep / prune decision for each token
- Reward: downstream task performance + computational efficiency
```
reward = alpha * task_reward + beta * efficiency_reward
```

- `task_reward`: change in downstream task performance (e.g., IoU / accuracy)
- `efficiency_reward`: compression / efficiency metric
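As a minimal sketch of how this weighted reward could be computed (the function name, default weights, and the concrete efficiency metric below are assumptions, not the repository's API):

```python
def compute_reward(task_metric_new, task_metric_old,
                   tokens_kept, tokens_total,
                   alpha=1.0, beta=0.5):
    """Weighted sum of task and efficiency rewards.

    task_reward:       change in the downstream metric (e.g., IoU / accuracy)
    efficiency_reward: fraction of visual tokens pruned away (illustrative choice)
    """
    task_reward = task_metric_new - task_metric_old
    efficiency_reward = 1.0 - tokens_kept / tokens_total
    return alpha * task_reward + beta * efficiency_reward

# Pruning 75% of tokens with no change in accuracy yields reward = 0.5 * 0.75
print(compute_reward(0.80, 0.80, tokens_kept=144, tokens_total=576))  # 0.375
```

The two weights trade accuracy against compute: a larger `beta` pushes the policy toward more aggressive pruning, while a larger `alpha` penalizes any drop in task performance more heavily.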
## Requirements

- Python >= 3.8
- PyTorch >= 2.0
- Transformers >= 4.37.0
- See `requirements.txt` for the full dependency list
⭐ If you find this repository useful, please give it a Star!
## Citation

If you find this work useful, please cite:
```bibtex
@misc{cao2026languageguidedtokencompressionreinforcement,
  title={Language-Guided Token Compression with Reinforcement Learning in Large Vision-Language Models},
  author={Sihan Cao and Jianwei Zhang and Pengcheng Zheng and Jiaxin Yan and Caiyan Qin and Yalan Ye and Wei Dong and Peng Wang and Yang Yang and Chaoning Zhang},
  year={2026},
  eprint={2603.13394},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2603.13394}
}
```