Starred repositories
verl: Volcano Engine Reinforcement Learning for LLMs
Reading notes about Multimodal Large Language Models, Large Language Models, and Diffusion Models
Phantom: Subject-Consistent Video Generation via Cross-Modal Alignment
🌟 Wiki of OI / ICPC for everyone. (An online strategy guide for a certain massive game, featuring dazzling arithmetic magic)
VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo
[NeurIPS 2025 D&B🔥] OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation
A curated list of papers on reinforcement learning for video generation
Kandinsky 5.0: A family of diffusion models for Video & Image generation
Official code for "VideoReward Thinker: Boosting Video Reward Models through Thinking-with-Image Reasoning"
(arXiv) MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE
[ICLR 2026] EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
Qwen-Image-Lightning: Speed up Qwen-Image model with distillation
HunyuanImage-3.0: A Powerful Native Multimodal Model for Image Generation
[ICLR 2026] LongLive: Real-time Interactive Long Video Generation
IamCreateAI / FlowCPS
Forked from yifan123/flow_grpo
An official implementation of Coefficients-Preserving Sampling for Reinforcement Learning with Flow Matching
HunyuanImage-2.1: An Efficient Diffusion Model for High-Resolution (2K) Text-to-Image Generation
Pusa: Thousands Timesteps Video Diffusion Model
Official implementation of Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Industry-level video foundation model for unified Text-to-Video (T2V) and Image-to-Video (I2V) generation.
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
Wan: Open and Advanced Large-Scale Video Generative Models


