Jiaxu Zhang | 张嘉旭

I am a Ph.D. student at Wuhan University, China, under the supervision of Prof. Deren Li and Prof. Zhigang Tu. Since February 2025, I have also been a Joint Ph.D. student at Nanyang Technological University, advised by Prof. Guosheng Lin.

Currently, I am a research intern at ByteDance Seed, focusing on AIGC and MLLM. Previously, I interned at Tencent from 2022 to 2024, and at StepFun from 2024 to 2025, where I was advised by Dr. Gang Yu.

My research interests lie in computer vision and computer graphics, with a particular focus on Multimodal AIGC, 2D/3D character animation, video/motion generation, retargeting, and recognition.

I expect to graduate in 2026 and am open to postdoc and research scientist opportunities.

Email / CV / Google Scholar / GitHub / WeChat


News

[2025/06] 🎉 MikuDance was accepted by ICCV 2025 (Oral).

[2025/01] 🕹️ MikuDance was recently launched on Lipu, an AI creation community designed for animation enthusiasts. Feel free to give it a try!

[2024/07] 🎉 One paper was accepted to ACM MM 2024.

[2024/04] 🕹️ I've released a repository, Freehand-Genshin-Diffusion, that transforms Genshin PVs into a freehand style using a diffusion model. Feel free to give it a try!

[2024/04] 🎉 One paper, an extension of our CVPR 2023 paper, was accepted by IEEE T-PAMI.

[2024/01] 🎉 One paper was accepted to ICLR 2024.

[2023/06] 📌 I gave an oral presentation on Virtual Animation Technology at VALSE 2023.

[2023/02] 🎉 One paper was accepted to CVPR 2023.

Research

My research interests are broadly in 3D/2D Computer Vision and Generative AI. My overarching research objective is to advance AI-driven methods that augment and amplify human creativity. Specifically, I aim to develop intelligent 3D/2D vision and generative models that enable humans to more intuitively create, edit, and control lifelike virtual avatars, scenes, and animations.

	FlowAct-R1: Towards Interactive Humanoid Video Generation
	FlowAct Team, ByteDance Intelligent Creation
	Tech Report, 2026
	project page / arxiv
	We present FlowAct-R1, a novel framework that enables lifelike, responsive, and high-fidelity humanoid video generation for seamless real-time interaction.

	Bridging Your Imagination with Audio-Video Generation via a Unified Director
	Jiaxu Zhang, Tianshu Hu, Yuan Zhang, Zenan Li, Linjie Luo, Guosheng Lin, Xin Chen
	ArXiv, 2025
	project page / arxiv
	UniMAGE unifies script drafting, extension, continuation, and keyframe image generation, enabling coherent long-form storytelling with consistent characters and cinematic visual compositions. The generated scripts and keyframes can further serve as structured, high-level guidance for existing joint audio-video generation models.

	DreamDance: Animating Character Art via Inpainting Stable Gaussian Worlds
	Jiaxu Zhang, Xianfang Zeng, Xin Chen, Wei Zuo, Gang Yu, Guosheng Lin, Zhigang Tu
	ArXiv, 2025
	project page / code / arxiv
	We propose DreamDance, a novel paradigm that reformulates the character art animation task into two inpainting-based steps: Camera-aware Scene Inpainting for stable scene reconstruction and Pose-aware Video Inpainting for dynamic character animation.

	MikuDance: Animating Character Art with Mixed Motion Dynamics
	Jiaxu Zhang, Xianfang Zeng, Xin Chen, Wei Zuo, Gang Yu, Zhigang Tu
	Proceedings of the International Conference on Computer Vision (ICCV, Oral), 2025
	project page / code / arxiv
	We propose MikuDance, a diffusion-based pipeline incorporating mixed motion dynamics to animate stylized character art.

	Freehand-Genshin-Diffusion
	A project for transforming Genshin PVs into a freehand style using a diffusion model. I've been exploring 2D image animation recently. This project is purely for fun. Feel free to reach out and discuss it with me.

	A Modular Neural Motion Retargeting System Decoupling Skeleton and Shape Perception
	Jiaxu Zhang, Zhigang Tu, Junwu Weng, Junsong Yuan, Bo Du
	IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI)*, 2024
	code / arxiv
	M-R2ET is a modular neural motion retargeting system that transfers motion between characters whose structures differ but correspond to homeomorphic graphs, while preserving motion semantics and perceiving shape geometries.

	Generative Motion Stylization of Cross-structure Characters within Canonical Motion Space
	Jiaxu Zhang, Xin Chen, Gang Yu, Zhigang Tu*
	Proceedings of the 32nd ACM International Conference on Multimedia (ACM MM), 2024
	arxiv
	We present MotionS, a generative motion stylization pipeline that synthesizes diverse, stylized motion for cross-structure characters using cross-modality style prompts.

	TapMo: Shape-aware Motion Generation of Skeleton-free Characters
	Jiaxu Zhang#, Shaoli Huang#, Zhigang Tu, Xin Chen, Xiaohang Zhan, Gang Yu, Ying Shan
	The Twelfth International Conference on Learning Representations (ICLR)*, 2024
	project page / code / arxiv
	TapMo is a text-based animation pipeline for generating motion in a wide variety of skeleton-free characters.

	Skinned Motion Retargeting with Residual Perception of Motion Semantics & Geometry
	Jiaxu Zhang, Junwu Weng, Di Kang, Fang Zhao, Shaoli Huang, Xuefei Zhe, Linchao Bao, Ying Shan, Jue Wang, Zhigang Tu*
	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
	project page / code / arxiv
	R2ET is a neural motion retargeting model that preserves the source motion semantics and avoids interpenetration in the target motion.

	Zoom Transformer for Skeleton-based Group Activity Recognition
	Jiaxu Zhang, Yifan Jia, Wei Xie, and Zhigang Tu*
	IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT), 2022
	code / arxiv
	We propose a novel Zoom Transformer that exploits both low-level single-person motion information and high-level multi-person interaction information in a uniform attention structure.

	Joint-bone Fusion Graph Convolutional Network for Semi-supervised Skeleton Action Recognition
	Zhigang Tu#, Jiaxu Zhang#, Hongyan Li, Yujin Chen, and Junsong Yuan
	IEEE Transactions on Multimedia (T-MM)*, 2022
	code / arxiv
	We propose a semi-supervised skeleton-based action recognition method.

Experience

	ByteDance Seed
	2026.01 - Present, Hangzhou
	Research Intern for AIGC and MLLM.
	Advisors: Dr. Tianshu Hu and Dr. Mingyuan Gao

	ByteDance
	2025.06 - Present, Shenzhen
	Research Intern for AIGC and MLLM.
	Advisors: Dr. Xin Chen and Dr. Tianshu Hu

	StepFun
	2024.05 - 2025.06, Shanghai
	Research Intern for AIGC.
	Advisors: Dr. Gang Yu and Dr. Xianfang Zeng

	Tencent
	2023.06 - 2024.04, Shanghai
	Research Intern in Tencent PCG.
	Advisors: Dr. Gang Yu and Dr. Xin Chen
	2022.07 - 2023.06, Shenzhen
	Research Intern in Tencent AI Lab.
	Advisors: Dr. Junwu Weng and Dr. Shaoli Huang

	Nanyang Technological University (NTU)
	2025.02 - Present, Singapore
	Joint Ph.D. Student.
	Research Advisor: Prof. Guosheng Lin

	Wuhan University
	2020.09 - Present, Wuhan
	Ph.D. Student in LIESMARS. I received my Master's degree in Computer Technology in 2023.
	Research Advisor: Prof. Zhigang Tu

	Southeast University
	2016.09 - 2020.06, Nanjing
	I received my B.S. degree in Geographic Information Science in 2020. GPA: 3.88/4.0, Rank: 1/26.
	2018.11 - 2020.06, Nanjing
	Research Assistant at the Research Center of Complex Transportation Network (TLab).

Awards and Honors

2025: Academic Innovation Award of Wuhan University (15,000 RMB, Top 1%)

2024: NSFC Basic Research Project for Youth Scholars (300,000 RMB)

2023: Lei Jun Excellence Scholarship (100,000 RMB, Top 0.1‰)

2023: Wang Zhizhuo Innovative Talent Award (8,000 RMB, Top 1%)

2022: National Scholarship (Highest Honor for Master's students in China, 10,000 RMB, Top 3%)

2022: First-class Scholarship of Wuhan University (5,000 RMB, Top 10%)

2021: First-class Scholarship of Wuhan University (5,000 RMB, Top 10%)

2021: 1st Runner-up of ICCV 2021 MMVRAC Challenge (Track 2 and Track 3)

2020: Outstanding Graduate of Southeast University (Top 3%)

2019: Meritorious Winner, Mathematical Contest in Modeling & Interdisciplinary Contest in Modeling

2018: National Scholarship (Highest Honor for undergraduates in China, 8,000 RMB, Top 3%)

This homepage is based on Jon Barron's website and deployed on GitHub Pages. Last updated: Jan. 2026
© 2026 Jiaxu Zhang