News
- [Feb 2026] SpaceTimePilot is accepted at CVPR 2026! Huge thanks to all the co-authors! We are working on releasing the codebase and dataset.
- [Jan 2026] We released the codebase for LiteReality! You can use it to scan your room and convert it to a compact, graphics-ready reconstruction with full PBR materials. Check out the code and examples.
- [Dec 2025] We released SpaceTimePilot. Check out the video and webpage. SpaceTimePilot disentangles space and time in a video diffusion model for implicit 4D reconstruction and exploration.
- [Oct 2025] Fortunate to receive the NeurIPS 2025 Scholar Award! Grateful for the support.
- [Sep 2025] LiteReality is accepted at NeurIPS 2025! Thanks to all the collaborators.
- [Jul 2025] We released LiteReality. Check out the video and webpage. We are working on a solid open-source codebase for it!
- [Jun 2025] Started an internship at Adobe Research, working with Chun-Hao Huang. Amazing experience so far!
- [Jul 2024] OpenIns3D is accepted at ECCV 2024. Thanks to all the collaborators.
- [Apr 2024] Attended the BMVA symposium on Multimodal Learning in London, where OpenIns3D won the Best Paper Award!
- [Sep 2023] We released OpenIns3D, the first 2D-input-free pipeline for open-world 3D instance segmentation.
- [Jan 2023] Started a three-month internship at Toshiba Cambridge AI Lab, working on language interaction with point clouds.
- [May 2022] Very lucky to be the 2022 Girton Postgraduate Research Award recipient!
Selected Publications

SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time
Zhening Huang, Hyeonho Jeong, Xuelin Chen, Yulia Gryaditskaya, Tuanfeng Y. Wang, Joan Lasenby, Chun-Hao Huang
CVPR 2026
Project Page / Paper / Video / GitHub
TLDR: SpaceTimePilot disentangles space and time in a video diffusion model for controllable generative rendering. Given a single input video of a dynamic scene, SpaceTimePilot freely steers both the camera viewpoint and temporal motion within the scene, enabling free exploration across the 4D space–time domain.

LiteReality: Graphics-Ready 3D Scene Reconstruction from RGB-D Scans
Zhening Huang, Xiaoyang Wu, Fangcheng Zhong, Hengshuang Zhao, Matthias Nießner, Joan Lasenby
NeurIPS 2025
Project Page / Paper / Video / GitHub
TLDR: LiteReality is an automatic pipeline that converts RGB-D scans of indoor environments into graphics-ready scenes with high-quality meshes, PBR materials, and articulated objects, ready for rendering and physics-based interaction.

OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation
Zhening Huang, Xiaoyang Wu, Xi Chen, Hengshuang Zhao, Lei Zhu, Joan Lasenby
ECCV 2024
Project Page / Paper / Video / GitHub
TLDR: OpenIns3D proposes a "mask-snap-lookup" scheme for 2D-input-free 3D open-world scene understanding, attaining SOTA performance across datasets, even with fewer input prerequisites.