Starred repositories of thuwzy

Official codebase for "Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation"

Python 496 26 Updated Mar 21, 2026

Advancing Open-source World Models

Python 3,207 263 Updated Mar 5, 2026

VIGA: Vision-as-Inverse-Graphics Agent

Python 903 84 Updated Feb 25, 2026

Muon is an optimizer for hidden layers in neural networks

Python 2,418 110 Updated Jan 19, 2026

(No description)

45 Updated Nov 26, 2025

SAM 3D Objects

Python 6,274 722 Updated Mar 12, 2026

(No description)

C++ 215 8 Updated Mar 2, 2026

[ICLR 2026] ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation

Python 679 41 Updated Nov 20, 2025

Qwen-Image-Lightning: Speed up Qwen-Image model with distillation

Python 1,266 44 Updated Jan 1, 2026

Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.

1,381 42 Updated Mar 9, 2026

Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference

Python 1,263 42 Updated Feb 24, 2026

Official code of RDT 2

Python 741 45 Updated Feb 7, 2026

Hunyuan3D-Omni: A Unified Framework for Controllable Generation of 3D Assets

Python 537 48 Updated Oct 17, 2025

ViPE: Video Pose Engine for Geometric 3D Perception

Python 1,797 145 Updated Jan 1, 2026

A curated collection of fun and creative examples generated with Nano Banana & Nano Banana Pro🍌, Gemini-2.5-flash-image based model. We also release Nano-consistent-150K openly to support the commu…

21,587 2,205 Updated Dec 12, 2025

Voyager is an interactive RGBD video generation model conditioned on camera input, and supports real-time 3D reconstruction.

Python 1,523 158 Updated Dec 17, 2025

(No description)

Jupyter Notebook 443 22 Updated Dec 8, 2025

4DNeX: Feed-Forward 4D Generative Modeling Made Easy

Python 833 13 Updated Dec 14, 2025

Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition

Python 696 76 Updated Nov 28, 2025

Matrix-Game 2.0: An Open-Source, Real-Time, and Streaming Interactive World Model

Python 1,883 203 Updated Oct 4, 2025

Generate large-scale explorable 3D scenes with high-quality panorama videos from a single image or text prompt.

Python 671 48 Updated Nov 25, 2025

Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.

Python 7,588 461 Updated Feb 10, 2026

Code for "Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers" (NeurIPS 2024)

Python 206 11 Updated Oct 20, 2025

Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels with Hunyuan3D World Model

Python 2,721 240 Updated Dec 17, 2025

PhysX: Physical-Grounded 3D Asset Generation (NeurIPS 2025, Spotlight)

Jupyter Notebook 360 20 Updated Dec 18, 2025

[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer

Python 12,653 1,385 Updated Mar 3, 2026

Towards a Generative 3D World Engine for Embodied Intelligence

Python 403 24 Updated Jan 28, 2026

[ICLR'26] Topology-Preserved Auto-regressive Mesh Generation in the Manner of Weaving Silk

Python 108 4 Updated Mar 2, 2026

Code implementation for: From Virtual Games to Real-World Play

46 1 Updated Jun 23, 2025

Awesome-LLM-3D: a curated list of Multi-modal Large Language Model in 3D world Resources

2,130 134 Updated Mar 20, 2026