
[NeurIPS 2025] DIPO: Dual-State Images Controlled Articulated Object Generation Powered by Diverse Data

Python 3.10 · PyTorch 2.3.1

🚀 This repository is the official implementation of DIPO, a framework that generates articulated objects conditioned on dual-state image pairs (resting and articulated states).

[NeurIPS 2025] DIPO: Dual-State Images Controlled Articulated Object Generation Powered by Diverse Data
Ruiqi Wu, Xinjie Wang, Liu Liu, Chunle Guo*, Jiaxiong Qiu, Chongyi Li, Lichao Huang, Zhizhong Su, Ming-Ming Cheng
( * indicates corresponding author)

[arXiv Paper]  [Chinese Version]  [Website Page]  [PM-X (dataset)]  [Gradio Demo]

Preparation

Dependencies and Installation

We recommend using miniconda to manage the environment. The environment was tested on Ubuntu 20.04.4 LTS.

# Create a conda environment
conda create -n dipo python=3.10
conda activate dipo

# Install Pytorch
conda install pytorch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 pytorch-cuda=11.8 -c pytorch -c nvidia

# Install other packages
pip install -r requirement.txt

# Install Pytorch3D (for evaluation)
pip install "git+https://github.com/facebookresearch/pytorch3d.git"
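After installation, a quick sanity check (a minimal sketch, not part of the official scripts) can confirm that PyTorch and CUDA are visible to Python:

# sanity_check.py -- minimal environment check (not part of the repository)
import torch

print("PyTorch version:", torch.__version__)         # expected: 2.3.1
print("CUDA available:", torch.cuda.is_available())  # should be True for GPU inference/training
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))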

GPT-4o Settings

Enter your GPT-4o API key in scripts/graph_pred/api.py.

client = AzureOpenAI(
    azure_endpoint="your_endpoint",
    api_key="your_key",
    api_version="your_version",
)
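For reference, a call through this client typically looks like the sketch below. The deployment name, prompt, and image handling here are placeholders and assumptions, not the repository's actual graph-prediction pipeline:

# Hypothetical usage sketch of the AzureOpenAI client configured above.
import base64
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="your_endpoint",
    api_key="your_key",
    api_version="your_version",
)

with open("resting_state.png", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="your_gpt4o_deployment",  # Azure deployment name (placeholder)
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe the articulated parts of this object."},
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{img_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)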

Download

PM-X Dataset

Our PM-X dataset is constructed by an agent system named LEGO-Art, which builds complex articulated objects from primitives provided by the PartNet-Mobility dataset. You can download the dataset at this link.

PM + ACD Dataset

You can download the original data and our preprocessed data for training and evaluation from here.

Checkpoints

You can download the DIPO checkpoint for inference and the CAGE pre-trained weights for training from here.

<project directory>
├── ckpts
│   ├── cage_cfg.ckpt
│   ├── dipo.ckpt
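If you want to verify a downloaded checkpoint before running anything, a minimal sketch like the following works, assuming the files are standard PyTorch (Lightning-style) checkpoint dicts:

# Minimal checkpoint inspection sketch (assumes a standard PyTorch/Lightning checkpoint).
import torch

ckpt = torch.load("ckpts/dipo.ckpt", map_location="cpu")
print(type(ckpt))  # usually a dict for Lightning checkpoints
if isinstance(ckpt, dict) and "state_dict" in ckpt:
    print("number of parameter tensors:", len(ckpt["state_dict"]))
    # print a few parameter names to confirm the model structure
    for name in list(ckpt["state_dict"])[:5]:
        print(name)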

3D assets for mesh retrieval

Download the 3D assets for mesh retrieval from here; these are also the original data of a subset of the PartNet-Mobility dataset.

Usage

Quick Demo

We provide a quick demo to run the inference on a dual-state image pair.

python demo_img.py \
    --config configs/config.yaml \
    --ckpt_path ckpts/dipo.ckpt \
    --img_path_1 path/of/the/resting/state/image \
    --img_path_2 path/of/the/articulated/state/image

If the script runs successfully, the output will be saved to ./results. By default, three objects are generated by initializing with different noise samples. For other configurations, please see the arguments in the script.
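To run the demo over several image pairs in one go, a small wrapper like the sketch below can help. The folder layout and file-naming convention here are assumptions, not something the repository prescribes:

# Hypothetical batch driver for demo_img.py; the per-pair folder layout is an assumption.
import subprocess
from pathlib import Path

pairs_root = Path("my_image_pairs")  # each subfolder holds resting.png and articulated.png
for pair_dir in sorted(pairs_root.iterdir()):
    if not pair_dir.is_dir():
        continue
    subprocess.run([
        "python", "demo_img.py",
        "--config", "configs/config.yaml",
        "--ckpt_path", "ckpts/dipo.ckpt",
        "--img_path_1", str(pair_dir / "resting.png"),
        "--img_path_2", str(pair_dir / "articulated.png"),
    ], check=True)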

Evaluation

If you're interested in evaluating our model on the test set (see the data split in data/data_split.json for PartNet-Mobility and in data/data_acd.json for the ACD dataset), you can run the test script as below.

# Evaluate on the test set (given GT graph, no object category label)
python test.py \
    --config configs/config.yaml \
    --ckpt ckpts/dipo.ckpt \
    --label_free \
    --which_data pm

Evaluation is only supported on a single GPU; it was tested on an NVIDIA RTX 4090 (24GB).
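To see how the evaluation split is organized before running the script, you can load the split file directly. This sketch only assumes the file is valid JSON and does not rely on any particular internal structure:

# Inspect the data split files; assumes only that they are valid JSON.
import json

with open("data/data_split.json") as f:
    split = json.load(f)

print(type(split))
if isinstance(split, dict):
    for key, value in split.items():
        size = len(value) if hasattr(value, "__len__") else value
        print(key, "->", size)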

Training

We train our model on top of a CAGE model pretrained under our setting. This checkpoint can be downloaded here and is placed under the pretrained folder by default.

<project directory>
├── pretrained
│   ├── cage_cfg.ckpt

Run the following command to train our model from scratch. The original model was trained on 4 NVIDIA A100 GPUs.

python train.py \
    --config configs/config.yaml \
    --pretrained_cage ckpts/cage_cfg.ckpt

LEGO-Art Pipeline

# Step-1: Roll description & build grid-level data
python scripts/layout_generator/api.py --save_path path/to/gpt/data --obj_num 3

# Step-2: Build data with coordinates
python scripts/layout_generator/layout_generator_in_grid.py --save_path path/to/gpt/data

# Step-3: Retrieval
python scripts/mesh_retrieval/retrieval.py --src_dir path/to/gpt/data --gt_data_root path/to/assets/for/retrieval

# Step-4: Render data with Blender
python scripts/render/render_dir.py --src_dir path/to/gpt/data

# Step-5: Filter data with VLMs
python scripts/layout_generator/api_filter.py --save_path path/to/gpt/data
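If you want to run the whole pipeline end to end, a thin wrapper over the five steps above could look like the sketch below. The data and asset paths are placeholders, and the flags simply mirror the per-step commands in this section:

# Hypothetical end-to-end driver for the LEGO-Art pipeline steps listed above.
import subprocess

GPT_DATA = "path/to/gpt/data"
ASSET_ROOT = "path/to/assets/for/retrieval"

steps = [
    ["python", "scripts/layout_generator/api.py", "--save_path", GPT_DATA, "--obj_num", "3"],
    ["python", "scripts/layout_generator/layout_generator_in_grid.py", "--save_path", GPT_DATA],
    ["python", "scripts/mesh_retrieval/retrieval.py", "--src_dir", GPT_DATA, "--gt_data_root", ASSET_ROOT],
    ["python", "scripts/render/render_dir.py", "--src_dir", GPT_DATA],
    ["python", "scripts/layout_generator/api_filter.py", "--save_path", GPT_DATA],
]

for i, cmd in enumerate(steps, start=1):
    print(f"[LEGO-Art] running step {i}: {' '.join(cmd)}")
    subprocess.run(cmd, check=True)  # stop if any step fails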

Citation

@inproceedings{wu2025dipo,
  title     = {DIPO: Dual-State Images Controlled Articulated Object Generation Powered by Diverse Data},
  author    = {Wu, Ruiqi and Wang, Xinjie and Liu, Liu and Guo, Chunle and Qiu, Jiaxiong and Li, Chongyi and Huang, Lichao and Su, Zhizhong and Cheng, Ming-Ming},
  booktitle = {Advances in Neural Information Processing Systems 39 (NeurIPS 2025)},
  year      = {2025}
}
