
Large-Vocabulary 3D Diffusion Model with Transformer

S-Lab, Nanyang Technological University; The Chinese University of Hong Kong; Shanghai AI Laboratory

DiffTF can generate large-vocabulary 3D objects with rich semantics and realistic texture.

📖 For more visual results, check out our project page.

Installation

Clone this repository and navigate to it in your terminal. Then run:

bash install_difftf.sh

This will install the Python packages that the scripts depend on.

Preparing data

Training

I. Triplane fitting

1. Training the shared decoder
```bash
conda activate difftf
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7

# Omniobject3D: checkpoints are saved in ./Checkpoint/omni_sharedecoder
python -m torch.distributed.launch --nproc_per_node 8 ./Triplanerecon/train.py \
    --config ./Triplanerecon/configs/omni/train.txt \
    --datadir ./dataset/Omniobject3D/renders \
    --basedir ./Checkpoint \
    --expname omni_sharedecoder

# ShapeNet: checkpoints are saved in ./Checkpoint/shapenet_sharedecoder
python -m torch.distributed.launch --nproc_per_node 8 ./Triplanerecon/train.py \
    --config ./Triplanerecon/configs/shapenet_car/train.txt \
    --datadir ./dataset/ShapeNet/renders_car \
    --basedir ./Checkpoint \
    --expname shapenet_sharedecoder
```
2. Triplane fitting
```bash
conda activate difftf

# Omniobject3D: fitted triplanes are saved in ./Checkpoint/omni_triplane
# --num_gpu 1 --idx 0 fits all triplanes on a single GPU
# --decoderdir points to the shared-decoder checkpoint from step 1
python ./Triplanerecon/train_single_omni.py \
    --config ./Triplanerecon/configs/omni/train_single.txt \
    --num_gpu 1 --idx 0 \
    --datadir ./dataset/Omniobject3D/renders \
    --basedir ./Checkpoint \
    --expname omni_triplane \
    --decoderdir ./Checkpoint/omni_sharedecoder/300000.tar

# ShapeNet: fitted triplanes are saved in ./Checkpoint/shapenet_triplane
python ./Triplanerecon/train_single_shapenet.py \
    --config ./Triplanerecon/configs/shapenet_car/train_single.txt \
    --num_gpu 1 --idx 0 \
    --datadir ./dataset/ShapeNet/renders_car \
    --basedir ./Checkpoint \
    --expname shapenet_triplane \
    --decoderdir ./Checkpoint/shapenet_sharedecoder/300000.tar

# To fit triplanes with 8 GPUs instead:
bash multi_omni.sh 8
bash multi_shapenet.sh 8
```

Note: All related hyperparameters and settings are specified in the config files, which can be found in ./configs/shapenet or ./configs/omni.
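The code builds on nerf-pytorch (see Acknowledgement below), whose config files are plain key = value text parsed by configargparse, with keys mirroring the command-line flags above. A purely illustrative sketch, not the shipped file; the actual configs contain additional keys:

```text
# illustrative config sketch (keys mirror the CLI flags; contents assumed)
expname = omni_sharedecoder
basedir = ./Checkpoint
datadir = ./dataset/Omniobject3D/renders
```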

3. Preparing triplane for diffusion
```bash
# Prepare the fitted triplanes for diffusion training.
# --mode is the dataset name (omni or shapenet).
python ./Triplanerecon/extract.py \
    --basepath ./Checkpoint/omni_triplane \
    --mode omni \
    --newpath ./Checkpoint/omni_triplane_fordiffusion
```

II. Training Diffusion

```bash
cd ./3dDiffusion
export PYTHONPATH=$PWD:$PYTHONPATH
conda activate difftf
cd scripts

# Checkpoints are saved in ./Checkpoint/difftf_omni
python image_train.py \
    --datasetdir ./Checkpoint/omni_triplane_fordiffusion \
    --expname difftf_omni
```

You may also want to train in a distributed manner. In this case, run the same command with mpiexec:

```bash
mpiexec -n 8 python image_train.py \
    --datasetdir ./Checkpoint/omni_triplane_fordiffusion \
    --expname difftf_omni
```

Note: Training hyperparameters are set in image_train.py, while architecture hyperparameters are set in ./improved_diffusion/script_util.py.

Note: Our fitted triplanes can be downloaded via this link.

Inference

I. Sampling triplane using trained diffusion

Our pre-trained model can be found in difftf_checkpoint/omni.

```bash
# Generated triplanes are saved in --save_path
python image_sample.py \
    --model_path ./Checkpoint/difftf_omni/model.pt \
    --num_samples=5000 \
    --save_path ./Checkpoint/difftf_omni
```
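The sampler writes the generated triplanes as a NumPy .npz archive (e.g. samples_5000x18x256x256.npz, the file referenced in the rendering step below). A minimal sketch for inspecting such an archive; the helper name and the assumption that the triplanes are the first stored array are ours, not the repo's:

```python
import numpy as np

def load_triplanes(path):
    """Load generated triplanes from an .npz archive (first stored array)."""
    with np.load(path) as data:
        return data[data.files[0]]

# Example, using the path produced by the sampling command above:
# arr = load_triplanes("./Checkpoint/difftf_omni/samples_5000x18x256x256.npz")
# The filename suggests arr.shape == (5000, 18, 256, 256):
# 5000 samples, each an 18-channel 256x256 triplane feature map.
```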

II. Rendering triplane using shared decoder

Our pre-trained shared decoder can be found in difftf_checkpoint/triplane decoder.zip.

```bash
# Omniobject3D: results are saved in ./Checkpoint/ddpm_omni_vis
# --ft_path:      checkpoint of the shared decoder
# --triplanepath: generated triplanes from the sampling step
# --mesh 0:       whether to also export meshes
# --testvideo:    save all rendered images as a video
python ddpm_vis.py \
    --config ./configs/omni/ddpm.txt \
    --ft_path ./Checkpoint/omni_triplane_fordiffusion/003000.tar \
    --triplanepath ./Checkpoint/difftf_omni/samples_5000x18x256x256.npz \
    --basedir ./Checkpoint \
    --expname ddpm_omni_vis \
    --mesh 0 \
    --testvideo

# ShapeNet: results are saved in ./Checkpoint/ddpm_shapenet_vis
python ddpm_vis.py \
    --config ./configs/shapenet_car/ddpm.txt \
    --ft_path ./Checkpoint/shapenet_car_triplane_fordiffusion/003000.tar \
    --triplanepath ./Checkpoint/difftf_shapenet/samples_5000x18x256x256.npz \
    --basedir ./Checkpoint \
    --expname ddpm_shapenet_vis \
    --mesh 0 \
    --testvideo
```
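The 18 feature channels in each generated sample correspond to the tri-plane representation. Assuming they are the concatenation of three axis-aligned planes with 6 features each, which we infer from the archive shape rather than from the repo, splitting them is a single reshape:

```python
import numpy as np

def split_planes(triplane, n_planes=3):
    """Split a (C, H, W) triplane feature map into n_planes of C // n_planes channels."""
    c, h, w = triplane.shape
    assert c % n_planes == 0, "channel count must divide evenly across planes"
    return triplane.reshape(n_planes, c // n_planes, h, w)

planes = split_planes(np.zeros((18, 256, 256), dtype=np.float32))
print(planes.shape)  # (3, 6, 256, 256)
```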

References

If you find DiffTF useful for your work, please cite:

```bibtex
@article{cao2023large,
  title   = {Large-Vocabulary 3D Diffusion Model with Transformer},
  author  = {Cao, Ziang and Hong, Fangzhou and Wu, Tong and Pan, Liang and Liu, Ziwei},
  journal = {arXiv preprint arXiv:2309.07920},
  year    = {2023}
}
```
Acknowledgement

The code is built on improved-diffusion and nerf-pytorch. We would like to express our sincere thanks to their contributors.

🗞️ License

Distributed under the S-Lab License. See LICENSE for more information.

About

Official PyTorch implementation of DiffTF (accepted by ICLR 2024).