Improving Progressive Generation with Decomposable Flow Matching
Moayed Haji-Ali*, Willi Menapace*, Ivan Skorokhodov, Arpit Sahni, Sergey Tulyakov, Vicente Ordonez, Aliaksandr Siarohin
Snap Research & Rice University
TL;DR: Decomposable Flow Matching (DFM) is a simple framework to progressively generate visual modalities scale-by-scale, achieving up to 50% faster convergence compared to Flow Matching. DFM applies flow matching independently at each level of a multi-scale representation (e.g., a Laplacian pyramid) in an end-to-end fashion, staying compatible with standard flow-matching pipelines while improving quality and convergence speed.
This repo provides a reimplementation of DFM on top of SiT, following the REPA setup. The architecture does not exactly match the one used in the paper, so results may differ. Below, we provide a comparison between SiT and DFM produced with this repo.
Decomposable Flow Matching (DFM) combines multiscale decomposition with Flow Matching. DFM progressively synthesizes different representation scales by generating the coarse-structure scale first and incrementally refining it with finer scales.
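To make the multiscale decomposition concrete, here is a minimal NumPy sketch of a Laplacian-style pyramid. The average-pool downsampling and nearest-neighbor upsampling are illustrative choices for this sketch, not necessarily the operators used in the repo.

```python
import numpy as np

def down(x):
    """2x downsampling via average pooling (illustrative choice)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def up(x):
    """2x upsampling via nearest-neighbor repetition (illustrative choice)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def laplacian_pyramid(x, levels=2):
    """Split an image into fine-detail bands plus a coarse residual."""
    bands = []
    for _ in range(levels - 1):
        coarse = down(x)
        bands.append(x - up(coarse))  # detail lost by downsampling
        x = coarse
    bands.append(x)  # coarsest scale: the one DFM generates first
    return bands

def reconstruct(bands):
    """Invert the decomposition exactly: upsample and add details back."""
    x = bands[-1]
    for detail in reversed(bands[:-1]):
        x = up(x) + detail
    return x
```

Because each detail band stores exactly what downsampling discarded, `reconstruct(laplacian_pyramid(x))` recovers `x` bit-for-bit; DFM runs flow matching independently on each band of such a decomposition.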
- DFM Architecture: Adds per-scale patchification and timestep-embedding layers to DiT while keeping the core backbone untouched.
- DFM Training: Samples the stage count from a categorical distribution, draws each stage flow-timestep from a logit-normal distribution biased toward lower noise in early stages, and trains one DiT backbone to jointly predict all stage-wise velocities.
- DFM Inference: Denoises the coarsest stage first and activates each subsequent stage once the previous one reaches a predetermined per-stage noise threshold.
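The training and inference recipes above can be sketched in a few lines. Note the specific mean-shift schedule biasing earlier stages toward lower noise, and the linear noise decay assumed by the activation rule, are guesses for illustration; see the paper for the actual formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_active_stages(stage_weights=(0.9, 0.1)):
    """Training: categorical draw over how many stages participate this step."""
    return int(rng.choice(len(stage_weights), p=stage_weights)) + 1

def sample_stage_timesteps(num_active, bias=1.0):
    """Training: one logit-normal flow timestep per active stage.
    The mean shift pushing earlier (coarser) stages toward lower noise
    is a hypothetical schedule, not the paper's exact one."""
    mus = [-bias * (num_active - 1 - s) for s in range(num_active)]
    return [1.0 / (1.0 + np.exp(-rng.normal(mu, 1.0))) for mu in mus]

def activation_step(num_steps, threshold):
    """Inference: step index at which the next stage activates, assuming the
    previous stage's noise level falls linearly from 1 to 0 over its steps."""
    for i in range(num_steps + 1):
        if 1.0 - i / num_steps < threshold:
            return i
```

For example, with the defaults in the table below (`stage_weights=[0.9, 0.1]`), roughly 90% of training steps would use a single stage, and a second stage would activate only late in the coarse stage's denoising trajectory.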
| Method | FID | sFID | IS | Precision | Recall |
|---|---|---|---|---|---|
| SiT-XL/2 | 33.24 | 8.55 | 48.22 | 0.308 | 0.581 |
| DFM-SiT-XL/2 | 18.27 | 6.51 | 85.50 | 0.452 | 0.557 |
Pretrained checkpoints of the above experiments will be released soon.
```shell
conda create -n dfm python=3.9 -y
conda activate dfm
pip install -r requirements.txt
```

Download ImageNet. Then run the following preprocessing and VAE latent extraction scripts.
```shell
# Convert raw ImageNet data to a ZIP archive at 256x256 resolution
python dataset_tools.py convert \
  --source=[YOUR_DOWNLOAD_PATH]/ILSVRC/Data/CLS-LOC/train \
  --dest=[TARGET_PATH]/images \
  --resolution=256x256 \
  --transform=center-crop-dhariwal

# Convert the pixel data to VAE latents
python dataset_tools.py encode \
  --source=[TARGET_PATH]/images \
  --dest=[TARGET_PATH]/vae-sd
```

Here, `YOUR_DOWNLOAD_PATH` is the directory where you downloaded the dataset, and `TARGET_PATH` is the directory where the preprocessed images and corresponding compressed latent vectors will be saved. This directory will be used by your experiment scripts.
Training uses the unified train.py script with YAML configuration files or CLI arguments. Update data_dir in the config to point to your data directory.
```shell
# From CLI args
accelerate launch train.py --model [MODEL_NAME] --exp-name [EXP_NAME] --data-dir [DATA_DIR]

# Or from a YAML config
accelerate launch train.py --config [CONFIG_PATH] --data-dir [DATA_DIR]
```

where `[MODEL_NAME]` can be specified as a SiT or DFM-SiT baseline (e.g., `SiT-XL/2` or `DFM-SiT-XL/2`).
Sample training configurations can be found in `experiments/train`.
```shell
# From CLI args
accelerate launch train.py --model DFM-SiT-XL/2 --exp-name dfm-sit-xl-2-256px --data-dir [DATA_DIR]

# Or from a YAML config
accelerate launch train.py --config experiments/train/dfm_sit_b_256.yaml --data-dir [DATA_DIR]
```

The main DFM-specific options to adjust are:
| Parameter | Description | Default |
|---|---|---|
| `model` | Model architecture: `SiT-B/2`, `SiT-XL/2`, `DFM-SiT-B/2`, `DFM-SiT-XL/2`, etc. | — |
| `stages_count` | Number of stages in DFM | `2` |
| `stage_weights` | Sampling weight of each stage during training | `[0.9, 0.1]` |
| `num_steps_per_scale` | Number of inference steps for each stage | `[40, 10]` |
| `stage_sampling_thresholds` | Noise threshold at which the next stage's generation is initialized | `[0.1]` |
Please refer to the paper for guidelines on choosing DFM hyperparameters.
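As a starting point, a hypothetical YAML fragment combining the options above might look as follows (field names follow the table; values mirror the listed defaults — check `experiments/train` for the real configs):

```yaml
# Hypothetical config fragment; the actual files in experiments/train are authoritative.
model: DFM-SiT-XL/2
data_dir: /path/to/preprocessed/data
stages_count: 2
stage_weights: [0.9, 0.1]
num_steps_per_scale: [40, 10]
stage_sampling_thresholds: [0.1]
```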
Sampling uses the unified generate.py script with DDP:
```shell
# From CLI args
torchrun --nproc_per_node=8 generate.py \
  --model SiT-B/2 --ckpt exps/sit-b-2-256px/checkpoints/0400000.pt

# Or from a YAML config
torchrun --nproc_per_node=8 generate.py \
  --config experiments/generation/sit_b_256.yaml \
  --ckpt exps/sit-b-2-256px/checkpoints/0400000.pt
```

```shell
# From CLI args
torchrun --nproc_per_node=8 generate.py \
  --model DFM-SiT-B/2 --ckpt exps/dfm-sit-b-2-256px/checkpoints/0400000.pt

# Or from a YAML config
torchrun --nproc_per_node=8 generate.py \
  --config experiments/generation/dfm_sit_b_256.yaml \
  --ckpt exps/dfm-sit-b-2-256px/checkpoints/0400000.pt
```

We provide evaluation scripts in `experiments/evaluation/` that generate samples and compute FID, sFID, IS, Precision, and Recall.
```shell
bash experiments/evaluation/eval_dfm_sit_b_256.sh
```

This will generate samples under the `results/` directory along with an `.npz` file that can be used for evaluation. To run the reference TensorFlow evaluation on ImageNet, we use the ADM evaluation suite.
Note: Please make sure that the model hyperparameters match the training ones and refer to the paper for guidelines on choosing DFM inference hyperparameters.
This code is mainly built upon REPA. We thank the authors for open-sourcing their codebase.
```bibtex
@article{dfm,
  title={Improving Progressive Generation with Decomposable Flow Matching},
  author={Moayed Haji-Ali and Willi Menapace and Ivan Skorokhodov and Arpit Sahni and Sergey Tulyakov and Vicente Ordonez and Aliaksandr Siarohin},
  journal={arXiv preprint arXiv:2506.19839},
  year={2025}
}
```