
[docs] Add NeMo Automodel training guide#13306

Draft
pthombre wants to merge 5 commits into huggingface:main from pthombre:automodel_docs

Conversation

@pthombre


What does this PR do?

Adds a new documentation page for NeMo Automodel, NVIDIA's PyTorch DTensor-native training library for fine-tuning and pretraining
diffusion models at scale. NeMo Automodel integrates directly with Diffusers — it loads pretrained models from the Hugging Face Hub using
Diffusers model classes and generates outputs via Diffusers pipelines with no checkpoint conversion needed.

The new guide covers:

  • Supported models (Wan 2.1, FLUX.1-dev, HunyuanVideo 1.5)
  • Installation
  • Data preparation and preprocessing
  • Training configuration (annotated YAML reference)
  • Single-node and multi-node training launch
  • Generation / inference with fine-tuned checkpoints
  • How NeMo Automodel integrates with the Diffusers ecosystem
  • Hardware requirements
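
For reviewers skimming the PR, the "training configuration" bullet refers to a single YAML recipe that drives the whole run. A rough sketch of the shape (field names here are illustrative only — the guide's annotated YAML reference documents the real schema):

```yaml
# Illustrative sketch, NOT the actual NeMo Automodel schema;
# see the guide's annotated YAML reference for the real field names.
model:
  pretrained_model_name_or_path: Wan-AI/Wan2.1-T2V-1.3B-Diffusers
  mode: finetune   # the guide notes `mode: pretrain` for pretraining from scratch
```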

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline?
  • Did you read our philosophy doc (important for complex PRs)?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@stevhliu @sayakpaul

Signed-off-by: Pranav Prashant Thombre <pthombre@nvidia.com>
@sayakpaul requested a review from stevhliu on March 23, 2026 03:22
@stevhliu left a comment


super nice, thanks for the docs!

Comment on lines +19 to +25
### Why NeMo Automodel?

- **Hugging Face native**: Train any Diffusers-format model from the Hub with no checkpoint conversion — day-0 support for new model releases.
- **Any scale**: The same YAML recipe and training script runs on 1 GPU or across hundreds of nodes. Parallelism is configuration, not code.
- **High performance**: FSDP2 distributed training with multiresolution bucketed dataloading and pre-encoded latent space training for maximum GPU utilization.
- **Hackable**: Linear training scripts with YAML configuration files. No hidden trainer abstractions — you can read and modify the entire training loop.
- **Open source**: Apache 2.0 licensed, NVIDIA-supported, and actively maintained.

i would integrate this info in the opening intro paragraph to simplify the structure a bit

- **Hackable**: Linear training scripts with YAML configuration files. No hidden trainer abstractions — you can read and modify the entire training loop.
- **Open source**: Apache 2.0 licensed, NVIDIA-supported, and actively maintained.

### Workflow overview

hmm i don't know if this workflow overview adds much value beyond what plain words could convey?


| Model | Hugging Face ID | Task | Parameters |
|-------|----------------|------|------------|
| Wan 2.1 T2V 1.3B | [`Wan-AI/Wan2.1-T2V-1.3B-Diffusers`](https://huggingface.co/Wan-AI/Wan2.1-T2V-1.3B-Diffusers) | Text-to-Video | 1.3B |

let's remove the backticks around the model name since it's not a code element

Comment on lines +267 to +268
> [!TIP]
> Full example configs for all models are available in the [NeMo Automodel examples](https://github.com/NVIDIA-NeMo/Automodel/tree/main/examples/diffusion/finetune).

Suggested change
> [!TIP]
> Full example configs for all models are available in the [NeMo Automodel examples](https://github.com/NVIDIA-NeMo/Automodel/tree/main/examples/diffusion/finetune).
Comment on lines +264 to +265
> [!NOTE]
> NeMo Automodel also supports **pretraining** diffusion models from randomly initialized weights. Set `mode: pretrain` in the model config. Pretraining example configs are available in the [NeMo Automodel examples](https://github.com/NVIDIA-NeMo/Automodel/tree/main/examples/diffusion/pretrain).

Suggested change
> [!NOTE]
> NeMo Automodel also supports **pretraining** diffusion models from randomly initialized weights. Set `mode: pretrain` in the model config. Pretraining example configs are available in the [NeMo Automodel examples](https://github.com/NVIDIA-NeMo/Automodel/tree/main/examples/diffusion/pretrain).

## Launch training

**Single-node training:**

let's also use the <hfoptions> tags for single-node training and multi-node training
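
For context, `<hfoptions>` is the Hugging Face doc-builder syntax for tabbed content blocks. The two launch modes would be wrapped roughly like this (the `id` values below are illustrative):

```
<hfoptions id="launch">
<hfoption id="single-node">

<!-- single-node launch command goes here -->

</hfoption>
<hfoption id="multi-node">

<!-- multi-node launch command goes here -->

</hfoption>
</hfoptions>
```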


After training, generate videos or images from text prompts using the fine-tuned checkpoint.

**Wan 2.1 (single-GPU):**

also use <hfoptions> tags here

- **Scalable training for Diffusers models**: NeMo Automodel adds distributed training capabilities (FSDP2, multi-node, multiresolution bucketing) that go beyond what the built-in Diffusers training scripts provide, while keeping the same model and pipeline interfaces.
- **Shared ecosystem**: any model, LoRA adapter, or pipeline component from the Diffusers ecosystem remains compatible throughout the training and inference workflow.
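
The "multiresolution bucketing" mentioned above can be sketched in a few lines: samples are grouped by resolution so every batch has a single tensor shape and no padding is wasted. This is only an illustration of the idea, not NeMo Automodel's actual dataloader implementation:

```python
from collections import defaultdict

def bucket_by_resolution(samples, batch_size):
    """Group samples by (height, width) and batch within each bucket.

    Hypothetical sketch of multiresolution bucketing: every returned batch
    contains samples of exactly one resolution, so no padding is needed.
    """
    buckets = defaultdict(list)
    for sample in samples:
        buckets[(sample["height"], sample["width"])].append(sample)
    batches = []
    for group in buckets.values():
        for i in range(0, len(group), batch_size):
            batches.append(group[i : i + batch_size])
    return batches

# Toy dataset with two distinct resolutions.
samples = [
    {"id": 0, "height": 480, "width": 832},
    {"id": 1, "height": 720, "width": 1280},
    {"id": 2, "height": 480, "width": 832},
]
batches = bucket_by_resolution(samples, batch_size=2)
# Every batch is homogeneous in resolution.
assert all(len({(s["height"], s["width"]) for s in b}) == 1 for b in batches)
```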

## Hardware requirements

let's add the hardware requirements to the ## Installation section. It's better for users to know what the requirements are up front :)

pthombre and others added 4 commits March 23, 2026 18:24
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
adding contacts into the readme
