NhuGiap04/Test-time-steering
FLUX

Official inference codebase for the FLUX family of flow-matching text-to-image models by Black Forest Labs. FLUX models use a transformer architecture that performs flow matching in latent space, conditioned on text encoded by T5 and CLIP.

Available Models

| Name | Description |
| --- | --- |
| `flux-dev` | Guidance-distilled 12B model, high quality |
| `flux-dev-krea` | FLUX.1-Krea-dev variant |
| `flux-schnell` | Distilled 4-step model, fast inference |
| `flux-dev-canny` | ControlNet-style Canny edge conditioning |
| `flux-dev-depth` | Depth map conditioning |
| `flux-dev-fill` | Inpainting / outpainting |
| `flux-dev-redux` | Image-conditioned generation |
| `flux-dev-kontext` | In-context image editing |

Models are downloaded automatically from HuggingFace on first use. Several are gated and require you to accept their terms on HuggingFace before downloading.


Setup

Requirements

  • Python >= 3.10
  • CUDA GPU (recommended)

Install

```shell
# Clone the repo
git clone https://github.com/NhuGiap04/Test-time-steering.git
cd Test-time-steering

# Install with PyTorch support
pip install -e ".[torch]"

# Or install everything (Gradio + Streamlit demos too)
pip install -e ".[all]"
```

HuggingFace Authentication

Gated models (e.g. flux-dev) require a HuggingFace account with access granted. Authenticate before running:

```shell
huggingface-cli login
# or set the environment variable
export HF_TOKEN=your_token_here
```

Basic Inference

Command-line (single image)

```shell
flux --name flux-schnell \
  --prompt "a serene mountain lake at sunrise" \
  --width 1360 \
  --height 768 \
  --output_dir output/
```

For flux-dev, which supports guidance:

```shell
flux --name flux-dev \
  --prompt "a photo of an astronaut riding a horse on Mars" \
  --guidance 3.5 \
  --num_steps 50 \
  --output_dir output/
```

Generated images are saved to output/img_<idx>.jpg.
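One common convention for the `<idx>` in those filenames is "one past the highest existing index" so repeated runs never overwrite earlier images. This is an assumption about the repo's behavior, not confirmed from the source; a minimal sketch of that convention:

```python
import re
import tempfile
from pathlib import Path


def next_index(output_dir: Path) -> int:
    """Next free image index, scanning output_dir for existing img_<idx>.jpg files."""
    pattern = re.compile(r"img_(\d+)\.jpg")
    indices = [
        int(m.group(1))
        for p in output_dir.glob("img_*.jpg")
        if (m := pattern.fullmatch(p.name))
    ]
    return max(indices, default=-1) + 1


# demo in a throwaway directory
with tempfile.TemporaryDirectory() as d:
    out = Path(d)
    (out / "img_0.jpg").touch()
    (out / "img_1.jpg").touch()
    print(next_index(out))  # → 2
```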

Interactive loop

```shell
flux --name flux-dev --loop
```

In loop mode you can change the prompt, resolution, guidance, seed, and steps between generations using slash commands (/w, /h, /g, /s, /n).
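The slash-command scheme is simple enough to sketch. This is a hypothetical parser illustrating the protocol, not the repo's actual implementation (the option names are illustrative):

```python
def parse_command(line: str, opts: dict) -> dict:
    """Return updated generation options from a slash command like '/g 3.5'."""
    # mapping from slash command to (option name, value parser) -- names are illustrative
    keys = {
        "/w": ("width", int),
        "/h": ("height", int),
        "/g": ("guidance", float),
        "/s": ("seed", int),
        "/n": ("num_steps", int),
    }
    cmd, _, arg = line.partition(" ")
    if cmd in keys:
        name, cast = keys[cmd]
        opts = {**opts, name: cast(arg)}
    return opts


opts = {"width": 1360, "height": 768, "guidance": 3.5, "seed": 42, "num_steps": 50}
opts = parse_command("/g 4.0", opts)
print(opts["guidance"])  # → 4.0
```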

Python API

flux-schnell — fast 4-step model, no guidance

```python
import torch

from flux.util import load_ae, load_clip, load_flow_model, load_t5
from flux.sampling import denoise, get_noise, get_schedule, prepare, unpack

device = torch.device("cuda")
name = "flux-schnell"

t5 = load_t5(device, max_length=256)  # shorter context for schnell
clip = load_clip(device)
model = load_flow_model(name, device=device)
ae = load_ae(name, device=device)

H, W = 768, 1360
x = get_noise(1, H, W, device=device, dtype=torch.bfloat16, seed=42)
inp = prepare(t5, clip, x, prompt="a cat sitting on a windowsill")
timesteps = get_schedule(4, inp["img"].shape[1], shift=False)  # 4 steps, no shift

with torch.inference_mode():
    x = denoise(model, **inp, timesteps=timesteps, guidance=0.0)

x = unpack(x.float(), H, W)
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    x = ae.decode(x)
```

flux-dev — high-quality guidance-distilled model

```python
import torch

from flux.util import load_ae, load_clip, load_flow_model, load_t5
from flux.sampling import denoise, get_noise, get_schedule, prepare, unpack

device = torch.device("cuda")
name = "flux-dev"

t5 = load_t5(device, max_length=512)  # longer context for dev
clip = load_clip(device)
model = load_flow_model(name, device=device)
ae = load_ae(name, device=device)

H, W = 768, 1360
x = get_noise(1, H, W, device=device, dtype=torch.bfloat16, seed=42)
inp = prepare(t5, clip, x, prompt="a photo of an astronaut riding a horse on Mars")
timesteps = get_schedule(50, inp["img"].shape[1], shift=True)  # 50 steps with shift

with torch.inference_mode():
    x = denoise(model, **inp, timesteps=timesteps, guidance=3.5)

x = unpack(x.float(), H, W)
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    x = ae.decode(x)
```
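The schnell and dev examples differ in `get_schedule`'s `shift` flag: without shift the timesteps fall linearly from 1 to 0, while a shift warps them toward the high-noise end so more steps are spent early. A pure-Python sketch of that idea (the sigmoid-style formula and the `mu` value are assumptions for illustration, not the repo's exact code):

```python
import math


def shifted_schedule(num_steps: int, mu: float = 1.0, shift: bool = True) -> list[float]:
    """Timesteps from 1.0 down to 0.0; `shift` warps them toward high noise."""
    ts = [1 - i / num_steps for i in range(num_steps + 1)]
    if shift:
        # sigmoid-style time shift; mu controls how strongly steps cluster early
        ts = [math.exp(mu) / (math.exp(mu) + (1 / t - 1)) if t > 0 else 0.0
              for t in ts]
    return ts


print(shifted_schedule(4, shift=False))  # → [1.0, 0.75, 0.5, 0.25, 0.0]
```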

flux-dev-krea — fine-tuned variant of flux-dev

flux-dev-krea is a fine-tune of flux-dev and shares its architecture and parameter count (~12B). Usage is identical — just swap the model name:

```python
import torch

from flux.util import load_ae, load_clip, load_flow_model, load_t5
from flux.sampling import denoise, get_noise, get_schedule, prepare, unpack

device = torch.device("cuda")
name = "flux-dev-krea"  # only this line changes vs flux-dev

t5 = load_t5(device, max_length=512)
clip = load_clip(device)
model = load_flow_model(name, device=device)
ae = load_ae(name, device=device)

H, W = 768, 1360
x = get_noise(1, H, W, device=device, dtype=torch.bfloat16, seed=42)
inp = prepare(t5, clip, x, prompt="a serene mountain lake at sunrise")
timesteps = get_schedule(50, inp["img"].shape[1], shift=True)

with torch.inference_mode():
    x = denoise(model, **inp, timesteps=timesteps, guidance=3.5)

x = unpack(x.float(), H, W)
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    x = ae.decode(x)
```
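The API examples above stop at `ae.decode(x)`. The decoded tensor is typically in [-1, 1] and must be mapped to 8-bit pixels before saving (e.g. via PIL). A minimal pure-Python sketch of that mapping — the clamp-and-rescale convention is an assumption about the decoder's output range:

```python
def to_uint8(values):
    """Map decoder outputs in [-1, 1] to pixel bytes in [0, 255]."""
    out = []
    for v in values:
        v = max(-1.0, min(1.0, v))           # clamp out-of-range values, as samplers usually do
        out.append(int(round((v + 1.0) * 127.5)))
    return out


print(to_uint8([-1.0, -0.5, 0.5, 1.0]))  # → [0, 64, 191, 255]
```

Real code would apply this elementwise over the `(C, H, W)` tensor, transpose to `(H, W, C)`, and hand the array to `PIL.Image.fromarray`.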

Gradio / Streamlit demos

```shell
python demo_gr.py       # Gradio UI
python demo_st.py       # Streamlit UI
python demo_st_fill.py  # Streamlit inpainting UI
```

Optional: TensorRT Backend

For faster inference with TensorRT:

```shell
pip install -e ".[tensorrt]"
flux --name flux-dev --trt --prompt "your prompt here"
```

Pre-exported ONNX models are downloaded automatically from the corresponding HuggingFace ONNX repository.
