feat(nvproxy): support nvidia-container-runtime csv mode by a7i · Pull Request #12794 · google/gvisor

a7i · 2026-03-25T23:57:46Z

Summary

nvproxy previously tied host prep, nvidia-container-cli configure, and synthetic /dev/nvidia* creation to the presence of nvidia-container-runtime-hook. CSV mode (and JIT CDI) removes that hook and injects devices/mounts via the OCI spec instead, so those steps were skipped.

This change:

- Runs host prep (nvProxyPreGoferHostSetup) whenever GPUFunctionalityRequested reports true, which now includes specs that list /dev/nvidiactl in Linux.Devices.
- Runs nvidia-container-cli configure only on the legacy hook path (GPUFunctionalityNeedsNvidiaContainerCLIConfigure).
- Creates synthetic sentry device nodes only when the spec does not already list /dev/nvidiactl.
- Skips the prestart hooks nvidia-cdi-hook, nvidia-ctk, and nvidia-container-toolkit (same rationale as for the legacy hook).
- Updates the GPU user guide to document CSV mode support.
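The decision logic in the list above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: the `Spec`, `Device`, and `GPUPlan` types and the `planGPUSetup` helper are hypothetical, and the sketch models only the two signals named in the summary (the legacy hook and `/dev/nvidiactl` in `Linux.Devices`). The real functions in runsc may consider other inputs.

```go
package main

import "path"

// Device is a hypothetical stand-in for one OCI Linux device entry.
type Device struct {
	Path string
}

// Spec is a hypothetical, trimmed-down stand-in for the OCI runtime spec.
type Spec struct {
	Devices       []Device // corresponds to Linux.Devices
	PrestartHooks []string // paths of prestart hook binaries
}

// GPUPlan records which nvproxy setup steps would run for a given spec.
type GPUPlan struct {
	HostPrep         bool // host prep before the gofer starts
	CLIConfigure     bool // nvidia-container-cli configure
	SyntheticDevices bool // synthetic /dev/nvidia* nodes in the sentry
}

const (
	nvidiaCtlPath  = "/dev/nvidiactl"
	legacyHookName = "nvidia-container-runtime-hook"
)

// hasLegacyHook reports whether the spec carries the legacy prestart hook.
// Matching on the base name is an assumption of this sketch.
func hasLegacyHook(s Spec) bool {
	for _, p := range s.PrestartHooks {
		if path.Base(p) == legacyHookName {
			return true
		}
	}
	return false
}

// specListsNvidiaCtl reports whether the spec already injects /dev/nvidiactl.
func specListsNvidiaCtl(s Spec) bool {
	for _, d := range s.Devices {
		if d.Path == nvidiaCtlPath {
			return true
		}
	}
	return false
}

// planGPUSetup separates "GPU requested" from "legacy hook replication".
func planGPUSetup(s Spec) GPUPlan {
	legacy := hasLegacyHook(s)
	listed := specListsNvidiaCtl(s)
	requested := legacy || listed
	return GPUPlan{
		HostPrep:         requested,            // any GPU request
		CLIConfigure:     legacy,               // legacy hook path only
		SyntheticDevices: requested && !listed, // spec did not supply the devices
	}
}
```

Under these assumptions, a CSV-mode spec (devices listed, no legacy hook) gets host prep only, while a legacy-hook spec gets all three steps.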

How to test locally

Unit tests (Linux x86_64/arm64 recommended)

bazel test //runsc/specutils:specutils_test --test_output=errors

On macOS, the full gVisor build may fail on unrelated Darwin issues (O_LARGEFILE, etc.); use Linux or the project CI.

Manual GPU / CSV smoke test (Linux host with NVIDIA driver + toolkit)

Build runsc with nvproxy (from repo root):

make build TARGETS=runsc:runsc # or: bazel build //runsc:runsc

Configure NVIDIA runtime (/etc/nvidia-container-runtime/config.toml):
- Set mode = "csv" (or auto if it selects CSV on your platform, e.g. some Jetson/Tegra setups).
- Under [nvidia-container-runtime], set runtimes so the first entry is your runsc wrapper, e.g. a script that runs:
```
exec /path/to/runsc --nvproxy "$@"
```
Run a GPU container via the NVIDIA shim (not plain runsc alone), so the spec is modified:
```
sudo nvidia-container-runtime run --bundle /path/to/bundle <container-id>
```
Alternatively, run the container through Docker with NVIDIA configured as the default runtime (see the NVIDIA runtime README for csv vs --gpus).
Confirm: container starts, nvidia-smi or a CUDA sample runs, and debug logs show no failure from skipped NVIDIA hooks / duplicate device setup.
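When a smoke test fails, it can help to first check that the NVIDIA shim actually injected the devices into the spec. The sketch below is one way to do that, assuming the modified spec is available as OCI `config.json` bytes (for example, read from the bundle directory). The `specListsDevice` helper and the `ociSpec` type are hypothetical names for illustration, not part of runsc.

```go
package main

import "encoding/json"

// ociSpec decodes only the part of an OCI config.json that this check needs.
type ociSpec struct {
	Linux struct {
		Devices []struct {
			Path string `json:"path"`
		} `json:"devices"`
	} `json:"linux"`
}

// specListsDevice reports whether configJSON lists devPath under linux.devices.
// It returns an error if configJSON is not valid JSON.
func specListsDevice(configJSON []byte, devPath string) (bool, error) {
	var spec ociSpec
	if err := json.Unmarshal(configJSON, &spec); err != nil {
		return false, err
	}
	for _, d := range spec.Linux.Devices {
		if d.Path == devPath {
			return true, nil
		}
	}
	return false, nil
}
```

A caller could load the file with `os.ReadFile` and pass `"/dev/nvidiactl"` as `devPath`. A `false` result suggests the runtime did not select CSV mode for that container.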

Risk

Low: the change is scoped to nvproxy detection, hook skipping, and docs; behavior is unchanged for the legacy hook path.

Treat GPU detection and legacy hook replication separately: run host prep whenever GPU is requested from the OCI spec, run nvidia-container-cli configure only for the legacy prestart-hook path, synthesize sentry /dev/nvidia* only when spec lacks /dev/nvidiactl, and skip CDI-era NVIDIA prestart hooks (nvidia-cdi-hook, nvidia-ctk, nvidia-container-toolkit). Covers CSV/CDI specs that inject Linux.Devices and mounts without nvidia-container-runtime-hook.
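The hook-skipping part of the description above could look roughly like this. It is a sketch only: the `shouldSkipHook` and `filterPrestartHooks` helpers are hypothetical, and matching hooks by the base name of their path is an assumption that may differ from how runsc identifies them.

```go
package main

import "path"

// skippedNvidiaHooks holds the hook binary names mentioned in this PR.
// The legacy hook is included because the description says the new hooks
// are skipped for the same rationale.
var skippedNvidiaHooks = map[string]bool{
	"nvidia-container-runtime-hook": true, // legacy hook
	"nvidia-cdi-hook":               true,
	"nvidia-ctk":                    true,
	"nvidia-container-toolkit":      true,
}

// shouldSkipHook reports whether a prestart hook path names an NVIDIA hook.
func shouldSkipHook(hookPath string) bool {
	return skippedNvidiaHooks[path.Base(hookPath)]
}

// filterPrestartHooks returns the hook paths that should still be executed,
// preserving their original order.
func filterPrestartHooks(hookPaths []string) []string {
	kept := make([]string, 0, len(hookPaths))
	for _, p := range hookPaths {
		if !shouldSkipHook(p) {
			kept = append(kept, p)
		}
	}
	return kept
}
```

Non-NVIDIA hooks pass through unchanged, so the filter only affects specs produced by the NVIDIA toolkit.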

a7i wants to merge 1 commit into google:master from a7i:feat/nvproxy-csv-mode
