Model Validation Controller

This project is a proof of concept based on the sigstore/model-transparency-cli. It provides a Kubernetes/OpenShift operator that validates AI models before workloads pick them up. A webhook injects an init container that performs the validation, and a custom resource defines how models should be validated, for example with Sigstore or public keys.

Features

Model Validation: Ensures AI models are validated before they are used by workloads.
Webhook Integration: A webhook automatically injects an init container into pods to perform the validation step.
Custom Resource: Configurable ModelValidation custom resource to specify how models should be validated.
- Supports methods such as Sigstore, PKI, or public key validation.
Continuous Validation: Optional periodic re-validation of models using Kubernetes native sidecars (requires Kubernetes 1.28+).

Prerequisites

Kubernetes 1.29+ or OpenShift 4.16+ (Kubernetes 1.28+ for continuous validation)
Proper configuration for model validation (e.g., Sigstore, public keys)
A signed model (see the testdata or examples folder for samples)

Installation

The operator can be installed via kustomize using different deployment overlays.

Production Deployment

For production environments with cert-manager integration:

Prerequisites: Install cert-manager first:

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.17.2/cert-manager.yaml

Then deploy the operator:

kubectl apply -k https://github.com/sigstore/model-validation-operator/config/overlays/production
# or local
kubectl apply -k config/overlays/production

Testing Deployment

For testing environments with manual certificate management:

kubectl apply -k https://github.com/sigstore/model-validation-operator/config/overlays/testing
# or local
kubectl apply -k config/overlays/testing

Development Deployment

For development environments, deploy the operator without the webhook integration:

kubectl apply -k https://github.com/sigstore/model-validation-operator/config/overlays/development
# or local
kubectl apply -k config/overlays/development

OLM Deployment

For OpenShift/OLM environments:

kubectl apply -k https://github.com/sigstore/model-validation-operator/config/overlays/olm
# or local
kubectl apply -k config/overlays/olm

Uninstall

To uninstall the operator, use the same overlay you used for installation:

kubectl delete -k config/overlays/production

Configuration Structure

The operator uses a kustomize-based overlay configuration structure that aims to separate generated content from environment-specific content:

config/
├── crd/                      # Custom Resource Definitions
├── rbac/                     # RBAC permissions
├── webhook/                  # Webhook configuration
├── manager/                  # Controller manager deployment
├── manifests/                # OLM manifests
├── components/               # Reusable components
│   ├── webhook/              # Webhook service component
│   ├── certmanager/          # Certificate manager component
│   ├── manual-tls/           # Manual TLS configuration
│   ├── metrics-port/         # Metrics configuration
│   └── webhook-replacements/ # Webhook configuration replacements
└── overlays/                 # Environment-specific overlays
    ├── production/           # Production (cert-manager)
    ├── development/          # Development (operator only, no webhooks)
    ├── testing/              # Testing (manual, self-signed certs)
    └── olm/                  # OpenShift/OLM

Certificate Management

The operator supports different certificate management approaches:

Production: Uses cert-manager for automatic certificate management
- ⚠️ Important: The default cert-manager configuration uses self-signed certificates
- For production environments, you should configure cert-manager with a proper CA issuer
Development: Does not use certificates, because this overlay contains no webhook configuration
Testing: Uses manual, self-signed certificate management for testing scenarios
OLM: Uses OLM's built-in certificate management for OpenShift deployments

Running the Webhook Server Locally

The webhook server requires TLS certificates. When you run the operator locally, certificates will be generated automatically:

make run

This command will start the webhook server on https://localhost:9443, using the generated certs.

Known limitations

The project is at an early stage and therefore has some limitations.

There is no validation or defaulting for the custom resource.
The validation is namespace scoped and cannot be used across multiple namespaces.
There are no status fields for the custom resource.
The model and signature paths must be specified; there is no auto-discovery.
TLS certificates used by the webhook are self-generated.

Usage

First, a ModelValidation CR must be created as follows:

apiVersion: ml.sigstore.dev/v1alpha1
kind: ModelValidation
metadata:
  name: demo
spec:
  config:
    sigstoreConfig:
      certificateIdentity: "https://github.com/sigstore/model-validation-operator/.github/workflows/sign-model.yaml@refs/tags/v0.0.2"
      certificateOidcIssuer: "https://token.actions.githubusercontent.com"
  model:
    path: /data/tensorflow_saved_model
    signaturePath: /data/tensorflow_saved_model/model.sig

Pods in the namespace that carry the label validation.ml.sigstore.dev/ml: "<modelvalidation-cr-name>" are validated using the referenced ModelValidation CR. The label must be present when the pod is created; pods that are labeled afterwards are not validated.

apiVersion: v1
kind: Pod
metadata:
  name: whatever-workload
+  labels:
+    validation.ml.sigstore.dev/ml: "demo"
spec:
  restartPolicy: Never
  containers:
  - name: whatever-workload
    image: nginx
    ports:
    - containerPort: 80
    volumeMounts:
    - name: model-storage
      mountPath: /data
  volumes:
  - name: model-storage
    persistentVolumeClaim:
      claimName: models
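To confirm that the webhook actually injected the validation container into a labeled pod, it can help to list the pod's init containers. The following is a minimal sketch: the helper names `injected_init_containers` and `has_validation_container` are made up for illustration, and the container name `model-validation` is taken from the log command shown in the Examples section of this README.

```shell
# Hypothetical helper: print the names of all init containers of a pod,
# separated by spaces.
# Usage: injected_init_containers <pod-name> <namespace>
injected_init_containers() {
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{.spec.initContainers[*].name}'
}

# Hypothetical helper: succeed if the pod has an init container named
# "model-validation" (name assumed from the example logs in this README).
# Usage: has_validation_container <pod-name> <namespace>
has_validation_container() {
  injected_init_containers "$1" "$2" | tr ' ' '\n' | grep -qxF 'model-validation'
}
```

For the example pod above, `has_validation_container whatever-workload testing` should succeed once the pod has been admitted through the webhook, assuming the pod was created in the `testing` namespace.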

Continuous Model Validation

The operator supports continuous validation, which periodically re-validates models after the initial validation. This feature uses Kubernetes 1.28+ native sidecars with restartPolicy: Always.

How It Works

When continuous validation is enabled:

The validation container runs as a native sidecar (not just an init container)
After the initial validation succeeds, the container becomes ready
The validation repeats at the specified interval
On validation failure, the error is logged but the container continues running
The readiness probe reflects the validation state
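Because a native sidecar is declared under the pod's init containers, Kubernetes reports its state under `status.initContainerStatuses`. The sketch below reads the readiness flag from there. The helper name `validation_sidecar_ready` is hypothetical, and it assumes the sidecar keeps the container name `model-validation` used elsewhere in this README.

```shell
# Hypothetical helper: print "true" or "false" depending on whether the
# validation sidecar of a pod currently reports ready.
# Assumes the sidecar container is named "model-validation".
# Usage: validation_sidecar_ready <pod-name> <namespace>
validation_sidecar_ready() {
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{.status.initContainerStatuses[?(@.name=="model-validation")].ready}'
}
```

If the helper prints `false` after a re-validation, the logged error can be inspected with `kubectl logs -n <namespace> <pod-name> model-validation`.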

Configuration

Add the continuousValidation field to your ModelValidation CR:

apiVersion: ml.sigstore.dev/v1alpha1
kind: ModelValidation
metadata:
  name: demo-continuous
spec:
  config:
    sigstoreConfig:
      certificateIdentity: "user@example.com"
      certificateOidcIssuer: "https://token.actions.githubusercontent.com"
  model:
    path: /data/tensorflow_saved_model
    signaturePath: /data/tensorflow_saved_model/model.sig
  continuousValidation:
    enabled: true
    interval: "10m"  # Supports s, m, h units (e.g., "30s", "5m", "1h")
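To make the unit handling of `interval` concrete, here is a small sketch that converts a value such as "30s", "5m", or "1h" into seconds, for example to compare it against a monitoring timeout. The helper `interval_to_seconds` is hypothetical and is not part of the operator. It only handles a single integer followed by one unit; the operator's own parser may accept more forms.

```shell
# Hypothetical helper: convert an interval like "30s", "5m" or "1h" to seconds.
# Only "<integer><unit>" with unit s, m or h is accepted. Leading zeros are
# rejected so that shell arithmetic never treats the number as octal.
# Usage: interval_to_seconds <interval>
interval_to_seconds() {
  value=${1%[smh]}    # numeric part, e.g. "10"
  unit=${1#"$value"}  # unit suffix, e.g. "m"
  case $value in
    ''|*[!0-9]*|0?*) echo "invalid interval: $1" >&2; return 1 ;;
  esac
  case $unit in
    s) echo "$value" ;;
    m) echo $((value * 60)) ;;
    h) echo $((value * 3600)) ;;
    *) echo "invalid interval: $1" >&2; return 1 ;;
  esac
}
```

With this sketch, `interval_to_seconds 10m` prints `600`, while a value without a unit such as `10` is rejected.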

Requirements

Kubernetes 1.28 or later (for native sidecar support with restartPolicy: Always)
The validation container will consume resources continuously (CPU/memory)
Consider longer intervals (e.g., 10m, 1h) for production workloads

Examples

The examples folder contains example files for testing the operator.

Example Continuous Validation

See examples/continuous-validation.yaml for a complete example.

Prerequisites for Examples

Before running the examples, create a namespace for testing (separate from the operator namespace):

kubectl create namespace testing

Important: Do not deploy examples in the operator namespace (e.g., model-validation-operator-system). The operator namespace has the label validation.ml.sigstore.dev/ignore: "true" which prevents the webhook from processing pods in that namespace.
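Before deploying the examples, it may be worth checking that the target namespace is not excluded from the webhook. The sketch below lists the namespaces that carry the ignore label and looks for the given one; the helper name `is_ignored_namespace` is made up for illustration.

```shell
# Hypothetical helper: succeed if the given namespace carries the label
# validation.ml.sigstore.dev/ignore=true, i.e. the webhook skips its pods.
# Usage: is_ignored_namespace <namespace>
is_ignored_namespace() {
  kubectl get namespaces -l 'validation.ml.sigstore.dev/ignore=true' -o name \
    | grep -qxF "namespace/$1"
}
```

A possible use is `is_ignored_namespace testing && echo "webhook will skip this namespace"`, which should print nothing for a freshly created `testing` namespace.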

Example Files

prepare.yaml: Contains a persistent volume claim and a job that downloads a signed test model.

kubectl apply -f https://raw.githubusercontent.com/sigstore/model-validation-operator/main/examples/prepare.yaml -n testing
# or local
kubectl apply -f examples/prepare.yaml -n testing

verify.yaml: Contains a ModelValidation manifest for this model and a demo pod that carries the matching validation label.

kubectl apply -f https://raw.githubusercontent.com/sigstore/model-validation-operator/main/examples/verify.yaml -n testing
# or local
kubectl apply -f examples/verify.yaml -n testing

unsigned.yaml: Contains an example of a pod that would fail validation (for testing purposes).

kubectl apply -f https://raw.githubusercontent.com/sigstore/model-validation-operator/main/examples/unsigned.yaml -n testing
# or local
kubectl apply -f examples/unsigned.yaml -n testing

After the example installation, the logs of the generated job should show a successful download:

$ kubectl logs -n testing job/download-extract-model
Connecting to github.com (140.82.121.3:443)
Connecting to objects.githubusercontent.com (185.199.108.133:443)
saving to '/data/tensorflow_saved_model.tar.gz'
tensorflow_saved_mod  44% |**************                  | 3983k  0:00:01 ETA
tensorflow_saved_mod 100% |********************************| 8952k  0:00:00 ETA
'/data/tensorflow_saved_model.tar.gz' saved
./
./model.sig
./variables/
./variables/variables.data-00000-of-00001
./variables/variables.index
./saved_model.pb
./fingerprint.pb
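Instead of polling the job logs, one can block until the download job has finished before creating the demo pod. This is a minimal sketch: the helper name `wait_for_model_download` and the default timeout of 120s are assumptions, while the job name `download-extract-model` comes from the log command above.

```shell
# Hypothetical helper: block until the download job from prepare.yaml has
# completed, or fail once the timeout (default: 120s, an arbitrary choice)
# has elapsed.
# Usage: wait_for_model_download <namespace> [timeout]
wait_for_model_download() {
  kubectl wait --for=condition=complete job/download-extract-model \
    -n "$1" --timeout="${2:-120s}"
}
```

For the examples in this README the call would be `wait_for_model_download testing`.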

The operator logs should show that a pod has been modified:

$ kubectl logs -n model-validation-operator-system deploy/model-validation-controller-manager
time=2025-01-20T22:13:05.051Z level=INFO msg="Starting webhook server on :9443"
time=2025-01-20T22:13:47.556Z level=INFO msg="new request, path: /mutate-v1-pod"
time=2025-01-20T22:13:47.557Z level=INFO msg="Execute webhook"
time=2025-01-20T22:13:47.560Z level=INFO msg="Search associated Model Validation CR" pod=whatever-workload namespace=testing
time=2025-01-20T22:13:47.591Z level=INFO msg="construct args"
time=2025-01-20T22:13:47.591Z level=INFO msg="found sigstore config"

Finally, the test pod should be running, and the injected init container should have completed the validation successfully.

$ kubectl logs -n testing whatever-workload model-validation
INFO:__main__:Creating verifier for sigstore
INFO:tuf.api._payload:No signature for keyid f5312f542c21273d9485a49394386c4575804770667f2ddb59b3bf0669fddd2f
INFO:tuf.api._payload:No signature for keyid ff51e17fcf253119b7033f6f57512631da4a0969442afcf9fc8b141c7f2be99c
INFO:tuf.api._payload:No signature for keyid ff51e17fcf253119b7033f6f57512631da4a0969442afcf9fc8b141c7f2be99c
INFO:tuf.api._payload:No signature for keyid ff51e17fcf253119b7033f6f57512631da4a0969442afcf9fc8b141c7f2be99c
INFO:tuf.api._payload:No signature for keyid ff51e17fcf253119b7033f6f57512631da4a0969442afcf9fc8b141c7f2be99c
INFO:__main__:Verifying model signature from /data/model.sig
INFO:__main__:all checks passed

If the model has been modified, validation fails and the workload is not started:

ERROR:__main__:verification failed: the manifests do not match

Ignore Options

The model section of the ModelValidation CR supports additional options to control which files are included during verification:

Field	Type	Description
`ignorePaths`	`[]string`	List of file paths to exclude from verification
`ignoreGitPaths`	`bool`	When `true`, excludes git-related files (e.g., `.git/`, `.gitignore`)
`ignoreUnsignedFiles`	`bool`	When `true`, unsigned files will not cause verification to fail
`allowSymlinks`	`bool`	When `true`, symbolic links will be followed and their targets verified

Example with ignore options:

apiVersion: ml.sigstore.dev/v1alpha1
kind: ModelValidation
metadata:
  name: demo
spec:
  config:
    sigstoreConfig:
      certificateIdentity: "https://github.com/sigstore/model-validation-operator/.github/workflows/sign-model.yaml@refs/tags/v0.0.2"
      certificateOidcIssuer: "https://token.actions.githubusercontent.com"
  model:
    path: /data/tensorflow_saved_model
    signaturePath: /data/tensorflow_saved_model/model.sig
    ignorePaths:
      - /data/tensorflow_saved_model/cache
      - /data/tensorflow_saved_model/tmp
    ignoreGitPaths: true
    allowSymlinks: true

Pod Annotations

Ignore options can also be specified or overridden on individual pods using annotations. Pod annotations take precedence over the ModelValidation CR settings.

Annotation	Value	Description
`validation.ml.sigstore.dev/ignore-paths`	Comma-separated paths	Paths to exclude from verification
`validation.ml.sigstore.dev/ignore-git-paths`	`"true"` or `"false"`	Exclude git-related files
`validation.ml.sigstore.dev/ignore-unsigned-files`	`"true"` or `"false"`	Allow unsigned files
`validation.ml.sigstore.dev/allow-symlinks`	`"true"` or `"false"`	Follow symbolic links

Example pod with annotation overrides:

apiVersion: v1
kind: Pod
metadata:
  name: whatever-workload
  labels:
    validation.ml.sigstore.dev/ml: "demo"
  annotations:
    validation.ml.sigstore.dev/ignore-paths: "/data/tensorflow_saved_model/logs,/data/tensorflow_saved_model/tmp"
    validation.ml.sigstore.dev/ignore-git-paths: "true"
spec:
  # ... rest of pod spec
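The precedence rule can be illustrated with a short sketch. Both helpers below, `effective_option` and `split_ignore_paths`, are hypothetical and only model the documented behavior; they are not the operator's code. In particular, the sketch assumes that an annotation replaces the CR value rather than merging with it, and that empty entries in the comma-separated list are dropped.

```shell
# Hypothetical helper: pick the effective value of an ignore option.
# A non-empty pod annotation wins; otherwise the CR value is used.
# Usage: effective_option <annotation-value> <cr-value>
effective_option() {
  if [ -n "$1" ]; then
    printf '%s\n' "$1"
  else
    printf '%s\n' "$2"
  fi
}

# Hypothetical helper: split a comma-separated ignore-paths annotation
# into one path per line, dropping empty entries (an assumption).
# Usage: split_ignore_paths <annotation-value>
split_ignore_paths() {
  printf '%s\n' "$1" | tr ',' '\n' | sed '/^$/d'
}
```

For the pod above, the `ignore-git-paths` annotation is set to "true", so `effective_option "true" "false"` yields `true` regardless of the CR setting, whereas an unset annotation falls back to the CR value.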
