This project is a proof of concept based on the sigstore/model-transparency-cli. It offers a Kubernetes/OpenShift operator designed to validate AI models before they are picked up by actual workloads. The project provides a webhook that adds an init container to perform model validation. The operator uses a custom resource to define how the models should be validated, for example with Sigstore or public keys.
- Model Validation: Ensures AI models are validated before they are used by workloads.
- Webhook Integration: A webhook automatically injects an init container into pods to perform the validation step.
- Custom Resource: Configurable ModelValidation custom resource to specify how models should be validated.
  - Supports methods like Sigstore, PKI, or public key validation.
- Continuous Validation: Optional periodic re-validation of models using Kubernetes native sidecars (requires Kubernetes 1.28+).
- Kubernetes 1.29+ or OpenShift 4.16+ (Kubernetes 1.28+ for continuous validation)
- Proper configuration for model validation (e.g., Sigstore, public keys)
- A signed model (e.g., check the testdata or examples folder)
The operator can be installed via kustomize using different deployment overlays.
For production environments with cert-manager integration:
Prerequisites: Install cert-manager first:

    kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.17.2/cert-manager.yaml

Then deploy the operator:

    kubectl apply -k https://github.com/sigstore/model-validation-operator/config/overlays/production
    # or local
    kubectl apply -k config/overlays/production

For testing environments with manual certificate management:

    kubectl apply -k https://github.com/sigstore/model-validation-operator/config/overlays/testing
    # or local
    kubectl apply -k config/overlays/testing

For development environments, deploying the operator without the webhook integration:

    kubectl apply -k https://github.com/sigstore/model-validation-operator/config/overlays/development
    # or local
    kubectl apply -k config/overlays/development

For OpenShift/OLM environments:

    kubectl apply -k https://github.com/sigstore/model-validation-operator/config/overlays/olm
    # or local
    kubectl apply -k config/overlays/olm

To uninstall the operator, use the same overlay you used for installation:

    kubectl delete -k config/overlays/production

The operator uses a kustomize-based overlay configuration structure that aims to separate generated content from environment-specific content:

    config/
    ├── crd/                      # Custom Resource Definitions
    ├── rbac/                     # RBAC permissions
    ├── webhook/                  # Webhook configuration
    ├── manager/                  # Controller manager deployment
    ├── manifests/                # OLM manifests
    ├── components/               # Reusable components
    │   ├── webhook/              # Webhook service component
    │   ├── certmanager/          # Certificate manager component
    │   ├── manual-tls/           # Manual TLS configuration
    │   ├── metrics-port/         # Metrics configuration
    │   └── webhook-replacements/ # Webhook configuration replacements
    └── overlays/                 # Environment-specific overlays
        ├── production/           # Production (cert-manager)
        ├── development/          # Development (operator only, no webhooks)
        ├── testing/              # Testing (manual, self-signed certs)
        └── olm/                  # OpenShift/OLM

The operator supports different certificate management approaches:
- Production: Uses cert-manager for automatic certificate management
  - ⚠️ Important: The default cert-manager configuration uses self-signed certificates
  - For production environments, you should configure cert-manager with a proper CA issuer
- Development: Does not use certificates; this overlay contains no webhook configurations
- Testing: Uses manual, self-signed certificate management for testing scenarios
- OLM: Uses OLM's built-in certificate management for OpenShift deployments
The webhook server requires TLS certificates. When you run the operator locally, certificates will be generated automatically:

    make run

This command will start the webhook server on https://localhost:9443, using the generated certs.
The project is at an early stage and therefore has some limitations.
- There is no validation or defaulting for the custom resource.
- The validation is namespace scoped and cannot be used across multiple namespaces.
- There are no status fields for the custom resource.
- The model and signature paths must be specified; there is no auto-discovery.
- TLS certificates used by the webhook are self-generated.
First, a ModelValidation CR must be created as follows:

    apiVersion: ml.sigstore.dev/v1alpha1
    kind: ModelValidation
    metadata:
      name: demo
    spec:
      config:
        sigstoreConfig:
          certificateIdentity: "https://github.com/sigstore/model-validation-operator/.github/workflows/sign-model.yaml@refs/tags/v0.0.2"
          certificateOidcIssuer: "https://token.actions.githubusercontent.com"
      model:
        path: /data/tensorflow_saved_model
        signaturePath: /data/tensorflow_saved_model/model.sig

Pods in the namespace that have the label validation.ml.sigstore.dev/ml: "<modelvalidation-cr-name>" will be validated using the specified ModelValidation CR. Note that the label must be present when the pod is created; pods that are labeled afterwards are not validated.

     apiVersion: v1
     kind: Pod
     metadata:
       name: whatever-workload
    +  labels:
    +    validation.ml.sigstore.dev/ml: "demo"
     spec:
       restartPolicy: Never
       containers:
       - name: whatever-workload
         image: nginx
         ports:
         - containerPort: 80
         volumeMounts:
         - name: model-storage
           mountPath: /data
       volumes:
       - name: model-storage
         persistentVolumeClaim:
           claimName: models

The operator supports continuous validation, which periodically re-validates models after the initial validation. This feature uses Kubernetes 1.28+ native sidecars with restartPolicy: Always.
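The difference between a one-shot init container and a native sidecar can be illustrated with a small Python sketch that patches a pod manifest held as a dictionary. This is not the operator's code: the function name and the image are hypothetical, and only the container name model-validation is taken from the log command in this README. The point it shows is that a native sidecar is an init container that additionally carries restartPolicy: Always.

```python
import copy

# Container name as it appears in the `kubectl logs` example of this README.
VALIDATOR_NAME = "model-validation"
# Hypothetical placeholder image, not the image the operator actually injects.
VALIDATOR_IMAGE = "example.invalid/model-validator:latest"


def inject_validator(pod: dict, continuous: bool = False) -> dict:
    """Return a copy of the pod with a validation init container prepended."""
    patched = copy.deepcopy(pod)
    spec = patched.setdefault("spec", {})
    container = {"name": VALIDATOR_NAME, "image": VALIDATOR_IMAGE}
    if continuous:
        # An init container with restartPolicy: Always is a native sidecar.
        container["restartPolicy"] = "Always"
    spec["initContainers"] = [container] + spec.get("initContainers", [])
    return patched


pod = {
    "metadata": {"name": "whatever-workload"},
    "spec": {"containers": [{"name": "whatever-workload", "image": "nginx"}]},
}
one_shot = inject_validator(pod)
sidecar = inject_validator(pod, continuous=True)
```

The original manifest is left untouched, which mirrors how a mutating webhook returns a patch rather than editing the submitted object in place.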
When continuous validation is enabled:
- The validation container runs as a native sidecar (not just an init container)
- After the initial validation succeeds, the container becomes ready
- The validation repeats at the specified interval
- On validation failure, the error is logged but the container continues running
- The readiness probe reflects the validation state
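The behavior listed above can be sketched as a small state holder in Python. This is an illustration under assumptions, not the validation container's implementation: the class and function names are hypothetical, and the sketch assumes that a failed re-validation flips readiness back to not ready, which is one reading of "the readiness probe reflects the validation state".

```python
import logging
from typing import Callable

logger = logging.getLogger("continuous-validation")


class ContinuousValidator:
    """Tracks readiness across repeated validation runs (hypothetical helper)."""

    def __init__(self, validate: Callable[[], None]) -> None:
        self._validate = validate  # expected to raise when validation fails
        self.ready = False         # value a readiness probe would report

    def run_once(self) -> bool:
        try:
            self._validate()
        except Exception as exc:
            # Log the failure and keep running; only the readiness state changes.
            logger.error("validation failed: %s", exc)
            self.ready = False
        else:
            self.ready = True
        return self.ready


# Simulated outcomes for three rounds: success, failure, success.
outcomes = iter([None, RuntimeError("the manifests do not match"), None])


def fake_validate() -> None:
    outcome = next(outcomes)
    if outcome is not None:
        raise outcome


validator = ContinuousValidator(fake_validate)
states = [validator.run_once() for _ in range(3)]
```

A real loop would sleep for the configured interval between calls to run_once; the sleep is left out here so the sketch runs instantly.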
Add the continuousValidation field to your ModelValidation CR:

    apiVersion: ml.sigstore.dev/v1alpha1
    kind: ModelValidation
    metadata:
      name: demo-continuous
    spec:
      config:
        sigstoreConfig:
          certificateIdentity: "user@example.com"
          certificateOidcIssuer: "https://token.actions.githubusercontent.com"
      model:
        path: /data/tensorflow_saved_model
        signaturePath: /data/tensorflow_saved_model/model.sig
      continuousValidation:
        enabled: true
        interval: "10m"  # Supports s, m, h units (e.g., "30s", "5m", "1h")

- Kubernetes 1.28 or later (for native sidecar support with restartPolicy: Always)
- The validation container will consume resources continuously (CPU/memory)
- Consider longer intervals (e.g., 10m, 1h) for production workloads
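To make the interval format concrete, here is a minimal Python sketch that converts values such as "30s", "5m", or "1h" into seconds. It is an assumption about the format based only on the units named in this README; the function name is hypothetical, and the operator may accept richer duration strings (for example combined units) that this sketch rejects.

```python
import re

# Units listed in the README: seconds, minutes, hours.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}
_INTERVAL = re.compile(r"(\d+)([smh])")


def parse_interval(value: str) -> int:
    """Convert an interval such as "10m" into a number of seconds."""
    match = _INTERVAL.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unsupported interval: {value!r}")
    amount, unit = match.groups()
    seconds = int(amount) * _UNIT_SECONDS[unit]
    if seconds == 0:
        raise ValueError("interval must be greater than zero")
    return seconds


examples = {text: parse_interval(text) for text in ("30s", "5m", "10m", "1h")}
```

Rejecting a zero interval is a design choice of this sketch: a re-validation loop with no pause would keep the sidecar busy all the time.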
The examples folder contains example files for testing the operator.
See examples/continuous-validation.yaml for a complete example.
Before running the examples, create a namespace for testing (separate from the operator namespace):

    kubectl create namespace testing

Important: Do not deploy examples in the operator namespace (e.g., model-validation-operator-system). The operator namespace has the label validation.ml.sigstore.dev/ignore: "true", which prevents the webhook from processing pods in that namespace.
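The interplay between the namespace ignore label and the pod label can be sketched in Python as follows. The two label keys come from this README; the function name and the order of the checks are assumptions made for illustration, and the real webhook may apply the namespace filter at a different layer (for example in the webhook's namespace selector).

```python
from typing import Mapping, Optional

ML_LABEL = "validation.ml.sigstore.dev/ml"
IGNORE_LABEL = "validation.ml.sigstore.dev/ignore"


def select_model_validation(
    namespace_labels: Mapping[str, str],
    pod_labels: Mapping[str, str],
) -> Optional[str]:
    """Return the name of the ModelValidation CR to apply, or None to skip the pod."""
    if namespace_labels.get(IGNORE_LABEL) == "true":
        return None  # pods in ignored namespaces are never processed
    return pod_labels.get(ML_LABEL) or None


in_testing = select_model_validation({}, {ML_LABEL: "demo"})
in_operator_ns = select_model_validation({IGNORE_LABEL: "true"}, {ML_LABEL: "demo"})
unlabeled = select_model_validation({}, {"app": "nginx"})
```

This is why a labeled example pod deployed into the operator namespace would start without any validation.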
- prepare.yaml: Contains a persistent volume claim and a job that downloads a signed test model.

      kubectl apply -f https://raw.githubusercontent.com/sigstore/model-validation-operator/main/examples/prepare.yaml -n testing
      # or local
      kubectl apply -f examples/prepare.yaml -n testing

- verify.yaml: Contains a model validation manifest for the validation of this model and a demo pod, which is provided with the appropriate label for validation.

      kubectl apply -f https://raw.githubusercontent.com/sigstore/model-validation-operator/main/examples/verify.yaml -n testing
      # or local
      kubectl apply -f examples/verify.yaml -n testing

- unsigned.yaml: Contains an example of a pod that would fail validation (for testing purposes).

      kubectl apply -f https://raw.githubusercontent.com/sigstore/model-validation-operator/main/examples/unsigned.yaml -n testing
      # or local
      kubectl apply -f examples/unsigned.yaml -n testing

After the example installation, the logs of the generated job should show a successful download:

    $ kubectl logs -n testing job/download-extract-model
    Connecting to github.com (140.82.121.3:443)
    Connecting to objects.githubusercontent.com (185.199.108.133:443)
    saving to '/data/tensorflow_saved_model.tar.gz'
    tensorflow_saved_mod  44% |**************                  | 3983k  0:00:01 ETA
    tensorflow_saved_mod 100% |********************************| 8952k  0:00:00 ETA
    '/data/tensorflow_saved_model.tar.gz' saved
    ./
    ./model.sig
    ./variables/
    ./variables/variables.data-00000-of-00001
    ./variables/variables.index
    ./saved_model.pb
    ./fingerprint.pb

The operator logs should show that a pod has been modified:

    $ kubectl logs -n model-validation-operator-system deploy/model-validation-controller-manager
    time=2025-01-20T22:13:05.051Z level=INFO msg="Starting webhook server on :9443"
    time=2025-01-20T22:13:47.556Z level=INFO msg="new request, path: /mutate-v1-pod"
    time=2025-01-20T22:13:47.557Z level=INFO msg="Execute webhook"
    time=2025-01-20T22:13:47.560Z level=INFO msg="Search associated Model Validation CR" pod=whatever-workload namespace=testing
    time=2025-01-20T22:13:47.591Z level=INFO msg="construct args"
    time=2025-01-20T22:13:47.591Z level=INFO msg="found sigstore config"

Finally, the test pod should be running and the injected init container should have completed validation successfully.

    $ kubectl logs -n testing whatever-workload model-validation
    INFO:__main__:Creating verifier for sigstore
    INFO:tuf.api._payload:No signature for keyid f5312f542c21273d9485a49394386c4575804770667f2ddb59b3bf0669fddd2f
    INFO:tuf.api._payload:No signature for keyid ff51e17fcf253119b7033f6f57512631da4a0969442afcf9fc8b141c7f2be99c
    INFO:tuf.api._payload:No signature for keyid ff51e17fcf253119b7033f6f57512631da4a0969442afcf9fc8b141c7f2be99c
    INFO:tuf.api._payload:No signature for keyid ff51e17fcf253119b7033f6f57512631da4a0969442afcf9fc8b141c7f2be99c
    INFO:tuf.api._payload:No signature for keyid ff51e17fcf253119b7033f6f57512631da4a0969442afcf9fc8b141c7f2be99c
    INFO:__main__:Verifying model signature from /data/model.sig
    INFO:__main__:all checks passed

If the model has been modified, verification fails and the workload is not executed:

    ERROR:__main__:verification failed: the manifests do not match

The model section of the ModelValidation CR supports additional options to control which files are included during verification:
| Field | Type | Description |
|---|---|---|
| ignorePaths | []string | List of file paths to exclude from verification |
| ignoreGitPaths | bool | When true, excludes git-related files (e.g., .git/, .gitignore) |
| ignoreUnsignedFiles | bool | When true, unsigned files will not cause verification to fail |
| allowSymlinks | bool | When true, symbolic links will be followed and their targets verified |
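As an illustration of what the first two options mean for file selection, the following Python sketch filters a list of file paths. It is not the verifier's implementation: the function name is hypothetical, and the set of git-related names contains only the two examples given in the table, while the real list may be longer.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

# Only the examples named in the table above; an assumption, not a complete list.
GIT_NAMES = {".git", ".gitignore"}


def files_to_verify(
    files: Iterable[str],
    ignore_paths: Iterable[str] = (),
    ignore_git_paths: bool = False,
) -> List[str]:
    """Return the files that remain after applying the ignore options."""
    ignored = [PurePosixPath(p) for p in ignore_paths]
    selected = []
    for name in files:
        path = PurePosixPath(name)
        # Skip the ignored path itself and everything below it.
        if any(path == p or p in path.parents for p in ignored):
            continue
        # Skip files that are, or live under, a git-related name.
        if ignore_git_paths and any(part in GIT_NAMES for part in path.parts):
            continue
        selected.append(name)
    return selected


model_files = [
    "/data/m/saved_model.pb",
    "/data/m/cache/x.bin",
    "/data/m/.git/config",
    "/data/m/.gitignore",
    "/data/m/tmpfile",
]
kept = files_to_verify(model_files, ["/data/m/cache", "/data/m/tmp"], ignore_git_paths=True)
```

Matching on whole path components rather than string prefixes keeps /data/m/tmpfile in the result even though /data/m/tmp is ignored.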
Example with ignore options:

    apiVersion: ml.sigstore.dev/v1alpha1
    kind: ModelValidation
    metadata:
      name: demo
    spec:
      config:
        sigstoreConfig:
          certificateIdentity: "https://github.com/sigstore/model-validation-operator/.github/workflows/sign-model.yaml@refs/tags/v0.0.2"
          certificateOidcIssuer: "https://token.actions.githubusercontent.com"
      model:
        path: /data/tensorflow_saved_model
        signaturePath: /data/tensorflow_saved_model/model.sig
        ignorePaths:
          - /data/tensorflow_saved_model/cache
          - /data/tensorflow_saved_model/tmp
        ignoreGitPaths: true
        allowSymlinks: true

Ignore options can also be specified or overridden on individual pods using annotations. Pod annotations take precedence over the ModelValidation CR settings.
| Annotation | Value | Description |
|---|---|---|
| validation.ml.sigstore.dev/ignore-paths | Comma-separated paths | Paths to exclude from verification |
| validation.ml.sigstore.dev/ignore-git-paths | "true" or "false" | Exclude git-related files |
| validation.ml.sigstore.dev/ignore-unsigned-files | "true" or "false" | Allow unsigned files |
| validation.ml.sigstore.dev/allow-symlinks | "true" or "false" | Follow symbolic links |
Example pod with annotation overrides:

    apiVersion: v1
    kind: Pod
    metadata:
      name: whatever-workload
      labels:
        validation.ml.sigstore.dev/ml: "demo"
      annotations:
        validation.ml.sigstore.dev/ignore-paths: "/data/tensorflow_saved_model/logs,/data/tensorflow_saved_model/tmp"
        validation.ml.sigstore.dev/ignore-git-paths: "true"
    spec:
      # ... rest of pod spec
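The precedence rule can be sketched in Python as follows. The annotation keys and CR field names are the ones listed in the tables above; everything else is an assumption made for illustration: the function name is hypothetical, unset boolean fields are assumed to default to false, and an ignore-paths annotation is assumed to replace the CR list rather than extend it.

```python
from typing import Any, Dict, Mapping

PREFIX = "validation.ml.sigstore.dev/"

# Annotation suffix -> field name in the model section of the CR.
BOOL_OPTIONS = {
    "ignore-git-paths": "ignoreGitPaths",
    "ignore-unsigned-files": "ignoreUnsignedFiles",
    "allow-symlinks": "allowSymlinks",
}


def effective_options(cr_model: Mapping[str, Any], annotations: Mapping[str, str]) -> Dict[str, Any]:
    """Combine CR settings with pod annotations; annotations win when present."""
    options: Dict[str, Any] = {"ignorePaths": list(cr_model.get("ignorePaths", []))}
    for field in BOOL_OPTIONS.values():
        options[field] = bool(cr_model.get(field, False))  # assumed default: false

    raw_paths = annotations.get(PREFIX + "ignore-paths")
    if raw_paths is not None:
        # Assumption: the annotation replaces the CR list instead of merging with it.
        options["ignorePaths"] = [p.strip() for p in raw_paths.split(",") if p.strip()]

    for suffix, field in BOOL_OPTIONS.items():
        raw = annotations.get(PREFIX + suffix)
        if raw in ("true", "false"):
            options[field] = raw == "true"
    return options


cr_model = {
    "ignorePaths": ["/data/tensorflow_saved_model/cache"],
    "ignoreGitPaths": False,
    "allowSymlinks": True,
}
annotations = {
    PREFIX + "ignore-paths": "/data/tensorflow_saved_model/logs,/data/tensorflow_saved_model/tmp",
    PREFIX + "ignore-git-paths": "true",
}
merged = effective_options(cr_model, annotations)
```

Options without a matching annotation, such as allowSymlinks here, keep the value from the ModelValidation CR.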