A pull-through cache for OCI container registries. It sits between your container runtime and upstream registries, transparently caching image layers and manifests on first pull. Subsequent pulls for the same content are served from the cache without contacting the upstream registry.
This exists because pulling the same images repeatedly across a fleet of machines is wasteful. Rate limits, network latency, and registry outages compound the problem. A pull-through cache eliminates redundant transfers and provides a degree of resilience against upstream unavailability for previously-cached content.
The proxy implements the OCI Distribution Spec read path. The upstream registry hostname is encoded in the request path:
```
GET /v2/{registry}/{image}/manifests/{reference}
GET /v2/{registry}/{image}/blobs/{digest}
```

For example, pulling ghcr.io/org/app:v1.2.3 through the proxy running on cache.internal:8080:

```
docker pull cache.internal:8080/ghcr.io/org/app:v1.2.3
```

On a cache miss, the proxy fetches from the upstream registry and simultaneously streams the response to the client and the cache store. The client is never blocked by cache writes -- if the cache store is slow or fails, the client stream continues uninterrupted.
On a cache hit with the S3 backend, the proxy returns an HTTP 307 redirect to a presigned S3 URL. The client fetches the blob directly from S3, removing the proxy from the data path entirely. This avoids the double-bandwidth penalty (S3→proxy→client) that streaming would incur. The OCI distribution spec explicitly allows 307 redirects for blob GETs, and Docker/containerd clients handle them correctly.
The filesystem backend continues to stream directly from disk (with full Range/206 support via http.ServeContent).
All upstream response headers (excluding hop-by-hop headers) are stored alongside the cached object and replayed on cache hits, making the proxy transparent to clients that depend on headers like ETag or Accept-Ranges.
Content-addressed objects (blobs and manifests resolved by digest) are immutable. They are always cached and served with Cache-Control: public, max-age=31536000, immutable.
Tag references are mutable -- a tag can point to a different digest at any time. Caching of tag manifests is therefore optional and controlled by configuration:
| Scenario | Cached | Condition |
|---|---|---|
| Blob (`/blobs/sha256:...`) | Always | Immutable |
| Manifest by digest | Always | Immutable |
| Manifest by tag | Configurable | `CACHE_TAG_MANIFESTS=true` |
| Manifest by `latest` | Configurable | Both `CACHE_TAG_MANIFESTS` and `CACHE_LATEST_TAG` enabled |
When tag manifests are cached, they are served with Cache-Control: public, max-age=2419200 (28 days). The latest tag uses a shorter Cache-Control: public, max-age=3600 (1 hour) to balance freshness with upstream rate limits.
Non-2xx upstream responses are forwarded to the client as-is and are never cached.
The proxy operates in one of two mutually exclusive modes, controlled by the required PROXY_MODE environment variable. Setting both or neither is a startup error.
Maximum availability, no auth enforcement. The proxy is an unauthenticated cache that serves whatever it has. Ideal for private Kubernetes clusters where the network is the security boundary.
- `/v2/` check: tries upstream; if unreachable, returns a static `200 OK` so clients can proceed with cached content.
- HEAD (cache hit): served immediately from cache. Auth header ignored.
- GET (cache hit): served from cache (S3 redirect or FS stream) immediately. Auth header ignored.
- Cache miss with upstream down: `502 Bad Gateway` -- can't serve what we don't have.
- Cache miss with upstream up: forwarded to upstream with the client's auth header. Response is tee-streamed to cache.
Security implications:
- Any client that can reach the proxy can pull any cached content.
- No token validation occurs on cache hits.
- The `/v2/` auth challenge is forwarded when upstream is up (clients still authenticate with upstream on cache misses), but when upstream is down auth is skipped entirely.
- Secure this mode with network policy, private subnets, or an authenticating reverse proxy in front.
Auth is always validated against upstream. The cache accelerates delivery of large layers but never bypasses access control. Upstream must be reachable for all requests.
- `/v2/` check: always forwarded to upstream. If unreachable → `502 Bad Gateway`.
- HEAD: always forwarded to upstream with the client's auth. Cache is not consulted — upstream HEAD is lightweight and gives the freshest headers.
- GET (cache hit): before serving from cache, a HEAD request is sent to upstream for the same resource with the client's auth:
  - Upstream `200` → auth valid, serve body from cache.
  - Upstream `401`/`403` → forwarded to client (auth rejected).
  - Upstream `404` → forwarded to client (resource removed upstream).
  - Upstream unreachable → `502` (no degraded fallback).
- GET (cache miss): forwarded to upstream with auth. Response is tee-streamed to cache.
Performance characteristics:
- Every cache-hit GET adds one upstream HEAD round-trip (~100-200ms).
- The blob body, however, is served from S3/FS instead of crossing the internet.
- For large images (1GB+ layers), the HEAD overhead is negligible compared to bandwidth savings.
- HEAD requests from clients are always forwarded to upstream (no cache benefit for HEAD).
All configuration is via environment variables.
| Variable | Default | Description |
|---|---|---|
| `PROXY_MODE` | (required) | `transparent` or `authenticated`. See Proxy modes. |
| `STORAGE_BACKEND` | `s3` | Storage backend. `s3` or `fs`. |
| `LISTEN_ADDR` | `:8080` (`:8443` with TLS) | Listen address. |
| `GENERATE_SELF_SIGNED_TLS` | `false` | Generate a self-signed TLS certificate on startup. |
| `LOG_LEVEL` | `info` | `debug`, `info`, `warn`, `error`. |
| `CACHE_TAG_MANIFESTS` | `true` | Cache manifests resolved by tag. |
| `CACHE_LATEST_TAG` | `false` | Cache the `latest` tag. |
| Variable | Default | Description |
|---|---|---|
| `S3_BUCKET` | `oci-cache` | Bucket name. Auto-created. |
| `S3_PREFIX` | -- | Key prefix for all objects. Allows multiple proxy instances to share a bucket. |
| `S3_FORCE_PATH_STYLE` | `true` | Path-style S3 URLs. |
| `S3_LIFECYCLE_DAYS` | `28` | Expire cached objects after this many days. `0` disables. |
| `AWS_ACCESS_KEY_ID` | -- | Standard SDK credential chain. |
| `AWS_SECRET_ACCESS_KEY` | -- | Standard SDK credential chain. |
| `AWS_REGION` | -- | Standard SDK credential chain. |
| `AWS_ENDPOINT_URL` | -- | S3-compatible endpoint override. |
Credentials, region, and endpoint are resolved through the standard AWS SDK default credential chain. IAM instance profiles, ECS task roles, and ~/.aws/credentials all work as expected.
Multiple proxy instances (each fronting a different upstream registry) can share a single S3 bucket by setting S3_PREFIX:
```
# Instance 1: ghcr.io proxy
S3_BUCKET=oci-cache S3_PREFIX=ghcr UPSTREAM_REGISTRY=https://ghcr.io ...

# Instance 2: Docker Hub proxy
S3_BUCKET=oci-cache S3_PREFIX=dockerhub UPSTREAM_REGISTRY=https://registry-1.docker.io ...
```

Objects are stored under `{prefix}/blobs/...` and `{prefix}/manifests/...`. The lifecycle policy is scoped to the prefix, so each instance manages its own expiry independently.
| Variable | Default | Description |
|---|---|---|
| `FS_ROOT` | `/data/oci-cache` | Root directory for cache. |
Objects are stored as files with .meta.json sidecar files containing content metadata and the full set of upstream response headers. Writes are atomic (temp file + rename). The S3 backend uses the same .meta.json sidecar pattern (stored as a separate S3 object alongside the data object) for parity between backends.
The included docker-compose.yml runs the proxy with SeaweedFS as an S3-compatible backend:
```
docker compose up
```

The proxy is available on localhost:8080. SeaweedFS provides S3 on port 8333.
Images are built with ko using a gcr.io/distroless/static-debian12:nonroot base image.
Build a local image:
```
KO_DOCKER_REPO=ko.local ko build ./cmd/oci-pull-through
```

Run it:

```
docker run -p 8080:8080 \
  -e STORAGE_BACKEND=s3 \
  -e AWS_ENDPOINT_URL=http://your-s3:9000 \
  -e AWS_ACCESS_KEY_ID=access \
  -e AWS_SECRET_ACCESS_KEY=secret \
  ko.local/oci-pull-through
```

Or build and run directly from source:

```
go build -o oci-pull-through ./cmd/oci-pull-through
STORAGE_BACKEND=fs FS_ROOT=/var/cache/oci ./oci-pull-through
```

GET /healthz returns 200 OK when the server is accepting connections.
For scratch containers (no shell, no curl), the binary includes a built-in health check client:
```
oci-pull-through -healthcheck
```

This is what the Docker Compose healthcheck uses. Exit code 0 on success, 1 on failure.
| Method | Path | Description |
|---|---|---|
| GET | /healthz | Health check. |
| GET | /v2/ | OCI version check. |
| GET, HEAD | /v2/{reg}/{name}/manifests/{ref} | Manifest. |
| GET, HEAD | /v2/{reg}/{name}/blobs/{digest} | Blob. |
| GET | /v2/{reg}/{name}/referrers/{digest} | Referrers (proxied to upstream). |
The proxy supports multi-segment image names (e.g., /v2/ghcr.io/org/sub/image/manifests/latest).
docker.io is automatically resolved to registry-1.docker.io for upstream requests.
By default the proxy serves both HTTP/1.1 and cleartext HTTP/2 (h2c) on the same port. TLS termination is expected to be handled by a reverse proxy or load balancer in front of this service.
Setting GENERATE_SELF_SIGNED_TLS=true generates an in-memory ECDSA P-256 self-signed certificate on startup (valid for 10 years, with SANs for localhost, host.docker.internal, 127.0.0.1, and ::1). The server switches to HTTPS with HTTP/2 and the default listen address changes to :8443.
This is useful for local development where the Docker daemon requires HTTPS to pull from a registry. No certificate files are written to disk.
On Docker Desktop the daemon runs inside a Linux VM. The VM's loopback (127.0.0.1 / [::1]) does not reach the host, so localhost:8443 will not work. Use host.docker.internal instead:
```
docker pull host.docker.internal:8443/docker.io/library/postgres:13
```

You will also need to add the registry to Docker's insecure registries list (since the certificate is self-signed). In Docker Desktop go to Settings → Docker Engine and add:

```json
{
  "insecure-registries": ["host.docker.internal:8443"]
}
```

Then apply and restart Docker Desktop.
Authorization headers from the client are forwarded to the upstream registry as-is. The proxy does not perform authentication or token exchange. If your upstream registry requires authentication, the client must provide valid credentials.
The process handles SIGINT and SIGTERM for graceful shutdown with a 30-second drain timeout.