This document provides a technical overview of the Portkey AI Gateway, an open-source proxy system for routing requests to 250+ language, vision, audio, and image models from 70+ AI providers. It explains the gateway's architecture, core components, request processing pipeline, and extension mechanisms.
Scope: This page covers the high-level system architecture and major subsystems. Detailed information on specific topics is covered on the corresponding subsystem pages.
The gateway is built on Hono, a lightweight web framework, and supports multiple runtime environments including Node.js, Cloudflare Workers, Bun, and Deno.
The AI Gateway serves as a unified interface for interacting with AI providers, abstracting away provider-specific APIs while adding reliability, security, and observability features. Key capabilities include:
| Feature Category | Capabilities |
|---|---|
| Provider Support | 70+ providers including OpenAI, Anthropic, Google Vertex AI, AWS Bedrock, Azure OpenAI, Cohere, and regional providers |
| Routing | Fallback, load balancing, conditional routing, single-target modes with recursive target support |
| Reliability | Automatic retries (up to 5 attempts), request timeouts, exponential backoff |
| Caching | Simple and semantic caching modes with configurable TTL |
| Security | SSRF protection, provider validation, schema enforcement, guardrails |
| Extensibility | Hooks system with 22+ built-in plugins for validation and transformation |
| Observability | Request/response logging, usage analytics, SSE log streaming (Node.js) |
Sources: README.md35-51 README.md192-209
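The Reliability row above combines a capped retry count with exponential backoff. A minimal sketch of that combination follows. The helper names (`backoffDelay`, `withRetry`), the option shape, and the base delay are assumptions made for illustration, not the gateway's actual implementation.

```typescript
// Illustrative sketch only: names and defaults here are assumptions,
// not the gateway's actual retry code.
const MAX_RETRIES = 5; // the feature table caps automatic retries at 5

interface RetryOptions {
  attempts: number;        // requested number of retries
  onStatusCodes: number[]; // statuses that should trigger another attempt
  baseDelayMs: number;     // assumed base delay for the backoff curve
}

// Delay before retry number `retryIndex` (0-based): base, 2x base, 4x base, ...
function backoffDelay(retryIndex: number, baseDelayMs: number): number {
  return baseDelayMs * Math.pow(2, retryIndex);
}

async function withRetry(
  call: () => Promise<{ status: number }>,
  opts: RetryOptions
): Promise<{ status: number; calls: number }> {
  const retries = Math.min(opts.attempts, MAX_RETRIES);
  let calls = 1;
  let result = await call();
  // Retry only while the status is in the configured list and retries remain.
  for (let i = 0; i < retries && opts.onStatusCodes.includes(result.status); i++) {
    await new Promise<void>((resolve) =>
      setTimeout(resolve, backoffDelay(i, opts.baseDelayMs))
    );
    result = await call();
    calls++;
  }
  return { status: result.status, calls };
}
```

Capping with `Math.min` keeps a misconfigured `attempts` value from producing unbounded retries, at the cost of silently lowering the requested count.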
The gateway is organized as a layered architecture with clear separation of concerns:
Sources: src/providers/index.ts78-151 src/globals.ts117-192 src/middlewares/requestValidator/index.ts80-230
Requests pass through multiple stages from entry to provider execution and response transformation:
The flow involves several decision points:
- Based on strategy.mode in configuration, the system chooses between single, loadbalance, fallback, or conditional routing
- Hooks execute at the beforeRequestHooks or afterRequestHooks stages
- Failed requests are retried up to MAX_RETRIES times based on status codes

Sources: src/globals.ts38-41 src/middlewares/requestValidator/index.ts80-104
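The first decision point, choosing targets from strategy.mode, can be sketched as follows. This is a simplified illustration: the `planTargets` function and the `weight` field are assumptions, and conditional routing is left out because it requires condition evaluation that this page does not describe.

```typescript
// Hypothetical sketch of mode-based target selection (conditional mode omitted).
type RoutingMode = "single" | "loadbalance" | "fallback";

interface Target {
  provider: string;
  weight?: number; // assumed field; defaults to 1 in this sketch
}

// Returns the ordered list of targets to try for one request.
function planTargets(
  mode: RoutingMode,
  targets: Target[],
  rand: () => number = Math.random
): Target[] {
  switch (mode) {
    case "single":
      // Only the first target is ever used.
      return targets.slice(0, 1);
    case "fallback":
      // Try every target in declaration order until one succeeds.
      return [...targets];
    case "loadbalance": {
      // Weighted random pick of exactly one target.
      const total = targets.reduce((sum, t) => sum + (t.weight ?? 1), 0);
      let r = rand() * total;
      for (const t of targets) {
        r -= t.weight ?? 1;
        if (r < 0) return [t];
      }
      return targets.slice(-1);
    }
  }
}
```

Passing `rand` as a parameter keeps the weighted pick deterministic under test.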
The gateway supports 70+ providers through a standardized interface pattern. Each provider implements two key interfaces:
Key Provider Examples:
| Provider | Configuration File | Base URL |
|---|---|---|
| OpenAI | providers/openai/ | https://api.openai.com/v1 |
| Anthropic | providers/anthropic/ | https://api.anthropic.com/v1 |
| Google Vertex AI | providers/google-vertex-ai/ | https://{region}-aiplatform.googleapis.com/v1 |
| AWS Bedrock | providers/bedrock/ | Constructed from AWS region |
| Azure OpenAI | providers/azure-openai/ | https://{resource}.openai.azure.com |
| IO Intelligence | providers/iointelligence/ | https://api.intelligence.io.solutions/api/v1 |
Sources: src/providers/index.ts1-154 src/providers/iointelligence/api.ts1-26
The gateway accepts configuration through HTTP headers and JSON configurations validated with Zod schemas.
The HEADER_KEYS constant defines all supported headers:
Sources: src/globals.ts13-28
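As a rough illustration of header-based configuration, the sketch below builds the headers for one request. The header names `x-portkey-provider` and `x-portkey-config` and the `buildGatewayHeaders` helper are assumptions for this example; the HEADER_KEYS constant in src/globals.ts is the authoritative list.

```typescript
// Assumed header names for illustration; consult HEADER_KEYS for the real set.
const PROVIDER_HEADER = "x-portkey-provider";
const CONFIG_HEADER = "x-portkey-config";

interface GatewayRequestOptions {
  apiKey: string;    // provider API key sent as a bearer token
  provider?: string; // simple case: route to one provider
  config?: object;   // advanced case: full JSON routing configuration
}

function buildGatewayHeaders(opts: GatewayRequestOptions): Record<string, string> {
  const headers: Record<string, string> = {
    "content-type": "application/json",
    authorization: `Bearer ${opts.apiKey}`,
  };
  if (opts.config !== undefined) {
    // A JSON config takes precedence over a bare provider name in this sketch.
    headers[CONFIG_HEADER] = JSON.stringify(opts.config);
  } else if (opts.provider !== undefined) {
    headers[PROVIDER_HEADER] = opts.provider;
  }
  return headers;
}
```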
The configSchema validates complex routing configurations with recursive target support:
Top-Level Configuration Fields:
| Field | Type | Purpose | Reference |
|---|---|---|---|
| strategy | Object | Routing strategy configuration (mode, conditions, on_status_codes) | 2.3 |
| provider | String | Provider identifier (must be in VALID_PROVIDERS) | 5 |
| api_key | String | Provider API key | |
| cache | Object | Caching configuration (mode: simple/semantic, max_age) | 3.3 |
| retry | Object | Retry configuration (attempts, on_status_codes) | 3.2 |
| targets | Array | Recursive array of configurations for nested routing | 2.3 |
| request_timeout | Number | Request timeout in milliseconds | 3.4 |
| custom_host | String | Custom base URL for provider (requires validation) | 4.5 |
| before_request_hooks | Array | Hooks to execute before provider call | 6.2 |
| after_request_hooks | Array | Hooks to execute after provider call | 6.2 |
| input_guardrails | Array | Input validation rules | 6.5 |
| output_guardrails | Array | Output validation rules | 6.5 |
Sources: src/middlewares/requestValidator/schema/config.ts11-179
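The recursive targets field means a validator has to walk nested configurations. The gateway does this with Zod; the dependency-free sketch below mirrors only the recursive shape, using a subset of the fields in the table. The `validateConfig` function, the provider subset, and the error format are assumptions for illustration.

```typescript
// Hand-rolled stand-in for the Zod schema; covers a subset of fields only.
const VALID_PROVIDERS = ["openai", "anthropic"]; // assumed subset
const VALID_MODES = ["single", "loadbalance", "fallback", "conditional"];

interface GatewayConfig {
  strategy?: { mode: string; on_status_codes?: number[] };
  provider?: string;
  api_key?: string;
  retry?: { attempts: number; on_status_codes?: number[] };
  request_timeout?: number;
  targets?: GatewayConfig[]; // nested configurations, validated recursively
}

// Returns a list of human-readable errors; an empty list means valid.
function validateConfig(config: GatewayConfig, path: string = "config"): string[] {
  const errors: string[] = [];
  if (config.strategy && !VALID_MODES.includes(config.strategy.mode)) {
    errors.push(`${path}.strategy.mode: unknown mode "${config.strategy.mode}"`);
  }
  if (config.provider !== undefined && !VALID_PROVIDERS.includes(config.provider)) {
    errors.push(`${path}.provider: unknown provider "${config.provider}"`);
  }
  if (config.retry && (config.retry.attempts < 0 || config.retry.attempts > 5)) {
    errors.push(`${path}.retry.attempts: must be between 0 and 5`);
  }
  // Each nested target is itself a full configuration.
  (config.targets ?? []).forEach((target, i) => {
    errors.push(...validateConfig(target, `${path}.targets[${i}]`));
  });
  return errors;
}
```

A fallback configuration with a nested load-balanced group would then be checked with a single call such as `validateConfig(myConfig)`, and any error message carries the path of the offending target.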
The requestValidator middleware performs comprehensive security checks:
Sources: src/middlewares/requestValidator/index.ts80-230
The isValidCustomHost function implements comprehensive SSRF protection with 15+ attack vector checks:
Protection Categories:
- Only http:// and https:// are allowed; file://, data:, gopher://, and ftp:// are blocked
- Cloud metadata endpoints are blocked: 169.254.169.254, metadata.google.internal, metadata.azure.com
- Internal domain suffixes are blocked: .local, .internal, .intranet, .corp, .test

The function uses pre-computed IP range boundaries for performance.
Trusted Hosts: The system allows configuration of trusted hosts via the TRUSTED_CUSTOM_HOSTS environment variable (comma-separated list). By default, it allows localhost, 127.0.0.1, ::1, and host.docker.internal for local development.
Sources: src/middlewares/requestValidator/index.ts6-450 src/utils/env.ts133-135
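A heavily reduced sketch of these checks is shown below. It covers only the protocol allowlist, the metadata hosts, the internal suffixes, a basic private IPv4 test, and the trusted-host override; the real isValidCustomHost performs many more checks. The function names, the check order, and the private-range logic are assumptions for illustration.

```typescript
// Simplified illustration; not the gateway's isValidCustomHost implementation.
const BLOCKED_HOSTS = new Set([
  "169.254.169.254",
  "metadata.google.internal",
  "metadata.azure.com",
]);
const BLOCKED_SUFFIXES = [".local", ".internal", ".intranet", ".corp", ".test"];
const DEFAULT_TRUSTED = ["localhost", "127.0.0.1", "::1", "host.docker.internal"];

// Merge the defaults with a comma-separated TRUSTED_CUSTOM_HOSTS value.
function parseTrustedHosts(envValue: string | undefined): Set<string> {
  const extra = (envValue ?? "")
    .split(",")
    .map((h) => h.trim().toLowerCase())
    .filter((h) => h.length > 0);
  return new Set([...DEFAULT_TRUSTED, ...extra]);
}

// Basic private/link-local IPv4 test (assumed ranges, dotted-decimal only).
function isPrivateIPv4(host: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 10 ||
    a === 127 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  );
}

function isValidCustomHostSketch(rawUrl: string, trusted: Set<string>): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // unparseable input is rejected outright
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  // IPv6 hostnames come back bracketed, e.g. "[::1]".
  const host = url.hostname.toLowerCase().replace(/^\[|\]$/g, "");
  if (trusted.has(host)) return true; // explicit trust wins over the blocklists
  if (BLOCKED_HOSTS.has(host)) return false;
  if (BLOCKED_SUFFIXES.some((suffix) => host.endsWith(suffix))) return false;
  if (isPrivateIPv4(host)) return false;
  return true;
}
```

Checking the trusted set before the blocklists is what lets host.docker.internal pass even though it ends in the otherwise blocked .internal suffix.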
The globals.ts file defines system-wide constants:
Key Constants:
Sources: src/globals.ts1-310
The Providers object in providers/index.ts maps provider identifiers to their configurations:
Each provider configuration includes:
- api: ProviderAPIConfig with getBaseURL, getEndpoint, and headers functions
- Endpoint-specific configurations: chatComplete, embed, complete, etc.
- responseTransforms: Functions to normalize responses to OpenAI format

Sources: src/providers/index.ts78-153 src/providers/iointelligence/index.ts1-27
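The shape described above can be sketched with a made-up provider. The function signatures are simplified assumptions (the real ones take richer option objects), and the `example` provider, its URL, its raw response format, and the `buildRequest` helper are all hypothetical.

```typescript
// Simplified, assumed signatures; the real ProviderAPIConfig is richer.
interface ProviderAPIConfig {
  getBaseURL: () => string;
  getEndpoint: (fn: string) => string;
  headers: (apiKey: string) => Record<string, string>;
}

interface ProviderConfig {
  api: ProviderAPIConfig;
  responseTransforms: Record<string, (raw: Record<string, unknown>) => unknown>;
}

const ENDPOINTS: Record<string, string> = {
  chatComplete: "/chat/completions",
  embed: "/embeddings",
  complete: "/completions",
};

// Hypothetical provider whose raw responses look like { text: "..." }.
const exampleProvider: ProviderConfig = {
  api: {
    getBaseURL: () => "https://api.example.com/v1",
    getEndpoint: (fn) => {
      const path = ENDPOINTS[fn];
      if (path === undefined) throw new Error(`unsupported function: ${fn}`);
      return path;
    },
    headers: (apiKey) => ({ Authorization: `Bearer ${apiKey}` }),
  },
  responseTransforms: {
    // Normalize the raw response into an OpenAI-style chat completion shape.
    chatComplete: (raw) => ({
      choices: [{ message: { role: "assistant", content: String(raw.text) } }],
    }),
  },
};

// Registry keyed by provider identifier, as in the Providers object.
const Providers: Record<string, ProviderConfig> = { example: exampleProvider };

function buildRequest(providerId: string, fn: string, apiKey: string) {
  const provider = Providers[providerId];
  if (!provider) throw new Error(`unknown provider: ${providerId}`);
  return {
    url: provider.api.getBaseURL() + provider.api.getEndpoint(fn),
    headers: provider.api.headers(apiKey),
  };
}
```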
The Environment utility provides runtime-specific environment variable access with file path resolution:
Features:
- Values beginning with /, ./, or ../ are treated as file paths and their contents are read

Supported Environment Variables (subset):
- PORT, NODE_ENV
- AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION
- AZURE_AUTH_MODE, AZURE_ENTRA_CLIENT_ID, AZURE_ENTRA_CLIENT_SECRET
- HTTP_PROXY, HTTPS_PROXY
- TRUSTED_CUSTOM_HOSTS

Sources: src/utils/env.ts1-147
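The file path resolution rule can be sketched as below. The `resolveEnvValue` helper, the whitespace trimming, and the fallback to the raw value when the file cannot be read are assumptions for illustration; src/utils/env.ts may behave differently in those edge cases.

```typescript
import { readFileSync } from "node:fs";

// Hypothetical helper: resolves one environment variable value.
// Values that look like paths are replaced by the file's contents.
function resolveEnvValue(
  value: string | undefined,
  readFile: (path: string) => string = (path) => readFileSync(path, "utf8")
): string | undefined {
  if (value === undefined) return undefined;
  const looksLikePath =
    value.startsWith("/") || value.startsWith("./") || value.startsWith("../");
  if (!looksLikePath) return value;
  try {
    // Trim so a trailing newline in a mounted secret file is not kept (assumption).
    return readFile(value).trim();
  } catch {
    // Assumed fallback: keep the raw value if the file is missing or unreadable.
    return value;
  }
}
```

This pattern suits secrets mounted as files, for example setting a key variable to a path under a container's secrets directory rather than to the key itself.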
| Characteristic | Details |
|---|---|
| Performance | <1ms latency overhead, 122KB footprint |
| Scale | Processes 10B+ tokens daily in production |
| Runtime Support | Node.js, Cloudflare Workers, Bun, Deno |
| Provider Count | 70+ providers, 250+ models |
| Reliability | Built-in retries, fallbacks, timeouts |
| Security | SSRF protection, schema validation, SOC2/HIPAA/GDPR compliant (enterprise) |
| Extensibility | Hooks system with plugin architecture |
| Observability | SSE log streaming, usage analytics (Node.js runtime) |
Sources: README.md35-51
For detailed information on specific subsystems:
Sources: README.md1-339