Overview

This document provides a technical overview of the Portkey AI Gateway, an open-source proxy system for routing requests to 250+ language, vision, audio, and image models from 70+ AI providers. It explains the gateway's architecture, core components, request processing pipeline, and extension mechanisms.

Scope: This page covers the high-level system architecture and major subsystems. For detailed information on specific topics:

Request routing strategies (fallback, load balancing, conditional) → 3.1
Provider integration details → 5
Hooks and plugins system → 6
Deployment options → 7

The gateway is built on Hono, a lightweight web framework, and supports multiple runtime environments including Node.js, Cloudflare Workers, Bun, and Deno.

Gateway Purpose and Capabilities

The AI Gateway serves as a unified interface for interacting with AI providers, abstracting away provider-specific APIs while adding reliability, security, and observability features. Key capabilities include:

Feature Category	Capabilities
Provider Support	70+ providers including OpenAI, Anthropic, Google Vertex AI, AWS Bedrock, Azure OpenAI, Cohere, and regional providers
Routing	Fallback, load balancing, conditional routing, single-target modes with recursive target support
Reliability	Automatic retries (up to 5 attempts), request timeouts, exponential backoff
Caching	Simple and semantic caching modes with configurable TTL
Security	SSRF protection, provider validation, schema enforcement, guardrails
Extensibility	Hooks system with 22+ built-in plugins for validation and transformation
Observability	Request/response logging, usage analytics, SSE log streaming (Node.js)

Sources: README.md:35-51, README.md:192-209

High-Level System Architecture

The gateway is organized as a layered architecture with clear separation of concerns.

Sources: src/providers/index.ts:78-151, src/globals.ts:117-192, src/middlewares/requestValidator/index.ts:80-230

Request Processing Flow

Requests pass through multiple stages, from entry through provider execution to response transformation.

The flow involves several decision points:

Strategy Selection: Based on strategy.mode in configuration, the system chooses between single, loadbalance, fallback, or conditional routing
Cache Hit/Miss: If caching is enabled and a cache hit occurs, provider execution is skipped
Hook Denial: Guardrail hooks can deny requests at beforeRequestHooks or afterRequestHooks stages
Retry Logic: Failed requests may be retried up to MAX_RETRIES times based on status codes
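The retry decision point can be sketched as two small pure functions. This is an illustration only, not the gateway's implementation: `RetrySettings`, `shouldRetry`, and `backoffDelayMs` are hypothetical names, the base delay and cap are assumed values, and `MAX_RETRIES` is set to 5 to match the limit stated in the capabilities table.

```typescript
// Hypothetical retry settings. The two fields mirror the `retry` config
// (attempts, on_status_codes); everything else here is an assumption.
interface RetrySettings {
  attempts: number;
  onStatusCodes: number[];
}

// Upper bound on retry attempts, matching the documented limit of 5.
const MAX_RETRIES = 5;

// Retry only while attempts remain and the status code is listed as retryable.
function shouldRetry(
  status: number,
  attemptsSoFar: number,
  retry: RetrySettings
): boolean {
  const allowed = Math.min(retry.attempts, MAX_RETRIES);
  return attemptsSoFar < allowed && retry.onStatusCodes.indexOf(status) !== -1;
}

// Exponential backoff: the delay doubles with each attempt, up to a cap.
// baseMs and capMs are illustrative defaults, not values from the gateway.
function backoffDelayMs(attempt: number, baseMs = 100, capMs = 10000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}
```

Keeping both functions free of I/O makes the policy easy to test separately from the code that actually re-sends the request.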

Sources: src/globals.ts:38-41, src/middlewares/requestValidator/index.ts:80-104

Provider Integration System

The gateway supports 70+ providers through a standardized interface pattern. Each provider implements two key interfaces:

Provider Registry Structure

Key Provider Examples:

Provider	Configuration File	Base URL
OpenAI	`providers/openai/`	`https://api.openai.com/v1`
Anthropic	`providers/anthropic/`	`https://api.anthropic.com/v1`
Google Vertex AI	`providers/google-vertex-ai/`	`https://{region}-aiplatform.googleapis.com/v1`
AWS Bedrock	`providers/bedrock/`	Constructed from AWS region
Azure OpenAI	`providers/azure-openai/`	`https://{resource}.openai.azure.com`
IO Intelligence	`providers/iointelligence/`	`https://api.intelligence.io.solutions/api/v1`

Sources: src/providers/index.ts:1-154, src/providers/iointelligence/api.ts:1-26

Configuration System

The gateway accepts configuration through HTTP headers and JSON configurations validated with Zod schemas.

Header-Based Configuration

The HEADER_KEYS constant in globals.ts defines all supported headers.

Sources: src/globals.ts:13-28
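As a rough illustration of header-based configuration, the sketch below assembles the headers a client might send. The header names `x-portkey-provider` and `x-portkey-config` are assumptions that this page does not list, so check them against HEADER_KEYS before relying on them. `buildGatewayHeaders` is a hypothetical helper, not part of the gateway.

```typescript
// Assumed header names; verify against HEADER_KEYS in src/globals.ts.
const PROVIDER_HEADER = 'x-portkey-provider';
const CONFIG_HEADER = 'x-portkey-config';

// Build request headers for a single-provider call, optionally attaching
// a JSON routing config serialized into one header value.
function buildGatewayHeaders(
  provider: string,
  apiKey: string,
  config?: Record<string, unknown>
): Record<string, string> {
  const headers: Record<string, string> = {
    'content-type': 'application/json',
    authorization: `Bearer ${apiKey}`,
  };
  headers[PROVIDER_HEADER] = provider;
  if (config !== undefined) {
    headers[CONFIG_HEADER] = JSON.stringify(config);
  }
  return headers;
}
```

A call such as `buildGatewayHeaders('openai', apiKey)` covers the simple case. Passing a config object adds the serialized JSON configuration alongside it.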

Configuration Schema

The configSchema validates complex routing configurations with recursive target support:

Top-Level Configuration Fields:

Field	Type	Purpose	Reference
`strategy`	Object	Routing strategy configuration (mode, conditions, on_status_codes)	2.3
`provider`	String	Provider identifier (must be in `VALID_PROVIDERS`)	5
`api_key`	String	Provider API key
`cache`	Object	Caching configuration (mode: simple/semantic, max_age)	3.3
`retry`	Object	Retry configuration (attempts, on_status_codes)	3.2
`targets`	Array	Recursive array of configurations for nested routing	2.3
`request_timeout`	Number	Request timeout in milliseconds	3.4
`custom_host`	String	Custom base URL for provider (requires validation)	4.5
`before_request_hooks`	Array	Hooks to execute before provider call	6.2
`after_request_hooks`	Array	Hooks to execute after provider call	6.2
`input_guardrails`	Array	Input validation rules	6.5
`output_guardrails`	Array	Output validation rules	6.5
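The recursive `targets` field is easiest to see in a concrete value. The sketch below declares a simplified, hypothetical `GatewayConfig` type covering a subset of the fields in the table, then nests a load-balanced group inside a fallback chain. The provider identifiers and status codes are illustrative assumptions, and the real schema is the Zod `configSchema`, not this interface.

```typescript
// Hypothetical, simplified view of the config shape (subset of fields).
interface GatewayConfig {
  strategy?: {
    mode: 'single' | 'loadbalance' | 'fallback' | 'conditional';
    on_status_codes?: number[];
  };
  provider?: string;
  api_key?: string;
  retry?: { attempts: number; on_status_codes?: number[] };
  cache?: { mode: 'simple' | 'semantic'; max_age?: number };
  request_timeout?: number;
  targets?: GatewayConfig[];
}

// Fallback chain whose second target is itself a load-balanced group.
const exampleConfig: GatewayConfig = {
  strategy: { mode: 'fallback', on_status_codes: [429, 500, 503] },
  retry: { attempts: 3 },
  targets: [
    { provider: 'openai', api_key: 'OPENAI_KEY_PLACEHOLDER', request_timeout: 10000 },
    {
      strategy: { mode: 'loadbalance' },
      targets: [
        { provider: 'anthropic', api_key: 'ANTHROPIC_KEY_PLACEHOLDER' },
        { provider: 'cohere', api_key: 'COHERE_KEY_PLACEHOLDER' },
      ],
    },
  ],
};

// Walk the recursive `targets` tree and collect the leaf providers in order.
function leafProviders(cfg: GatewayConfig): string[] {
  if (!cfg.targets || cfg.targets.length === 0) {
    return cfg.provider ? [cfg.provider] : [];
  }
  let out: string[] = [];
  for (const target of cfg.targets) {
    out = out.concat(leafProviders(target));
  }
  return out;
}
```

Because every entry in `targets` is a configuration in its own right, strategies can be composed to any depth.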

Sources: src/middlewares/requestValidator/schema/config.ts:11-179

Security and Validation

Request Validation Pipeline

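The requestValidator middleware checks each request before it is routed, covering provider validation, configuration schema enforcement, and custom host (SSRF) validation.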

Sources: src/middlewares/requestValidator/index.ts:80-230

SSRF Protection

The isValidCustomHost function implements comprehensive SSRF protection with 15+ attack vector checks:

Protection Categories:

Scheme Validation: Only http:// and https:// allowed; blocks file://, data:, gopher://, ftp://
Private IP Ranges: Blocks RFC1918 private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16)
Reserved IP Ranges: Blocks loopback (127.0.0.0/8), link-local (169.254.0.0/16), CGNAT (100.64.0.0/10)
Cloud Metadata Endpoints: Blocks 169.254.169.254, metadata.google.internal, metadata.azure.com
Internal TLDs: Blocks .local, .internal, .intranet, .corp, .test
Alternative IP Representations: Detects and blocks decimal (2130706433), hex (0x7f000001), octal (0177.0.0.1)
IPv6 Private Ranges: Blocks link-local (fe80::/10), ULA (fc00::/7), IPv4-mapped addresses
URL Obfuscation: Blocks credentials in URLs, encoded characters in hostname, suspicious characters
DNS Rebinding: Blocks excessive subdomain depth (>10 levels), trailing dots
Homograph Attacks: Validates ASCII-only hostnames to prevent Unicode lookalikes
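A few of these categories (scheme, embedded credentials, private and reserved IPv4 ranges) can be sketched as follows. This is not `isValidCustomHost`; `ipv4ToInt`, `BLOCKED_RANGES`, and `isBlockedHost` are hypothetical names, and the sketch ignores trusted hosts, IPv6, internal TLDs, and the other checks listed above. It relies on the WHATWG `URL` parser, which normalizes decimal, hex, and octal IPv4 forms to dotted-quad notation before the range check runs.

```typescript
// Convert a dotted-quad IPv4 string to an unsigned 32-bit integer, or null.
function ipv4ToInt(host: string): number | null {
  const parts = host.split('.');
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^(0|[1-9]\d{0,2})$/.test(part)) return null; // plain decimal octets only
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// Pre-computed [start, end] integer boundaries for blocked IPv4 ranges.
const BLOCKED_RANGES: Array<[number, number]> = [
  [0x0a000000, 0x0affffff], // 10.0.0.0/8
  [0xac100000, 0xac1fffff], // 172.16.0.0/12
  [0xc0a80000, 0xc0a8ffff], // 192.168.0.0/16
  [0x7f000000, 0x7fffffff], // 127.0.0.0/8 (loopback)
  [0xa9fe0000, 0xa9feffff], // 169.254.0.0/16 (link-local, cloud metadata)
  [0x64400000, 0x647fffff], // 100.64.0.0/10 (CGNAT)
];

// Returns true when the URL should be rejected by this partial check.
function isBlockedHost(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return true; // unparseable input is rejected
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return true;
  if (url.username !== '' || url.password !== '') return true;
  const ip = ipv4ToInt(url.hostname);
  if (ip === null) return false; // hostname-level checks are out of scope here
  return BLOCKED_RANGES.some(([start, end]) => ip >= start && ip <= end);
}
```

With the ranges stored as integers, each check is two comparisons, so a full pass over the table stays cheap even when it runs on every request.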

To keep validation fast, the function compares addresses against pre-computed IP range boundaries.

Trusted Hosts: The system allows configuration of trusted hosts via the TRUSTED_CUSTOM_HOSTS environment variable (comma-separated list). By default, it allows localhost, 127.0.0.1, ::1, and host.docker.internal for local development.
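Parsing the comma-separated list might look like the sketch below. `parseTrustedHosts` and `isTrustedHost` are hypothetical helpers. The sketch assumes that a configured list replaces the defaults rather than extending them, and that hosts are compared case-insensitively, neither of which is confirmed by this page.

```typescript
// Defaults named in the text above, used when the variable is unset or empty.
const DEFAULT_TRUSTED_HOSTS = ['localhost', '127.0.0.1', '::1', 'host.docker.internal'];

// Parse a comma-separated TRUSTED_CUSTOM_HOSTS value into a clean list.
// Assumption: a non-empty value replaces the defaults.
function parseTrustedHosts(raw: string | undefined): string[] {
  if (raw === undefined || raw.trim() === '') {
    return DEFAULT_TRUSTED_HOSTS.slice();
  }
  return raw
    .split(',')
    .map((host) => host.trim().toLowerCase())
    .filter((host) => host.length > 0);
}

// Case-insensitive membership test against the parsed list.
function isTrustedHost(hostname: string, trusted: string[]): boolean {
  return trusted.indexOf(hostname.toLowerCase()) !== -1;
}
```

The raw value is passed in as an argument so the same logic works whichever runtime supplies the environment.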

Sources: src/middlewares/requestValidator/index.ts:6-450, src/utils/env.ts:133-135

Core Components and Subsystems

Constants and Global Configuration

The globals.ts file defines system-wide constants:

Key Constants:

Sources: src/globals.ts:1-310

Provider Registry

The Providers object in providers/index.ts maps provider identifiers to their configurations:

Each provider configuration includes:

api: ProviderAPIConfig with getBaseURL, getEndpoint, headers functions
Function-specific configs: chatComplete, embed, complete, etc.
responseTransforms: Functions to normalize responses to OpenAI format
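To show how these three parts fit together, here is a deliberately simplified sketch. The types `SimpleProviderConfig` and `NormalizedChat`, the `example` registry key, the `/chat/completions` endpoint, the bearer-token header, and the raw `{ text }` response shape are all assumptions made for illustration. The gateway's real interfaces take richer arguments. Only the base URL is taken from the provider table above.

```typescript
// Hypothetical, simplified shapes; the gateway's real interfaces carry more.
interface NormalizedChat {
  choices: { message: { role: string; content: string } }[];
}

interface SimpleProviderConfig {
  api: {
    getBaseURL: () => string;
    getEndpoint: (fn: string) => string;
    headers: (apiKey: string) => Record<string, string>;
  };
  responseTransforms: {
    chatComplete: (raw: { text: string }) => NormalizedChat;
  };
}

const exampleProvider: SimpleProviderConfig = {
  api: {
    getBaseURL: () => 'https://api.intelligence.io.solutions/api/v1',
    // Assumed endpoint mapping for the chatComplete function.
    getEndpoint: (fn) => (fn === 'chatComplete' ? '/chat/completions' : '/' + fn),
    headers: (apiKey) => ({ Authorization: `Bearer ${apiKey}` }),
  },
  responseTransforms: {
    // Normalize a made-up raw payload into an OpenAI-style chat response.
    chatComplete: (raw) => ({
      choices: [{ message: { role: 'assistant', content: raw.text } }],
    }),
  },
};

// A registry maps provider identifiers to their configurations.
const registry: Record<string, SimpleProviderConfig> = { example: exampleProvider };

// Resolve the full request URL for a provider and function name.
function buildRequestUrl(providerId: string, fn: string): string {
  const provider = registry[providerId];
  if (!provider) throw new Error(`Unknown provider: ${providerId}`);
  return provider.api.getBaseURL() + provider.api.getEndpoint(fn);
}
```

In this shape, adding a provider amounts to writing one configuration object and registering it under an identifier. The routing code only ever touches the shared interface.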

Sources: src/providers/index.ts:78-153, src/providers/iointelligence/index.ts:1-27

Environment Variable Handling

The Environment utility provides runtime-specific environment variable access with file path resolution:

Features:

File Path Resolution: Values starting with /, ./, or ../ are treated as file paths and their contents are read
Runtime Detection: Returns Node.js environment variables or Hono context-based environment
Sensitive Value Support: Supports reading API keys, certificates, and tokens from files
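The file path resolution rule can be sketched as below. `resolveEnvValue` and `looksLikeFilePath` are hypothetical names, and two behaviors are assumptions: the file contents are trimmed, and the literal value is returned if the file cannot be read. The file reader is injected so the sketch does not depend on a particular runtime.

```typescript
// Reads a file and returns its contents; supplied by the caller.
type FileReader = (path: string) => string;

// Values starting with /, ./, or ../ are treated as file paths.
function looksLikeFilePath(value: string): boolean {
  return (
    value.indexOf('/') === 0 ||
    value.indexOf('./') === 0 ||
    value.indexOf('../') === 0
  );
}

// Resolve an environment value, reading from disk when it names a file.
function resolveEnvValue(
  value: string | undefined,
  readFile: FileReader
): string | undefined {
  if (value === undefined) return undefined;
  if (!looksLikeFilePath(value)) return value;
  try {
    return readFile(value).trim(); // assumption: trailing newline is dropped
  } catch {
    return value; // assumption: fall back to the literal value
  }
}
```

On Node.js the reader could be `(p) => fs.readFileSync(p, 'utf8')`. This is how a mounted secret file can supply an API key or certificate without exposing it directly in the environment.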

Supported Environment Variables (subset):

PORT, NODE_ENV
AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION
AZURE_AUTH_MODE, AZURE_ENTRA_CLIENT_ID, AZURE_ENTRA_CLIENT_SECRET
HTTP_PROXY, HTTPS_PROXY
TRUSTED_CUSTOM_HOSTS

Sources: src/utils/env.ts:1-147

System Characteristics

Characteristic	Details
Performance	<1ms latency overhead, 122KB footprint
Scale	Processes 10B+ tokens daily in production
Runtime Support	Node.js, Cloudflare Workers, Bun, Deno
Provider Count	70+ providers, 250+ models
Reliability	Built-in retries, fallbacks, timeouts
Security	SSRF protection, schema validation, SOC2/HIPAA/GDPR compliant (enterprise)
Extensibility	Hooks system with plugin architecture
Observability	SSE log streaming, usage analytics (Node.js runtime)

Sources: README.md:35-51

Next Steps

For detailed information on specific subsystems:

Core Architecture → 2: Deep dive into application entry points, middleware pipeline, and service layer
Routing Strategies → 3: Fallback, load balancing, conditional routing, retry mechanisms
Configuration → 4: Configuration schema, inheritance, environment variables
Provider Integration → 5: Provider architecture, specific provider implementations
Hooks and Plugins → 6: Extension system, built-in plugins, custom plugin development
Deployment → 7: Runtime environments, build pipeline, monitoring

Sources: README.md:1-339