Envoy AI Gateway Fundamentals
Envoy AI Gateway extends Envoy Gateway to provide a unified API gateway for generative AI services. It translates between client-facing APIs (e.g., OpenAI-compatible) and backend-specific APIs (OpenAI, Anthropic, AWS Bedrock, Azure OpenAI, GCP Vertex AI, Cohere, etc.).
Resource Hierarchy
GatewayClass (Gateway API)
  -> Gateway (Gateway API)
    -> AIGatewayRoute (AI Gateway CRD)
      -> rules with matches + backendRefs
        -> AIServiceBackend (AI Gateway CRD)
          -> Backend (Envoy Gateway) or InferencePool (Gateway API extension)
Core AI Gateway CRDs (aigateway.envoyproxy.io/v1alpha1)
| CRD | Purpose |
|---|---|
| AIGatewayRoute | Binds AI backends to a Gateway. Defines routing rules (header matches, e.g. x-ai-eg-model), backend refs, timeouts, and optional LLM cost capture. Generates HTTPRoute and HTTPRouteFilter under the hood. |
| AIServiceBackend | Describes a single AI backend: its API schema and the Envoy Gateway Backend it attaches to. backendRef must be a Backend (gateway.envoyproxy.io); it cannot reference a Kubernetes Service directly. Use a Backend with FQDN endpoints (e.g., to a K8s service DNS) for in-cluster backends. |
| BackendSecurityPolicy | Backend authentication: API key, AWS credentials, Azure credentials, GCP credentials, Anthropic API key. Attaches to AIServiceBackend or InferencePool. Only one BackendSecurityPolicy can target a given AIServiceBackend or InferencePool; multiple policies cause reconciliation failure. |
| GatewayConfig | Gateway-scoped ExtProc config (resources, env vars). Reference via annotation aigateway.envoyproxy.io/gateway-config: <name> on the Gateway. Same namespace as Gateway. |
| MCPRoute | Model Context Protocol routing for MCP tools. |
| QuotaPolicy | Rate limiting and quota management. |
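The AIGatewayRoute, AIServiceBackend, and Backend chain described above can be sketched as three manifests. This is a minimal sketch, not a definitive configuration: the resource names (my-route, my-gateway, my-backend-openai), the namespace, and the model value gpt-4o-mini are hypothetical, and the field layout follows the v1alpha1 API as commonly shown in the project's examples, so verify it against the CRD version installed in your cluster.

```yaml
# Hypothetical names throughout; adjust to your environment.
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIGatewayRoute
metadata:
  name: my-route
  namespace: default
spec:
  parentRefs:
    - name: my-gateway              # assumed Gateway in the same namespace
      kind: Gateway
      group: gateway.networking.k8s.io
  rules:
    - matches:
        - headers:
            - type: Exact
              name: x-ai-eg-model   # header populated from the request body
              value: gpt-4o-mini    # hypothetical model name
      backendRefs:
        - name: my-backend-openai   # refers to the AIServiceBackend below
---
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIServiceBackend
metadata:
  name: my-backend-openai
  namespace: default
spec:
  schema:
    name: OpenAI
  backendRef:                       # must be an Envoy Gateway Backend, not a Service
    name: my-backend-openai
    kind: Backend
    group: gateway.envoyproxy.io
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: Backend
metadata:
  name: my-backend-openai
  namespace: default
spec:
  endpoints:
    - fqdn:
        hostname: api.openai.com
        port: 443
```

For an in-cluster model server, the same Backend shape applies with the FQDN set to the Kubernetes Service DNS name instead of a provider hostname.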
Envoy Gateway Resources Used by AI Gateway
- Gateway, GatewayClass, HTTPRoute: standard Gateway API
- Backend: external endpoints (FQDN, port) for AI providers
- BackendTLSPolicy: TLS validation for Backend (use gateway.networking.k8s.io/v1 with Envoy Gateway v1.6+)
- ClientTrafficPolicy: client-facing settings (buffer limits, timeouts). Required for AI workloads; set connection.bufferLimit (e.g., 50Mi) because the default 32KiB is too small for AI requests.
- EnvoyProxy: customizes the Envoy deployment
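As an illustration of the two policies that most often need attention, the sketch below raises the client buffer limit on a Gateway and enables TLS validation for an external Backend. The names my-gateway and my-backend-openai and the hostname api.openai.com are assumptions for the example; the BackendTLSPolicy apiVersion follows the note above about Envoy Gateway v1.6+, and older installations may need an earlier version of that API.

```yaml
# Hypothetical names; both policies live in the same namespace as their targets.
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: ClientTrafficPolicy
metadata:
  name: my-gateway-buffer-limit
  namespace: default
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: my-gateway
  connection:
    bufferLimit: 50Mi               # default 32KiB is too small for AI requests
---
apiVersion: gateway.networking.k8s.io/v1
kind: BackendTLSPolicy
metadata:
  name: my-backend-openai-tls
  namespace: default
spec:
  targetRefs:
    - group: gateway.envoyproxy.io
      kind: Backend
      name: my-backend-openai
  validation:
    wellKnownCACertificates: System # trust the system CA bundle
    hostname: api.openai.com        # hostname checked against the server certificate
```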
API Schemas (schema.name in AIServiceBackend)
Supported values (from ai-gateway codebase):
- OpenAI — OpenAI API, OpenAI-compatible backends
- Cohere — Cohere API
- AWSBedrock — AWS Bedrock
- AzureOpenAI — Azure OpenAI
- GCPVertexAI — GCP Vertex AI (Gemini)
- GCPAnthropic — Anthropic on GCP Vertex AI
- Anthropic — Native Anthropic API
- AWSAnthropic — Anthropic on AWS Bedrock
Routing Model
- x-ai-eg-model header: The AI Gateway filter extracts the model from the request body and injects it into this header. Use it in AIGatewayRoute matches to route by model.
- BackendRefs: Reference AIServiceBackend by name (default). Can also reference InferencePool (group: inference.networking.k8s.io, kind: InferencePool) for self-hosted models.
- Priority: Use priority in backendRefs for failover (lower number = higher priority).
- Weight: Use weight for traffic splitting across backends.
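The routing options above might be combined in one AIGatewayRoute as follows. This is a sketch under assumptions: all resource and model names are hypothetical, each backendRef name is assumed to point at an existing AIServiceBackend (or InferencePool), and the InferencePool reference is kept in its own rule here as a conservative choice.

```yaml
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIGatewayRoute
metadata:
  name: my-route
  namespace: default
spec:
  parentRefs:
    - name: my-gateway
      kind: Gateway
      group: gateway.networking.k8s.io
  rules:
    # Failover: the lower priority number is tried first.
    - matches:
        - headers:
            - type: Exact
              name: x-ai-eg-model
              value: gpt-4o-mini          # hypothetical model name
      backendRefs:
        - name: my-backend-openai
          priority: 0
        - name: my-backend-azure
          priority: 1
    # Traffic splitting: roughly 80/20 between two backends.
    - matches:
        - headers:
            - type: Exact
              name: x-ai-eg-model
              value: my-split-model       # hypothetical model name
      backendRefs:
        - name: my-backend-a
          weight: 80
        - name: my-backend-b
          weight: 20
    # Self-hosted model served through an InferencePool.
    - matches:
        - headers:
            - type: Exact
              name: x-ai-eg-model
              value: my-self-hosted-model # hypothetical model name
      backendRefs:
        - group: inference.networking.k8s.io
          kind: InferencePool
          name: my-inference-pool
```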
BackendSecurityPolicy Types
| Type | Use Case |
|---|---|
| APIKey | OpenAI, generic API key in Authorization header |
| AnthropicAPIKey | Anthropic (x-api-key header) |
| AzureAPIKey | Azure OpenAI (api-key header) |
| AzureCredentials | Azure OpenAI with OAuth/client secret |
| AWSCredentials | AWS Bedrock (IRSA, Pod Identity, or credentials file) |
| GCPCredentials | GCP Vertex AI (service account or workload identity) |
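For the APIKey type, a policy and its Secret could look like the sketch below. The names are hypothetical, and the Secret key apiKey and the spec.apiKey.secretRef layout are taken from the project's basic examples as best recalled, so confirm them against the installed CRD before relying on them.

```yaml
# Hypothetical names; replace the placeholder key value before applying.
apiVersion: v1
kind: Secret
metadata:
  name: my-backend-openai-apikey
  namespace: default
type: Opaque
stringData:
  apiKey: REPLACE_WITH_API_KEY      # placeholder, not a real key
---
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: BackendSecurityPolicy
metadata:
  name: my-backend-openai-apikey
  namespace: default
spec:
  targetRefs:                       # at most one policy may target a given backend
    - group: aigateway.envoyproxy.io
      kind: AIServiceBackend
      name: my-backend-openai
  type: APIKey
  apiKey:
    secretRef:
      name: my-backend-openai-apikey
      namespace: default
```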
Two-Tier Gateway Pattern
- Tier One Gateway: Central entry point; handles auth, top-level routing, global rate limiting.
- Tier Two Gateway: Fine-grained control over self-hosted models; InferencePool with endpoint picker for LLM optimization.
Naming Conventions
- Use kebab-case for resource names
- AIServiceBackend and Backend often share the same name for clarity
- BackendSecurityPolicy names typically indicate the provider, e.g. my-backend-openai-apikey
Implementation Notes
- AIGatewayRoute generates an HTTPRoute (same name) and HTTPRouteFilters (host rewrite, 404 fallback). The AI Gateway controller uses the Envoy Gateway Extension Server to fine-tune xDS.
- ExtProc sidecar: AI Gateway injects an External Processor as a sidecar in the Envoy Proxy Pod. It reads a filter config Secret, performs request/response transformation, and injects provider credentials. The model name is extracted from the request body and set in x-ai-eg-model for routing.
Checklist
- Understand AIGatewayRoute → AIServiceBackend → Backend chain (Backend required, not K8s Service)
- Know which schema.name matches your provider
- At most one BackendSecurityPolicy per AIServiceBackend or InferencePool
- BackendSecurityPolicy required for cloud providers (OpenAI, Anthropic, AWS, Azure, GCP)
- ClientTrafficPolicy with bufferLimit for AI workloads
- BackendTLSPolicy for HTTPS backends (hostname validation)