eg-production-guide
SKILL.md
Envoy Gateway Production Deployment
Deployment Modes
Single Tenant (Default)
- One GatewayClass per Envoy Gateway controller
- Simplest model; suitable for most single-team deployments
- All Gateways share the same controller and Envoy Proxy fleet
Multi-Tenant
- Deploy separate Envoy Gateway controllers per tenant namespace
- Each controller must have a unique controller name in its GatewayClass
- Provides strong tenant isolation at the control plane level
- Install via separate Helm releases with distinct
--set config.envoyGateway.gateway.controllerName=...
Gateway Namespace Mode
- Envoy Proxy pods deploy in the Gateway's namespace instead of the controller namespace
- Provides stronger workload isolation: proxy runs alongside the application
- Enables JWT authentication between proxy and controller for hardened communication
- Enable with
envoyGateway.provider.kubernetes.deploy.type: Namespace
Merged Gateways
- Merge listeners from multiple Gateway resources into a single Envoy Proxy fleet
- All merged Gateways share a single IP address / load balancer
- Useful for consolidating ingress when teams own different Gateways but share infrastructure
- Enable with
mergeGateways: trueon the GatewayClass parametersRef (EnvoyProxy)
Performance Tuning
- Connection timeouts: set explicitly in ClientTrafficPolicy and BackendTrafficPolicy. Never rely on Envoy defaults.
timeout.http.requestTimeout— total time for the client to send a complete requesttimeout.http.idleTimeout— close connections idle longer than this
- HTTP/2 max concurrent streams: limit to 100 to prevent a single connection from monopolizing resources
- Buffer limits: set to 32 KiB for both listener and cluster buffers to cap memory under load
- Configure via EnvoyProxy
spec.bootstrapor EnvoyPatchPolicy
- Configure via EnvoyProxy
- Resource requests/limits: always set CPU and memory on Envoy Proxy pods via EnvoyProxy
spec.provider.kubernetes.envoyDeployment.container.resources - Horizontal scaling: use HPA on the Envoy Proxy Deployment; scale on CPU utilization (target 60-70%)
- Keep-alive: enable TCP keep-alive on backend connections to avoid connection resets through cloud load balancers
Observability
Access Logging
- Configure via EnvoyProxy
spec.telemetry.accessLog - Sinks: File (stdout/path) or OpenTelemetry (gRPC collector)
- Use structured JSON format for machine parsing
- Include at minimum: method, path, response code, duration, upstream host
Metrics
- Expose Prometheus metrics via EnvoyProxy
spec.telemetry.metrics - Scrape from Envoy Proxy pods on the admin port (default 19001)
- Key metrics:
envoy_http_downstream_rq_total,envoy_http_downstream_rq_xx,envoy_cluster_upstream_rq_time - Enable Envoy Gateway controller metrics for control plane health
Tracing
- Configure distributed tracing via EnvoyProxy
spec.telemetry.tracing - Export to OpenTelemetry collector (gRPC or HTTP)
- Set appropriate sampling rate: 1-10% in production, 100% in staging
- Propagate trace context headers (
traceparent,tracestate)
Operations
Installation
helm install eg oci://docker.io/envoyproxy/gateway-helm \
--version v1.7.0 \
-n envoy-gateway-system \
--create-namespace
- Always pin Helm chart versions — never use
latestor omit--version - Use
helm upgrade --installfor idempotent deployments
GitOps
- Manage all Gateway API resources via GitOps (ArgoCD, Flux)
- Store Gateway, Route, and Policy manifests in version control
- Implement mandatory PR reviews for all gateway configuration changes
- Use SCM branch protection rules on the main branch
Upgrade Strategy
- Upgrade Envoy Gateway controller first, then verify CRD compatibility
- Test upgrades in a staging environment that mirrors production topology
- Review release notes for breaking changes in CRD schemas or default behavior
- Back up CRD instances before upgrading (
kubectl get -o yaml)
Weekly Installs
1
Repository
missberg/envoy-skillsFirst Seen
5 days ago
Security Audits
Installed on
amp1
cline1
opencode1
cursor1
kimi-cli1
codex1