ai-gateway

Purpose

This skill manages an AI gateway for routing, securing, and monitoring AI service requests in ML operations, ensuring efficient traffic handling, API security, and performance tracking within the aimlops cluster.

When to Use

Use this skill when building ML pipelines that require centralized routing of AI requests, such as in production environments with multiple AI models, to enforce security policies, monitor traffic, or scale API endpoints. Apply it in scenarios involving microservices for AI inference or when integrating with tools like Kubernetes for aimlops workflows.

Key Capabilities

Routing: Dynamically route requests to AI services based on rules, using path-based or header-based matching.
Security: Enforce authentication, rate limiting, and encryption via JWT or API keys.
Monitoring: Track metrics like request latency and error rates through integrated logging and Prometheus exporters.
Configuration: Support YAML-based configs for defining routes, e.g., specifying source and destination endpoints.
Scalability: Handle load balancing across multiple AI backends with automatic failover.

Usage Patterns

To use this skill, first set up the AI gateway via CLI or API, then define routes and security rules. Always authenticate requests using the $AI_GATEWAY_API_KEY environment variable. For CLI usage, initialize with ai-gateway-cli init --config path/to/config.yaml, then apply changes with ai-gateway-cli apply. In code, import the SDK and call methods like createRoute() for programmatic setup. Monitor usage by querying metrics endpoints periodically.

Common Commands/API

CLI Commands:
- Initialize gateway: ai-gateway-cli init --cluster aimlops --key $AI_GATEWAY_API_KEY
- Add a route: ai-gateway-cli add-route --path /predict --target http://ai-service:8080 --method POST
- Secure an endpoint: ai-gateway-cli secure --endpoint /predict --auth jwt --rate-limit 100/min
- View metrics: ai-gateway-cli metrics --format json
API Endpoints:
- Create route: POST /api/v1/routes with body { "path": "/predict", "target": "http://ai-service:8080", "method": "POST" }
- Update security: PUT /api/v1/security/{endpoint} with body { "authType": "jwt", "rateLimit": 100 }
- Get metrics: GET /api/v1/metrics?type=latency

Code Snippets:

import requests
headers = {'Authorization': f'Bearer {os.environ.get("AI_GATEWAY_API_KEY")}'}
response = requests.post('http://gateway:8080/api/v1/routes', json={"path": "/predict", "target": "http://ai-service:8080"}, headers=headers)

export AI_GATEWAY_API_KEY=your_api_key_here
ai-gateway-cli add-route --path /chat --target http://llm-service:5000

Config Formats: Use YAML for configurations, e.g.:

routes:
  - path: /predict
    target: http://ai-service:8080
    methods: [POST]
security:
  - endpoint: /predict
    auth: jwt
    rateLimit: 100

Integration Notes

Integrate with aimlops by deploying the gateway as a sidecar or standalone service in your cluster. For Kubernetes, add annotations to pods, e.g., kubectl annotate pod ai-pod aimlops/gateway=true. Use the SDK to link with other AI tools: import and initialize with AI_Gateway(api_key=os.environ['AI_GATEWAY_API_KEY']).connect(cluster='aimlops'). Ensure compatibility by matching tags like "ai" and "mlops". For external services, set up webhooks by configuring the gateway's callback URL in your config, e.g., add callback: http://external-service/webhook in YAML.

Error Handling

Handle errors by checking HTTP status codes from API responses; for example, 401 indicates authentication failure, so retry with headers['Authorization'] = f'Bearer {new_key}'. For CLI, parse output errors like "Error: Invalid route path" and correct inputs. Common issues include missing API keys—always verify if not os.environ.get('AI_GATEWAY_API_KEY'): raise ValueError('API key required'). Log errors using the gateway's built-in logger: enable with ai-gateway-cli config --log-level debug, then monitor for patterns like rate limit exceedances and implement retries with exponential backoff in code.

Concrete Usage Examples

Route AI Requests: To route prediction requests to an ML model, first export your API key, then use the CLI: export AI_GATEWAY_API_KEY=abc123; ai-gateway-cli add-route --path /ml-predict --target http://model-service:8000. Verify with a curl request: curl -H "Authorization: Bearer abc123" http://gateway:8080/ml-predict -d '{"input": "data"}'.
Secure and Monitor API: Secure an endpoint and monitor traffic by running: ai-gateway-cli secure --endpoint /chat --auth api-key --rate-limit 50/min. Then, query metrics: ai-gateway-cli metrics --endpoint /chat. In code, handle it as: ```python import requests; headers = {'Authorization': f'Bearer {os.environ["AI_GATEWAY_API_KEY"]}'}; requests.get('http://gateway:8080/api/v1/metrics', headers=headers)

Graph Relationships

Related to: aimlops (cluster), ai (tag), mlops (tag)
Depends on: authentication services for security
Used by: AI services for routing and monitoring
Integrates with: Kubernetes for deployment, Prometheus for metrics collection