AI Architecture Patterns

Installation
SKILL.md

AI Architecture Patterns Skill

Purpose

Provide expert guidance on selecting and implementing enterprise AI architecture patterns for production systems. This skill contains battle-tested patterns from real-world deployments and the AI Architect Academy.

When to Use

  • Designing new AI systems
  • Evaluating architecture options
  • Selecting patterns for specific use cases
  • Understanding tradeoffs between approaches
  • Getting implementation guidance

Core Patterns Library

1. AI Gateway Pattern

Problem: Multiple AI services with inconsistent interfaces, no centralized security, and limited observability create management complexity and security risks.

Solution: Deploy a centralized AI gateway that provides unified authentication, rate limiting, request/response logging, and model routing for all AI services.

Key Components:

  • API Gateway (Kong, AWS API Gateway, OCI API Gateway)
  • Authentication Service (OAuth2, API Keys)
  • Rate Limiter (Redis-based)
  • Request Logger (OpenTelemetry)
  • Model Router

When to Use:

  • Multiple AI providers in your stack
  • Need centralized security controls
  • Want unified logging and monitoring
  • Cost allocation across teams

When NOT to Use:

  • Single AI provider with simple use case
  • Ultra-low latency requirements (<10ms)
  • Early prototyping

2. RAG Production Pattern

Problem: LLMs hallucinate and lack access to enterprise-specific knowledge, making them unreliable for business-critical applications.

Solution: Implement a RAG pipeline with document ingestion, chunking, embedding, vector storage, retrieval, and augmented generation with source citations.

Key Components:

  • Document Ingestion Pipeline
  • Text Chunking Service (semantic, fixed, hybrid)
  • Embedding Model (OpenAI, Cohere, Local)
  • Vector Database (Pinecone, Weaviate, pgvector)
  • Retrieval Service with Reranking
  • LLM with RAG prompt template

When to Use:

  • Customer support knowledge base
  • Internal document Q&A
  • Legal/compliance document analysis
  • Technical documentation assistants

When NOT to Use:

  • General creative writing
  • Real-time frequently changing data
  • Very small document corpus (<100 docs)

Implementation Tips:

  • Start with fixed-size chunks (512-1024 tokens)
  • Add metadata extraction for filtering
  • Implement hybrid search (keyword + semantic)
  • Use reranking for improved precision

3. Multi-Agent Orchestration Pattern

Problem: Complex tasks require multiple specialized capabilities that exceed what a single LLM prompt can handle reliably.

Solution: Decompose complex workflows into specialized agents with an orchestrator that coordinates task distribution, handoffs, and result aggregation.

Key Components:

  • Orchestrator Agent (workflow coordinator)
  • Specialized Worker Agents (domain experts)
  • Task Queue (for async processing)
  • State Management (context preservation)
  • Handoff Protocol (agent-to-agent communication)
  • Result Aggregator

When to Use:

  • Complex workflows with 5+ distinct steps
  • Tasks requiring different expertise
  • Autonomous systems
  • Workflows with branching logic

When NOT to Use:

  • Simple single-step tasks
  • When cost is primary constraint
  • High-volume, low-complexity operations

Frameworks:

  • LangGraph (graph-based orchestration)
  • Claude Agent SDK
  • AutoGen / CrewAI

4. MCP Server Architecture

Problem: N agents x M tools = N*M integrations. Each AI agent needs custom code to integrate with each tool.

Solution: Implement MCP (Model Context Protocol) servers that provide standardized interfaces for tools, resources, and prompts.

Key Components:

  • MCP Server (Node.js or Python)
  • Tool Definitions (JSON Schema)
  • Resource Providers
  • Prompt Templates
  • Transport Layer (stdio, SSE)

When to Use:

  • Building tools for multiple AI agents
  • Creating reusable integrations
  • Claude Code environments
  • Enterprise tool standardization

Implementation:

import { Server } from '@modelcontextprotocol/sdk/server';

const server = new Server({
  name: 'my-mcp-server',
  version: '1.0.0'
});

server.tool('search', {
  description: 'Search documents',
  inputSchema: {
    type: 'object',
    properties: {
      query: { type: 'string' }
    }
  },
  handler: async ({ query }) => {
    // Implementation
  }
});

5. LLMOps Pipeline Pattern

Problem: LLM applications lack mature DevOps practices, leading to unpredictable quality and difficult rollbacks.

Solution: Implement prompt versioning, automated evaluation, staged deployments, and continuous monitoring.

Key Components:

  • Prompt Version Control (Git, Promptfoo)
  • Evaluation Dataset
  • Automated Eval Pipeline
  • Deployment Orchestrator
  • Monitoring Dashboard
  • Rollback Mechanism

When to Use:

  • Production LLM applications
  • Teams with multiple prompt engineers
  • Regulated industries
  • High-stakes AI applications

Evaluation Metrics:

  • Accuracy (vs golden answers)
  • Latency (p50, p95, p99)
  • Cost per request
  • User satisfaction scores

6. Vector Database Selection Framework

Problem: Many vector database options with different tradeoffs. Wrong choice leads to expensive migrations.

Solution: Structured decision framework evaluating scale, features, operations, and cost.

Selection Matrix:

Scale Recommendation
<1M vectors pgvector (simple), Chroma (prototyping)
1-100M vectors Weaviate, Qdrant (self-hosted)
100M+ vectors Pinecone, Milvus (managed)

Key Considerations:

  • Hybrid search support
  • Metadata filtering
  • Multi-tenancy
  • Backup/restore
  • Managed vs self-hosted

7. AI Center of Excellence Framework

Problem: Scattered AI initiatives across organization lead to duplicated effort, inconsistent quality, and security gaps.

Solution: Establish centralized governance with standardized patterns, reusable components, and shared infrastructure.

Key Components:

  • Pattern Library (this skill!)
  • Governance Framework
  • Shared Infrastructure
  • Training Program
  • Review Board
  • Metrics Dashboard

Governance Areas:

  • Model selection criteria
  • Security standards
  • Cost controls
  • Ethical guidelines
  • Incident response

8. Security & Governance Pattern

Problem: AI introduces new security vectors: prompt injection, data leakage, model manipulation.

Solution: Implement AI-specific security controls including guardrails, PII handling, and audit logging.

Key Controls:

  • Input Guardrails (prompt injection detection)
  • Output Guardrails (content filtering)
  • PII Detection & Redaction
  • Audit Logging
  • Access Control
  • Compliance Reporting

Guardrails Implementation:

from guardrails import Guard

guard = Guard.from_pydantic(output_class=SafeResponse)

response = guard(
    llm.invoke,
    prompt=user_input,
    on_fail="reask"
)

Pattern Selection Decision Tree

START: What type of AI system?
├── Document/Knowledge Q&A
│   └── → RAG Production Pattern
│       ├── Need multiple models? → + AI Gateway
│       └── Sensitive data? → + Security & Governance
├── Autonomous Agents
│   └── → Multi-Agent Orchestration
│       ├── Many tools? → + MCP Servers
│       └── Production deployment? → + LLMOps
├── Enterprise AI Platform
│   └── → AI Gateway + AI CoE Framework
│       ├── Cost concerns? → + Cost Optimization
│       └── Compliance? → + Security & Governance
└── Content Generation
    └── → AI Gateway + LLMOps
        └── Quality critical? → + Evaluation Pipeline

Pattern Combinations Matrix

Use Case Primary Secondary Tertiary
Customer Support Bot RAG AI Gateway Security
Code Assistant Multi-Agent MCP Servers LLMOps
Document Intelligence RAG Vector DB AI Gateway
Enterprise AI Platform AI Gateway AI CoE Security
Research Assistant RAG Multi-Agent LLMOps

Cloud Provider Mapping

AWS

  • AI Gateway: API Gateway + Lambda
  • RAG: Bedrock + OpenSearch
  • Vector DB: OpenSearch, Aurora pgvector

GCP

  • AI Gateway: Cloud Endpoints + Cloud Functions
  • RAG: Vertex AI + Matching Engine
  • Vector DB: Matching Engine, AlloyDB

Azure

  • AI Gateway: API Management + Functions
  • RAG: Azure OpenAI + AI Search
  • Vector DB: AI Search, Cosmos DB

OCI

  • AI Gateway: API Gateway + Functions
  • RAG: OCI GenAI + OpenSearch
  • Vector DB: OpenSearch, PostgreSQL

Resources

Related Skills

  • mcp-architecture - MCP server development
  • claude-sdk - Agent development with Claude
  • langgraph-patterns - Graph-based agent workflows
  • oci-services-expert - Oracle Cloud guidance

Part of the AI Architect Academy by FrankX.AI

Related skills
Installs
First Seen