AI Architecture Patterns Skill

Purpose

Provide expert guidance on selecting and implementing enterprise AI architecture patterns for production systems. This skill contains battle-tested patterns from real-world deployments and the AI Architect Academy.

When to Use

Designing new AI systems
Evaluating architecture options
Selecting patterns for specific use cases
Understanding tradeoffs between approaches
Getting implementation guidance

Core Patterns Library

1. AI Gateway Pattern

Problem: Multiple AI services with inconsistent interfaces, no centralized security, and limited observability create management complexity and security risks.

Solution: Deploy a centralized AI gateway that provides unified authentication, rate limiting, request/response logging, and model routing for all AI services.

Key Components:

API Gateway (Kong, AWS API Gateway, OCI API Gateway)
Authentication Service (OAuth2, API Keys)
Rate Limiter (Redis-based)
Request Logger (OpenTelemetry)
Model Router

When to Use:

Multiple AI providers in your stack
Need centralized security controls
Want unified logging and monitoring
Cost allocation across teams

When NOT to Use:

Single AI provider with simple use case
Ultra-low latency requirements (<10ms)
Early prototyping

2. RAG Production Pattern

Problem: LLMs hallucinate and lack access to enterprise-specific knowledge, making them unreliable for business-critical applications.

Solution: Implement a RAG pipeline with document ingestion, chunking, embedding, vector storage, retrieval, and augmented generation with source citations.

Key Components:

Document Ingestion Pipeline
Text Chunking Service (semantic, fixed, hybrid)
Embedding Model (OpenAI, Cohere, Local)
Vector Database (Pinecone, Weaviate, pgvector)
Retrieval Service with Reranking
LLM with RAG prompt template

When to Use:

Customer support knowledge base
Internal document Q&A
Legal/compliance document analysis
Technical documentation assistants

When NOT to Use:

General creative writing
Real-time frequently changing data
Very small document corpus (<100 docs)

Implementation Tips:

Start with fixed-size chunks (512-1024 tokens)
Add metadata extraction for filtering
Implement hybrid search (keyword + semantic)
Use reranking for improved precision

3. Multi-Agent Orchestration Pattern

Problem: Complex tasks require multiple specialized capabilities that exceed what a single LLM prompt can handle reliably.

Solution: Decompose complex workflows into specialized agents with an orchestrator that coordinates task distribution, handoffs, and result aggregation.

Key Components:

Orchestrator Agent (workflow coordinator)
Specialized Worker Agents (domain experts)
Task Queue (for async processing)
State Management (context preservation)
Handoff Protocol (agent-to-agent communication)
Result Aggregator

When to Use:

Complex workflows with 5+ distinct steps
Tasks requiring different expertise
Autonomous systems
Workflows with branching logic

When NOT to Use:

Simple single-step tasks
When cost is primary constraint
High-volume, low-complexity operations

Frameworks:

LangGraph (graph-based orchestration)
Claude Agent SDK
AutoGen / CrewAI

4. MCP Server Architecture

Problem: N agents x M tools = N*M integrations. Each AI agent needs custom code to integrate with each tool.

Solution: Implement MCP (Model Context Protocol) servers that provide standardized interfaces for tools, resources, and prompts.

Key Components:

MCP Server (Node.js or Python)
Tool Definitions (JSON Schema)
Resource Providers
Prompt Templates
Transport Layer (stdio, SSE)

When to Use:

Building tools for multiple AI agents
Creating reusable integrations
Claude Code environments
Enterprise tool standardization

Implementation:

import { Server } from '@modelcontextprotocol/sdk/server';

const server = new Server({
  name: 'my-mcp-server',
  version: '1.0.0'
});

server.tool('search', {
  description: 'Search documents',
  inputSchema: {
    type: 'object',
    properties: {
      query: { type: 'string' }
    }
  },
  handler: async ({ query }) => {
    // Implementation
  }
});

5. LLMOps Pipeline Pattern

Problem: LLM applications lack mature DevOps practices, leading to unpredictable quality and difficult rollbacks.

Solution: Implement prompt versioning, automated evaluation, staged deployments, and continuous monitoring.

Key Components:

Prompt Version Control (Git, Promptfoo)
Evaluation Dataset
Automated Eval Pipeline
Deployment Orchestrator
Monitoring Dashboard
Rollback Mechanism

When to Use:

Production LLM applications
Teams with multiple prompt engineers
Regulated industries
High-stakes AI applications

Evaluation Metrics:

Accuracy (vs golden answers)
Latency (p50, p95, p99)
Cost per request
User satisfaction scores

6. Vector Database Selection Framework

Problem: Many vector database options with different tradeoffs. Wrong choice leads to expensive migrations.

Solution: Structured decision framework evaluating scale, features, operations, and cost.

Selection Matrix:

Scale	Recommendation
<1M vectors	pgvector (simple), Chroma (prototyping)
1-100M vectors	Weaviate, Qdrant (self-hosted)
100M+ vectors	Pinecone, Milvus (managed)

Key Considerations:

Hybrid search support
Metadata filtering
Multi-tenancy
Backup/restore
Managed vs self-hosted

7. AI Center of Excellence Framework

Problem: Scattered AI initiatives across organization lead to duplicated effort, inconsistent quality, and security gaps.

Solution: Establish centralized governance with standardized patterns, reusable components, and shared infrastructure.

Key Components:

Pattern Library (this skill!)
Governance Framework
Shared Infrastructure
Training Program
Review Board
Metrics Dashboard

Governance Areas:

Model selection criteria
Security standards
Cost controls
Ethical guidelines
Incident response

8. Security & Governance Pattern

Problem: AI introduces new security vectors: prompt injection, data leakage, model manipulation.

Solution: Implement AI-specific security controls including guardrails, PII handling, and audit logging.

Key Controls:

Input Guardrails (prompt injection detection)
Output Guardrails (content filtering)
PII Detection & Redaction
Audit Logging
Access Control
Compliance Reporting

Guardrails Implementation:

from guardrails import Guard

guard = Guard.from_pydantic(output_class=SafeResponse)

response = guard(
    llm.invoke,
    prompt=user_input,
    on_fail="reask"
)

Pattern Selection Decision Tree

START: What type of AI system?
│
├── Document/Knowledge Q&A
│   └── → RAG Production Pattern
│       ├── Need multiple models? → + AI Gateway
│       └── Sensitive data? → + Security & Governance
│
├── Autonomous Agents
│   └── → Multi-Agent Orchestration
│       ├── Many tools? → + MCP Servers
│       └── Production deployment? → + LLMOps
│
├── Enterprise AI Platform
│   └── → AI Gateway + AI CoE Framework
│       ├── Cost concerns? → + Cost Optimization
│       └── Compliance? → + Security & Governance
│
└── Content Generation
    └── → AI Gateway + LLMOps
        └── Quality critical? → + Evaluation Pipeline

Pattern Combinations Matrix

Use Case	Primary	Secondary	Tertiary
Customer Support Bot	RAG	AI Gateway	Security
Code Assistant	Multi-Agent	MCP Servers	LLMOps
Document Intelligence	RAG	Vector DB	AI Gateway
Enterprise AI Platform	AI Gateway	AI CoE	Security
Research Assistant	RAG	Multi-Agent	LLMOps

Cloud Provider Mapping

AWS

AI Gateway: API Gateway + Lambda
RAG: Bedrock + OpenSearch
Vector DB: OpenSearch, Aurora pgvector

GCP

AI Gateway: Cloud Endpoints + Cloud Functions
RAG: Vertex AI + Matching Engine
Vector DB: Matching Engine, AlloyDB

Azure

AI Gateway: API Management + Functions
RAG: Azure OpenAI + AI Search
Vector DB: AI Search, Cosmos DB

OCI

AI Gateway: API Gateway + Functions
RAG: OCI GenAI + OpenSearch
Vector DB: OpenSearch, PostgreSQL

Resources

GitHub: https://github.com/frankxai/ai-architect-academy
Patterns Library: /01-design-patterns
Learning Paths: /02-learning-paths
Templates: /AI CoE Templates

Related Skills

mcp-architecture - MCP server development
claude-sdk - Agent development with Claude
langgraph-patterns - Graph-based agent workflows
oci-services-expert - Oracle Cloud guidance

Part of the AI Architect Academy by FrankX.AI

AI Architecture Patterns

AI Architecture Patterns Skill

Purpose

When to Use

Core Patterns Library

1. AI Gateway Pattern

2. RAG Production Pattern

3. Multi-Agent Orchestration Pattern

4. MCP Server Architecture

5. LLMOps Pipeline Pattern

6. Vector Database Selection Framework

7. AI Center of Excellence Framework

8. Security & Governance Pattern

Pattern Selection Decision Tree

Pattern Combinations Matrix

Cloud Provider Mapping

AWS

GCP

Azure

OCI

Resources

Related Skills

More from frankxai/ai-and-web3

cacos

test-driven development

spartan warrior

framer expert

skill creator

golden path