AI-Native System Development & Deployment Skill

This skill provides a structured workflow for guiding users through the complete development lifecycle of AI-Native systems—from architecture design through Kubernetes deployment—via interactive Q&A. It ensures consistent decision-making at every stage.

When to Trigger This Skill

Trigger conditions:

User mentions AI-Native development: "build an AI app", "create AI-powered system", "develop with AI agents"
User mentions agent design: "design AI agents", "plan agent architecture", "agent workflows"
User mentions AI system architecture: "AI microservices", "LLM integration", "agentic system"
User wants end-to-end planning: "plan my AI app from design to deployment"
User asks about AI patterns: "how should I structure my AI app", "agent communication patterns"

Design Principles (Apply Consistently)

Principle	Decision	Rationale
Agent Autonomy	Single-responsibility agents	Each agent has one clear purpose; easier to test and scale
LLM Abstraction	Provider-agnostic interface	Swap LLM providers without code changes
Communication	Async-first, sync when needed	AI operations are inherently slow; async prevents blocking
State Management	Stateless agents, external state	Agents don't hold state; enables horizontal scaling
Error Handling	Graceful degradation	AI failures shouldn't crash the system
Observability	Trace every LLM call	Debug AI behavior with full request/response logging
Security	Secrets in vault, never in code	API keys for LLMs are high-value targets
K8s Services	ClusterIP for internal traffic	No external exposure for service-to-service
K8s RBAC	Namespace-scoped, least privilege	Minimize blast-radius

Workflow Overview (8 Stages)

┌─────────────────────────────────────────────────────────────────────────┐
│                    AI-NATIVE SYSTEM DEVELOPMENT                         │
├─────────────────────────────────────────────────────────────────────────┤
│  PHASE 1: SYSTEM DESIGN                                                 │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐                  │
│  │ Stage 1:     │  │ Stage 2:     │  │ Stage 3:     │                  │
│  │ System       │─►│ AI Agent     │─►│ API &        │                  │
│  │ Discovery    │  │ Design       │  │ Endpoints    │                  │
│  └──────────────┘  └──────────────┘  └──────────────┘                  │
├─────────────────────────────────────────────────────────────────────────┤
│  PHASE 2: TECHNOLOGY & INTEGRATION                                      │
│  ┌──────────────┐  ┌──────────────┐                                    │
│  │ Stage 4:     │  │ Stage 5:     │                                    │
│  │ Tech Stack   │─►│ Integration  │                                    │
│  │ Selection    │  │ Patterns     │                                    │
│  └──────────────┘  └──────────────┘                                    │
├─────────────────────────────────────────────────────────────────────────┤
│  PHASE 3: KUBERNETES DEPLOYMENT                                         │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐                  │
│  │ Stage 6:     │  │ Stage 7:     │  │ Stage 8:     │                  │
│  │ Manifest     │─►│ Config &     │─►│ Networking   │                  │
│  │ Planning     │  │ Secrets      │  │ & RBAC       │                  │
│  └──────────────┘  └──────────────┘  └──────────────┘                  │
└─────────────────────────────────────────────────────────────────────────┘

PHASE 1: SYSTEM DESIGN

Stage 1: System Discovery

Goal

Understand the overall system purpose, components, and user interactions.

Questions to Ask

System Purpose
- What problem does this AI-Native system solve?
- Who are the primary users?
Component Identification
- How many applications/services will the system have?
- Which components are AI-powered vs traditional?
User Interaction Model
- How do users interact with the system? (Web UI, API, Chat, CLI)
- Is there real-time interaction (chat) or batch processing?
External Dependencies
- Which external services are required? (LLM providers, databases, third-party APIs)
- Are there existing systems to integrate with?
Data Flow
- What data flows into the system?
- What outputs does the system produce?

Output Table: System Overview

| Aspect | Description |
|--------|-------------|
| System Name | |
| Purpose | |
| Primary Users | |
| Interaction Model | |
| Component Count | |
| AI-Powered Components | |
| External Dependencies | |

Output Table: Component Registry

| Component | Type | AI-Powered | Description |
|-----------|------|------------|-------------|
| | Frontend / Backend / Agent / Service | Yes/No | |

Stage 2: AI Agent Design

Goal

Design AI agents with clear responsibilities, capabilities, and interaction patterns.

Agent Classification Framework

Agent Type	Characteristics	Use Case
Conversational Agent	Handles natural language dialogue	Chat interfaces, Q&A systems
Task Agent	Executes specific tasks autonomously	Automation, workflow execution
Orchestrator Agent	Coordinates multiple agents	Complex multi-step processes
Retrieval Agent	Fetches and synthesizes information	RAG systems, knowledge bases
Tool-Using Agent	Calls external tools/APIs	Function calling, integrations

Questions to Ask

For each AI agent identified:

Agent Identity
- What is the agent's name and primary responsibility?
- What type of agent is it? (Conversational, Task, Orchestrator, Retrieval, Tool-Using)
LLM Requirements
- Which LLM provider will it use? (OpenAI, Anthropic, Azure OpenAI, self-hosted)
- What model capabilities are needed? (chat, function calling, vision, embeddings)
- What are the latency requirements? (real-time < 2s, near-real-time < 10s, batch)
Agent Capabilities
- What actions can this agent perform?
- What tools/functions does it have access to?
- What are its input/output formats?
Agent Boundaries
- What should this agent NOT do?
- What are its failure modes?
- How should it handle uncertainty?
Agent Communication
- Does it communicate with other agents?
- Does it call backend services?
- Does it interact directly with users?

Output Table: Agent Registry

| Agent Name | Type | LLM Provider | Model | Latency Req | Description |
|------------|------|--------------|-------|-------------|-------------|

Output Table: Agent Capabilities

| Agent Name | Capabilities (Actions) | Tools/Functions | Input Format | Output Format |
|------------|----------------------|-----------------|--------------|---------------|

Output Table: Agent Communication Matrix

| Agent | Communicates With | Communication Type | Purpose |
|-------|-------------------|-------------------|---------|
| | User / Agent / Service | Sync / Async / Stream | |

Agent Design Principles (Apply Consistently)

Principle	Implementation
Single Responsibility	Each agent has ONE primary purpose
Stateless Design	No in-memory state; use external storage
Graceful Degradation	Return meaningful errors, never crash
Timeout Handling	All LLM calls have timeouts (default: 30s)
Retry Logic	Exponential backoff for transient failures
Token Budgeting	Set max_tokens to control costs
Prompt Versioning	Version control all system prompts

Stage 3: API & Endpoint Design

Goal

Define all APIs and endpoints for both traditional services and AI agents.

API Design Framework

Component Type	Endpoint Pattern	Example
Frontend	Static + Proxy	`/`, `/assets/`, `/api/` (proxy)
Backend REST	Resource-based	`/api/v1/tasks`, `/api/v1/tasks/{id}`
AI Agent (Sync)	Action-based	`/agent/chat`, `/agent/analyze`
AI Agent (Stream)	SSE endpoint	`/agent/chat/stream`
Notifications	WebSocket/SSE	`/notifications/subscribe`
Health Checks	Standard paths	`/health`, `/ready`, `/live`

Questions to Ask

For each component:

Endpoint Inventory
- What endpoints does this component expose?
- What HTTP methods are used?
- What is each endpoint's purpose?
Request/Response
- What are the request payload structures?
- What are the response formats?
- Are there streaming responses?
Authentication
- Which endpoints require authentication?
- What auth mechanism? (JWT, API Key, OAuth)
Rate Limiting
- Should any endpoints be rate-limited?
- What are the limits? (requests/minute, tokens/day)

Output Table: Endpoint Registry

| Component | Endpoint | Method | Description | Auth Required |
|-----------|----------|--------|-------------|---------------|

Output Table: AI Agent Endpoints (Detailed)

| Agent | Endpoint | Method | Input | Output | Streaming | Timeout |
|-------|----------|--------|-------|--------|-----------|---------|

PHASE 2: TECHNOLOGY & INTEGRATION

Stage 4: Technology Stack Selection

Goal

Select appropriate technologies with consistent decision-making rationale.

Technology Decision Framework

Layer	Options	Decision Factors
Frontend	React, Vue, Next.js, Static	Interactivity needs, SEO, team expertise
Backend	Node.js, Python, Go, Java	Performance, ecosystem, team expertise
AI/Agent Framework	LangChain, LlamaIndex, Custom, Semantic Kernel	Complexity, flexibility, vendor lock-in
LLM Provider	OpenAI, Anthropic, Azure OpenAI, Local	Cost, latency, compliance, capabilities
Message Queue	None, Redis, RabbitMQ, Kafka	Scale, ordering needs, persistence
Database	None, PostgreSQL, MongoDB, Redis	Data model, query patterns, scale
Cache	None, Redis, Memcached	Read patterns, invalidation needs

Questions to Ask

Language & Runtime
- What programming language(s) for backend services?
- What framework for AI agents?
- Frontend technology?
AI Infrastructure
- LLM provider selection?
- Need for embeddings/vector database?
- Agent orchestration framework?
Data Layer
- Is a database needed? Which type?
- Caching requirements?
- Message queue for async processing?
Constraints
- Team expertise?
- Compliance requirements?
- Budget constraints?

Output Table: Technology Stack

| Layer | Technology | Justification |
|-------|------------|---------------|
| Frontend | | |
| Backend Services | | |
| AI Agent Framework | | |
| LLM Provider | | |
| Database | | |
| Cache | | |
| Message Queue | | |

Stage 5: Integration Patterns

Goal

Define how components communicate and integrate.

Communication Pattern Framework

Pattern	When to Use	Implementation
Sync REST	Simple request-response	HTTP client with timeout
Async Queue	Fire-and-forget, long processing	Message queue (Redis, RabbitMQ)
Streaming (SSE)	Real-time AI responses	Server-Sent Events
WebSocket	Bidirectional real-time	WS connection with heartbeat
Event-Driven	Loose coupling, multiple consumers	Pub/Sub pattern

AI-Specific Integration Patterns

Pattern	Description	Use Case
Request-Response	Sync call to LLM, wait for response	Simple chat, single-turn
Streaming Response	Token-by-token streaming	Chat UX, long responses
Tool Calling	LLM calls functions, agent executes	Actions, integrations
Agent Chaining	Output of one agent feeds another	Complex workflows
Human-in-the-Loop	Agent requests human approval	High-stakes decisions

Questions to Ask

Service Communication
- How do services communicate? (REST, gRPC, Queue)
- Sync or async?
- Error handling strategy?
AI Integration
- How does UI communicate with AI agents? (REST, Stream, WebSocket)
- How do agents communicate with backend?
- How do agents call external tools?
Event Handling
- Are there events that trigger actions?
- How are notifications delivered?

Output Table: Integration Matrix

| From | To | Pattern | Protocol | Purpose |
|------|-----|---------|----------|---------|

Output Table: Communication Flow

| Flow Name | Steps | Pattern | Description |
|-----------|-------|---------|-------------|
| User Chat | UI → Agent → LLM → Agent → UI | Stream | User sends message, gets streamed response |

PHASE 3: KUBERNETES DEPLOYMENT

Stage 6: Manifest Planning

Goal

Determine required Kubernetes resources based on system design.

Manifest Selection Framework

Component Characteristic	Manifest Type	Rationale
Stateless service	Deployment	Rolling updates, scaling
Stateful service (DB)	StatefulSet	Stable identity, ordered
Background processor	Deployment + HPA	Scale based on queue depth
Scheduled job	CronJob	Periodic execution
Per-node requirement	DaemonSet	Logging, monitoring agents

Questions to Ask

Workload Types
- Confirm workload type for each component
- Scaling requirements? (replicas, HPA)
Namespace Strategy
- Single namespace or multiple?
- Environment separation? (dev, staging, prod)
Resource Requirements
- CPU/Memory requests and limits?
- GPU requirements for local LLMs?

Output Table: Manifest Inventory

| Category | Manifest Type | Count | Names |
|----------|--------------|-------|-------|
| Workloads | Deployment | | |
| Networking | Service | | |
| Networking | Ingress | | |
| Networking | NetworkPolicy | | |
| Config | ConfigMap | | |
| Config | Secret | | |
| RBAC | ServiceAccount | | |
| RBAC | Role | | |
| RBAC | RoleBinding | | |

Output Table: Deployment Details

| Deployment Name | Replicas | Purpose |
|-----------------|----------|---------|

Stage 7: Configuration & Secrets

Goal

Design ConfigMaps and Secrets with security best practices.

Configuration Classification

Data Type	Resource	Security Level
Service URLs	ConfigMap	Public
Feature flags	ConfigMap	Public
Log levels	ConfigMap	Public
LLM API keys	Secret	Critical
Database credentials	Secret	Critical
JWT signing keys	Secret	Critical
TLS certificates	Secret (TLS type)	High

AI-Native Specific Configuration

Config Item	Resource	Description
`LLM_API_KEY`	Secret	LLM provider API key
`LLM_ORG_ID`	Secret	LLM organization ID
`LLM_MODEL_NAME`	ConfigMap	Model identifier (not sensitive)
`LLM_TIMEOUT_MS`	ConfigMap	Request timeout
`LLM_MAX_TOKENS`	ConfigMap	Token limit per request
`LLM_TEMPERATURE`	ConfigMap	Model temperature setting
`AGENT_SYSTEM_PROMPT`	ConfigMap	Agent's system prompt (version controlled)

Questions to Ask

Per-Component Config
- What non-sensitive config does each component need?
- What environment variables?
Secrets Inventory
- What secrets are required?
- LLM API keys? Database credentials? JWT keys?
Security Approach
- Encryption at rest for secrets?
- External secret manager integration? (Vault, AWS Secrets Manager)

Output Table: ConfigMaps

| ConfigMap Name | Environment Variables | Justification |
|----------------|----------------------|---------------|

Output Table: Secrets

| Secret Name | Data Type | Keys | Security Approach | Justification |
|-------------|-----------|------|-------------------|---------------|

Stage 8: Networking & RBAC

Goal

Design secure networking and minimal-privilege access control.

Service Type Decision (Consistency Rule)

Scenario	Service Type	Rationale
Internal communication	ClusterIP	Default; no external exposure
External access	ClusterIP + Ingress	Centralized TLS, routing
Development/testing	NodePort	Simple but not for production

Consistency Rule: Default to ClusterIP for ALL services. Use single Ingress for external access.

Network Policy Strategy

Policy Type	Purpose
Default Deny	Block all traffic by default
Allow Ingress	Whitelist specific ingress sources
Allow Egress	Whitelist specific egress destinations
LLM Egress	Allow agents to reach external LLM APIs

RBAC Blast-Radius Principle

Scope	When to Use
Namespace (Role)	Default; covers 99% of cases
Cluster (ClusterRole)	Only for cluster-wide resources

Questions to Ask

Service Exposure
- Which components need external access?
- TLS requirements?
Network Policies
- Should traffic between services be restricted?
- External egress requirements? (LLM APIs)
RBAC Needs
- Do components need K8s API access?
- ConfigMap/Secret read access?

Output Table: Services

| Service Name | Type | Justification |
|--------------|------|---------------|

Output Table: Ingress Routes

| Host | Path | Backend Service | Description |
|------|------|-----------------|-------------|

Output Table: Network Policies

| Policy Name | Pod Selector | Ingress From | Egress To | Justification |
|-------------|--------------|--------------|-----------|---------------|

Output Table: RBAC Artifacts

| Type | Name | Scope | Justification |
|------|------|-------|---------------|

Final Output: Development Plan Document

After completing all 8 stages, generate a comprehensive document with:

Document Structure

# [System Name] - Development & Deployment Plan

## 1. Design Principles
[Consistency guidelines table]

## 2. System Architecture
### 2.1 System Overview
### 2.2 Component Registry

## 3. AI Agent Design
### 3.1 Agent Registry
### 3.2 Agent Capabilities
### 3.3 Agent Communication Matrix

## 4. API & Endpoints
### 4.1 Endpoint Registry
### 4.2 AI Agent Endpoints

## 5. Technology Stack
[Technology decisions with justifications]

## 6. Integration Patterns
### 6.1 Communication Matrix
### 6.2 Data Flow Diagrams

## 7. Kubernetes Deployment
### 7.1 Manifest Inventory
### 7.2 Deployments
### 7.3 ConfigMaps
### 7.4 Secrets
### 7.5 Services
### 7.6 Ingress
### 7.7 Network Policies
### 7.8 RBAC

## 8. Architecture Diagram
[ASCII/Text diagram]

Consistency Checklist

Before finalizing, verify these consistency rules are applied:

Category	Check
Agents	Each agent has single responsibility?
Agents	All agents are stateless?
Agents	LLM calls have timeouts?
APIs	Health endpoints on all components?
APIs	Consistent naming convention?
Services	All internal services use ClusterIP?
Services	Single Ingress for external access?
Secrets	No secrets in ConfigMaps?
Secrets	LLM API keys in Secrets?
RBAC	Namespace-scoped (not cluster)?
RBAC	Each component has own ServiceAccount?
Network	Default deny with explicit allows?
Network	LLM egress explicitly allowed for agents?

Example Interaction

User: I want to build an AI-powered task manager with a chat interface.

Assistant: I'll help you design and plan your AI-Native task manager. Let's go through the 8 stages systematically.

Stage 1: System Discovery

Let me understand your system:

What's the primary purpose? (e.g., personal productivity, team collaboration)
How will users interact? (Web app, mobile, CLI)
What AI capabilities do you need? (Natural language task creation, prioritization, summaries)
Any external integrations? (Calendar, email, Slack)

[User responds, continue through all 8 stages...]

Related Skills

This skill extends and incorporates:

k8s-planning - Kubernetes deployment planning (Stages 6-8)

ai-native-dev

AI-Native System Development & Deployment Skill

When to Trigger This Skill

Design Principles (Apply Consistently)

Workflow Overview (8 Stages)

PHASE 1: SYSTEM DESIGN

Stage 1: System Discovery

Goal

Questions to Ask

Output Table: System Overview

Output Table: Component Registry

Stage 2: AI Agent Design

Goal

Agent Classification Framework

Questions to Ask

Output Table: Agent Registry

Output Table: Agent Capabilities

Output Table: Agent Communication Matrix

Agent Design Principles (Apply Consistently)

Stage 3: API & Endpoint Design

Goal

API Design Framework

Questions to Ask

Output Table: Endpoint Registry

Output Table: AI Agent Endpoints (Detailed)

PHASE 2: TECHNOLOGY & INTEGRATION

Stage 4: Technology Stack Selection

Goal

Technology Decision Framework

Questions to Ask

Output Table: Technology Stack

Stage 5: Integration Patterns

Goal

Communication Pattern Framework

AI-Specific Integration Patterns

Questions to Ask

Output Table: Integration Matrix

Output Table: Communication Flow

PHASE 3: KUBERNETES DEPLOYMENT

Stage 6: Manifest Planning

Goal

Manifest Selection Framework

Questions to Ask

Output Table: Manifest Inventory

Output Table: Deployment Details

Stage 7: Configuration & Secrets

Goal

Configuration Classification

AI-Native Specific Configuration

Questions to Ask

Output Table: ConfigMaps

Output Table: Secrets

Stage 8: Networking & RBAC

Goal

Service Type Decision (Consistency Rule)

Network Policy Strategy

RBAC Blast-Radius Principle

Questions to Ask

Output Table: Services

Output Table: Ingress Routes

Output Table: Network Policies

Output Table: RBAC Artifacts

Final Output: Development Plan Document

Document Structure

Consistency Checklist

Example Interaction

Related Skills

More from alijilani-dev/claude

github-assistant

sqlmodel-orm-dbhelper

fastapi_pytest_tddhelper

fastapi-helper

docker-rocker

pytest-python