claude-agent-sdk-expert
Claude Agent SDK Expert
Why This Skill Exists
AI-assisted agent development introduces characteristic failure modes that traditional code review misses. Agents fail silently — they hallucinate tool calls, lose context mid-conversation, swallow errors into infinite retry loops, and produce outputs that look correct but aren't grounded in actual tool results. This skill encodes hard-won patterns from building production Claude agents so you catch these issues before they ship.
Core Principle
An agent is only as good as its tools and instructions. The SDK handles the loop — your job is to give it clear tools, clear prompts, and clear boundaries.
Process
Work through each step below. Step 0 is always loaded. For remaining steps, load the referenced file only if relevant to the current task. Skip steps that don't apply.
Step 0: Known Gotchas (Always Load)
Before any deep review or build, scan for the top 10 most common agent mistakes. These are fast to check and catch the majority of production incidents.
Read file: references/gotchas.md
Quick Scan (Review Mode)
For code reviews, run the static analysis script first to surface mechanical anti-patterns before doing a manual review. This catches issues like missing maxIterations, tool_choice: "any", silent catch blocks, and missing additionalProperties: false.
bash scripts/scan-agent-patterns.sh <target-directory>
Review the output, then proceed with the manual steps below for deeper analysis.
Step 1: Agentic Loop Architecture
Evaluate the core agent loop structure, stop condition handling, and iteration guards.
Read file: references/agentic-loop.md
Step 2: Tool Design & Scoping
Review tool definitions for clarity, scope, and schema quality. Ensure descriptions serve as the primary selection mechanism.
Read file: references/tool-design.md
Step 3: Prompt Engineering for Agents
Assess system prompts, instruction clarity, and few-shot example usage.
Read file: references/prompt-engineering.md
Step 4: Structured Output & Schema Design
Check extraction patterns, schema strictness, and field design.
Read file: references/structured-output.md
Step 5: Context Management
Evaluate context window usage, session management, and information retention strategies.
Read file: references/context-management.md
Step 6: MCP Integration
Review MCP server configuration, tool namespacing, and integration patterns.
Read file: references/mcp-integration.md
Step 7: Error Handling & Reliability
Assess error propagation, validation loops, retry strategies, and escalation triggers.
Read file: references/error-handling.md
Step 8: Hooks & Lifecycle
Review PreToolCall, PostToolCall, and StopHook implementations for correctness and safety.
Read file: references/hooks-lifecycle.md
Step 9: Multi-Agent Patterns
Evaluate coordinator-subagent architecture, context passing, and handoff patterns.
Read file: references/multi-agent.md
Core Instructions
- Report genuine issues only. Do not fabricate problems. If the code is solid, say so.
- Prioritize by impact. Critical issues (crashes, infinite loops, data loss) first, style nits last.
- Skip irrelevant sections. If the agent doesn't use MCP, skip Step 6. If it's single-agent, skip Step 9.
- Dual mode — review and build.
- Review mode: Audit existing agent code. Output severity-ranked findings with concrete fixes.
- Build mode: Help write new agent code. Follow the process steps as a checklist to ensure nothing is missed.
- Show, don't tell. Every finding or recommendation must include a concrete code example — before/after for reviews, working snippets for builds.
- Ground in SDK reality. Reference actual SDK APIs (
query(),tool_use,stop_reason, hooks, etc.). Do not invent APIs that don't exist. - Maintain the review log. After completing a review or build session, append a one-line JSON entry to
data/review-log.jsonl:
At session start, if{"date":"YYYY-MM-DD","mode":"review|build","project":"project-name","sdk":"ts|py","findings":["finding1","finding2"],"severity_counts":{"critical":0,"high":0,"medium":0,"low":0}}data/review-log.jsonlexists, read it and note recurring patterns across past sessions to inform the current review.
Output Format
Review Mode
Rank findings by severity:
## [CRITICAL] Issue title
**What**: Description of the problem
**Why it matters**: Impact (crashes, data loss, infinite loops, etc.)
**Where**: File and line reference
**Fix**:
// Before (BAD)
<problematic code>
// After (GOOD)
<fixed code>
## [HIGH] Issue title
...
## [MEDIUM] Issue title
...
## [LOW] Issue title
...
Build Mode
Structure output as:
- Architecture Decision — Which pattern to use and why
- Implementation — Working code following all best practices
- Checklist — Verification points from the relevant process steps
References
| # | Topic | File |
|---|---|---|
| 0 | Gotchas (Always Load) | references/gotchas.md |
| 1 | Agentic Loop Architecture | references/agentic-loop.md |
| 2 | Tool Design & Scoping | references/tool-design.md |
| 3 | Prompt Engineering | references/prompt-engineering.md |
| 4 | Structured Output | references/structured-output.md |
| 5 | Context Management | references/context-management.md |
| 6 | MCP Integration | references/mcp-integration.md |
| 7 | Error Handling | references/error-handling.md |
| 8 | Hooks & Lifecycle | references/hooks-lifecycle.md |
| 9 | Multi-Agent Patterns | references/multi-agent.md |