agent-native-architecture
<why_now>
Why Now
Software agents work reliably now. Claude Code demonstrated that an LLM with access to bash and file tools, operating in a loop until an objective is achieved, can accomplish complex multi-step tasks autonomously.
The surprising discovery: a really good coding agent is actually a really good general-purpose agent. The same architecture that lets Claude Code refactor a codebase can let an agent organize your files, manage your reading list, or automate your workflows. </why_now>
<core_principles>
Core Principles
1. Parity
Whatever the user can do through the UI, the agent should be able to achieve through tools.
This is the foundational principle. Without it, nothing else matters. Ensure the agent has tools (or combinations of tools) that can accomplish anything the UI can do. This isn't about 1:1 mapping of UI buttons to tools -- it's about ensuring the agent can achieve the same outcomes.
| User Action | How Agent Achieves It |
|---|---|
| Create a note | write_file to notes directory, or create_note tool |
| Tag a note as urgent | update_file metadata, or tag_note tool |
| Search notes | search_files or search_notes tool |
| Delete a note | delete_file or delete_note tool |
The test: Pick any action a user can take in your UI. Describe it to the agent. Can it accomplish the outcome?
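This test can also be automated. A minimal sketch in TypeScript, assuming you maintain a map from each UI action to the tool names that can achieve its outcome (the names below mirror the table above and are illustrative):
// Parity check: every UI action has at least one registered tool
// (or tool combination) that can achieve the same outcome.
const registeredTools = new Set([
  "read_file", "write_file", "update_file", "delete_file", "search_files",
]);
const uiOutcomes: Record<string, string[]> = {
  "Create a note": ["write_file", "create_note"],
  "Tag a note as urgent": ["update_file", "tag_note"],
  "Search notes": ["search_files", "search_notes"],
  "Delete a note": ["delete_file", "delete_note"],
};
for (const [action, candidates] of Object.entries(uiOutcomes)) {
  if (!candidates.some((name) => registeredTools.has(name))) {
    throw new Error(`Parity gap: no tool can achieve "${action}"`);
  }
}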
2. Granularity
Prefer atomic primitives. Features are outcomes achieved by an agent operating in a loop.
A tool is a primitive capability: read a file, write a file, run a bash command, store a record, send a notification. A feature is not a function you write. It's an outcome you describe in a prompt, achieved by an agent that has tools and operates in a loop until the outcome is reached.
Less granular (limits the agent):
Tool: classify_and_organize_files(files)
-> You wrote the decision logic
-> To change behavior, you refactor
More granular (empowers the agent):
Tools: read_file, write_file, move_file, list_directory, bash
Prompt: "Organize the user's downloads folder by content and recency."
-> Agent makes the decisions
-> To change behavior, you edit the prompt
The test: To change how a feature behaves, do you edit prose or refactor code?
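Expressed as tool definitions, the contrast looks like this (a sketch using zod; the tool() helper is a local stand-in for whatever SDK you use, not a specific API):
import { z } from "zod";

type ToolDef = { name: string; description: string; input: z.ZodTypeAny };
const tool = (name: string, description: string, input: z.ZodTypeAny): ToolDef =>
  ({ name, description, input });

// Less granular: one tool whose behavior encodes decision logic you wrote.
const classifyAndOrganize = tool(
  "classify_and_organize_files",
  "Classify files and move them into folders using built-in rules",
  z.object({ files: z.array(z.string()) })
);

// More granular: primitives plus a prompt; the agent makes the decisions.
const primitives = [
  tool("read_file", "Read any file", z.object({ path: z.string() })),
  tool("write_file", "Write any file", z.object({ path: z.string(), content: z.string() })),
  tool("move_file", "Move a file", z.object({ from: z.string(), to: z.string() })),
  tool("list_directory", "List a directory", z.object({ path: z.string() })),
];
const organizePrompt = "Organize the user's downloads folder by content and recency.";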
3. Composability
With atomic tools and parity, you can create new features just by writing new prompts.
This is the payoff of the first two principles. When your tools are atomic and the agent can do anything users can do, new features are just new prompts:
"Review files modified this week. Summarize key changes. Based on
incomplete items and approaching deadlines, suggest three priorities
for next week."
The test: Can you add a new feature by writing a new prompt section, without adding new code?
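One way to make this concrete is a prompt registry: shipping a feature means adding an entry, which is then run through the same agent loop with the same primitive tools. A sketch (feature names and prompt text are illustrative):
// Features as named prompts. No new tools, no new code paths.
const features: Record<string, string> = {
  weekly_review:
    "Review files modified this week. Summarize key changes. Based on " +
    "incomplete items and approaching deadlines, suggest three priorities for next week.",
  inbox_triage:
    "Read new items in the inbox directory and file each one where it belongs.",
};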
4. Emergent Capability
The agent can accomplish things you didn't explicitly design for.
When tools are atomic, parity is maintained, and prompts are composable, users will ask the agent for things you never anticipated. And often, the agent can figure it out.
"Cross-reference my meeting notes with my task list and tell me what I've committed to but haven't scheduled."
You didn't build a "commitment tracker" feature. But if the agent can read notes, read tasks, and reason about them -- operating in a loop until it has an answer -- it can accomplish this.
The flywheel:
- Build with atomic tools and parity
- Users ask for things you didn't anticipate
- Agent composes tools to accomplish them (or fails, revealing a gap)
- You observe patterns in what's being requested
- Add domain tools or prompts to make common patterns efficient
- Repeat
The test: Give the agent an open-ended request relevant to your domain. Can it figure out a reasonable approach? If it just says "I don't have a feature for that," your architecture is too constrained.
5. Improvement Over Time
Agent-native applications get better through accumulated context and prompt refinement.
Accumulated context: The agent can maintain state across sessions. A context.md file the agent reads and updates is layer one. More sophisticated approaches involve structured memory and learned preferences.
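A minimal sketch of the layer-one pattern, assuming a context.md in the agent's workspace (path and helper names are illustrative):
import { readFile, appendFile } from "node:fs/promises";

const CONTEXT_PATH = "workspace/context.md";

// Read accumulated context at the start of a session (empty on first run).
async function loadContext(): Promise<string> {
  try {
    return await readFile(CONTEXT_PATH, "utf8");
  } catch {
    return "";
  }
}

// Append something the agent learned; the agent can also edit the file
// directly through its write_file tool.
async function recordLearning(note: string): Promise<void> {
  await appendFile(CONTEXT_PATH, `\n- ${new Date().toISOString()}: ${note}`, "utf8");
}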
Prompt refinement at multiple levels:
- Developer level: You ship updated prompts that change agent behavior for all users
- User level: Users customize prompts for their workflow
- Agent level: The agent modifies its own prompts based on feedback (advanced)
The test: Does the application work better after a month of use than on day one, even without code changes? </core_principles>
- Design architecture - Plan a new agent-native system from scratch
- Files & workspace - Use files as the universal interface, shared workspace patterns
- Tool design - Build primitive tools, dynamic capability discovery, CRUD completeness
- Domain tools - Know when to add domain tools vs stay with primitives
- Execution patterns - Completion signals, partial completion, context limits
- System prompts - Define agent behavior in prompts, judgment criteria
- Context injection - Inject runtime app state into agent prompts
- Action parity - Ensure agents can do everything users can do
- Self-modification - Enable agents to safely evolve themselves
- Product design - Progressive disclosure, latent demand, approval patterns
- Mobile patterns - iOS storage, background execution, checkpoint/resume
- Testing - Test agent-native apps for capability and parity
- Refactoring - Make existing code more agent-native
- Anti-patterns - Common mistakes and how to avoid them
- Success criteria - Verify your architecture is agent-native
Wait for response before proceeding.
After reading the reference, apply those patterns to the user's specific context.
<architecture_checklist>
Architecture Review Checklist
When designing an agent-native system, verify these before implementation:
Core Principles
- Parity: Every UI action has a corresponding agent capability
- Granularity: Tools are primitives; features are prompt-defined outcomes
- Composability: New features can be added via prompts alone
- Emergent Capability: Agent can handle open-ended requests in your domain
Tool Design
- Dynamic vs Static: For external APIs where agent should have full access, use Dynamic Capability Discovery
- CRUD Completeness: Every entity has create, read, update, AND delete
- Primitives not Workflows: Tools enable capability, don't encode business logic
- API as Validator: Use z.string() inputs when the API validates, not z.enum()
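For example, a sketch with zod (field names are illustrative):
import { z } from "zod";

// Over-constrained: the enum duplicates validation the API already does,
// so new labels require a code change.
const tagInputStrict = z.object({
  label: z.enum(["urgent", "later", "someday"]),
});

// Agent-native: accept any string and surface the API's validation error
// back to the agent so it can correct itself.
const tagInput = z.object({
  label: z.string().describe("Tag label; the notes API validates allowed values"),
});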
Files & Workspace
- Shared Workspace: Agent and user work in same data space
- context.md Pattern: Agent reads/updates context file for accumulated knowledge
- File Organization: Entity-scoped directories with consistent naming
- Context Durability: Incremental progress writes (WAL pattern) so interrupted tasks resume from last checkpoint
Agent Execution
- Completion Signals: Agent has explicit complete_task tool (not heuristic detection)
- Partial Completion: Multi-step tasks track progress for resume
- Context Limits: Designed for bounded context from the start
- Validate-Before-Run: Agent previews planned actions before executing destructive operations
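The Completion Signals and Partial Completion items above can be sketched as follows (paths and helper names are illustrative; the tool wiring follows the Quick Start shape below):
import { z } from "zod";
import { appendFile, readFile } from "node:fs/promises";

const PROGRESS_LOG = "workspace/progress.log";

// Explicit completion signal: the agent calls this when the outcome is reached.
const completeTaskInput = z.object({ summary: z.string() });
async function completeTask(input: z.infer<typeof completeTaskInput>) {
  await appendFile(PROGRESS_LOG, `DONE: ${input.summary}\n`);
  return { done: true };
}

// WAL-style progress writes: record each finished step so an interrupted run
// can read the log and resume from the last checkpoint.
async function recordStep(step: string) {
  await appendFile(PROGRESS_LOG, `STEP: ${step}\n`);
}
async function completedSteps(): Promise<string[]> {
  try {
    const log = await readFile(PROGRESS_LOG, "utf8");
    return log.split("\n").filter((line) => line.startsWith("STEP: "));
  } catch {
    return [];
  }
}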
Context Injection
- Available Resources: System prompt includes what exists (files, data, types)
- Available Capabilities: System prompt documents tools with user vocabulary
- Dynamic Context: Context refreshes for long sessions (or provide a refresh_context tool)
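A sketch of assembling the injected section from runtime state (the helper name and workspace layout are illustrative):
import { readdir } from "node:fs/promises";

// Rebuild the injected section at session start, or expose it through a
// refresh_context tool for long sessions.
async function buildContextSection(workspace: string, toolNames: string[]): Promise<string> {
  const files = await readdir(workspace, { recursive: true });
  return [
    "## Available Resources",
    ...files.map((f) => `- ${f}`),
    "",
    "## Available Capabilities",
    ...toolNames.map((t) => `- ${t}`),
  ].join("\n");
}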
UI Integration
- Agent -> UI: Agent changes are reflected in the UI (shared service, file watching, or event bus)
- No Silent Actions: Agent writes trigger UI updates immediately
- Capability Discovery: Users can learn what agent can do
Governance
- Approval Gates: Destructive or irreversible actions require user confirmation
- Audit Trail: Agent actions logged with timestamp, tool, and outcome
- Scope Boundaries: Agent cannot access resources outside its designated workspace
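The Approval Gates and Audit Trail items above can be combined in a thin wrapper around tool execution. A sketch (the confirm callback, tool list, and log path are illustrative):
import { appendFile } from "node:fs/promises";

const DESTRUCTIVE = new Set(["delete_file", "move_file", "bash"]);

// Gate destructive tools behind user confirmation and log every execution.
async function runTool(
  name: string,
  input: unknown,
  execute: (input: unknown) => Promise<unknown>,
  confirm: (message: string) => Promise<boolean>
): Promise<unknown> {
  if (DESTRUCTIVE.has(name)) {
    const approved = await confirm(`Allow ${name} with ${JSON.stringify(input)}?`);
    if (!approved) return { error: "User declined the action" };
  }
  const outcome = await execute(input);
  await appendFile(
    "workspace/audit.log",
    `${new Date().toISOString()} ${name} ${JSON.stringify({ input, outcome })}\n`
  );
  return outcome;
}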
Mobile (if applicable)
- Checkpoint/Resume: Handle iOS app suspension gracefully
- iCloud Storage: iCloud-first with local fallback for multi-device sync
- Cost Awareness: Model tier selection (Haiku/Sonnet/Opus)
When designing architecture, explicitly address each checkbox in your plan. </architecture_checklist>
<quick_start>
Quick Start: Build an Agent-Native Feature
Step 1: Define atomic tools
const tools = [
tool("read_file", "Read any file", { path: z.string() }, ...),
tool("write_file", "Write any file", { path: z.string(), content: z.string() }, ...),
tool("list_files", "List directory", { path: z.string() }, ...),
tool("complete_task", "Signal task completion", { summary: z.string() }, ...),
];
Step 2: Write behavior in the system prompt
## Your Responsibilities
When asked to organize content, you should:
1. Read existing files to understand the structure
2. Analyze what organization makes sense
3. Create/move files using your tools
4. Use your judgment about layout and formatting
5. Call complete_task when you're done
You decide the structure. Make it good.
Step 3: Let the agent work in a loop
const result = await agent.run({
prompt: userMessage,
tools: tools,
systemPrompt: systemPrompt,
// Agent loops until it calls complete_task
});
</quick_start>
<reference_index>
Reference Files
All reference files:
Core Patterns:
- architecture-patterns.md - Event-driven, unified orchestrator, agent-to-UI
- files-universal-interface.md - Why files, organization patterns, context.md
- mcp-tool-design.md - Tool design, dynamic capability discovery, CRUD
- from-primitives-to-domain-tools.md - When to add domain tools, graduating to code
- agent-execution-patterns.md - Completion signals, partial completion, context limits
- system-prompt-design.md - Features as prompts, judgment criteria
Agent-Native Disciplines:
- dynamic-context-injection.md - Runtime context, what to inject
- action-parity-discipline.md - Capability mapping, parity workflow
- shared-workspace-architecture.md - Shared data space, UI integration
- product-implications.md - Progressive disclosure, latent demand, approval
- agent-native-testing.md - Testing outcomes, parity tests
- anti-patterns.md - Common mistakes and what to do instead
- success-criteria.md - Verify your architecture is agent-native
Platform-Specific:
- mobile-patterns.md - iOS storage, checkpoint/resume, cost awareness
- self-modification.md - Git-based evolution, guardrails
- refactoring-to-prompt-native.md - Migrating existing code </reference_index>