agent-first-repo
Agent-First Repository Design
Patterns for structuring a repository so AI coding agents can do effective, autonomous work. Based on OpenAI's harness engineering approach: https://openai.com/index/harness-engineering/
The core insight: if the agent can't see it in the repo, it doesn't exist. Knowledge in Slack, Google Docs, or people's heads is invisible to agents. The repository must be the single source of truth.
The Knowledge Hierarchy
A well-structured agent-first repo follows a layered documentation architecture:
AGENTS.md ~100 lines — table of contents, dev commands
ARCHITECTURE.md codemap with boundaries and invariants
docs/
├── design-docs/
│ ├── index.md catalogue of all design docs with status
│ ├── core-beliefs.md agent-first operating principles
│ └── [feature-name].md individual design documents
├── exec-plans/
│ ├── active/ in-progress execution plans
│ ├── completed/ finished plans (kept for context)
│ └── tech-debt-tracker.md known debt, prioritized
├── product-specs/
│ ├── index.md catalogue of product specs
│ └── [feature-name].md individual specs with acceptance criteria
├── references/
│ ├── [library]-llms.txt LLM-friendly reference docs for key deps
│ └── [topic].md reference material agents may need
├── generated/
│ └── db-schema.md auto-generated from source of truth
├── DESIGN.md design system and patterns
├── FRONTEND.md frontend conventions
├── PLANS.md current priorities and roadmap
├── QUALITY_SCORE.md quality grades per domain/layer
├── RELIABILITY.md uptime, error budgets, SLOs
└── SECURITY.md security model and boundaries
Not every project needs all of this. Start with AGENTS.md + ARCHITECTURE.md and grow
the docs/ tree as the project demands it.
The Three Pillars
1. Agent Legibility
Optimize the codebase for the agent's ability to reason about it, not human aesthetics.
Prefer boring, composable technology. Technologies with stable APIs, good documentation, and broad representation in training data are easier for agents to model. "Boring" is a feature, not a limitation.
Inline over opaque. Sometimes reimplementing a small subset of functionality is better than depending on an opaque library the agent can't reason about. If the agent can read, test, and modify the code directly, it has more leverage than if it's calling into a black box.
Structured over unstructured. Typed boundaries, structured logging, schema-validated configs. Everything the agent interacts with should be queryable and parseable.
Parse data at system boundaries into precise types — don't let raw/untyped data flow
deep into business logic. For the full treatment, see the parse-dont-validate skill.
Everything in the repo. Design decisions, architectural rationale, product context, quality assessments — if it matters, it's a versioned markdown file checked into the repo. The moment a Slack thread resolves an architectural question, the conclusion goes into a design doc.
2. Progressive Disclosure
Agents should start with minimal context and drill deeper as needed. Don't dump everything into one file or into the initial prompt.
See references/progressive-disclosure.md for the full pattern, including directory structure, indexing, and cross-linking strategies.
3. Mechanical Enforcement
Encode architectural rules as linters and tests, not prose. Prose gets ignored; CI failures don't. When documentation falls short, promote the rule into code.
See references/mechanical-enforcement.md for patterns including custom linters with remediation messages, structural dependency tests, and layer architecture enforcement.
Entropy Management
Agent-generated code drifts. Agents replicate existing patterns — including bad ones. Without active maintenance, the codebase accumulates inconsistency and technical debt faster than a human-only codebase would.
See references/entropy-management.md for the golden principles pattern, recurring cleanup cadence, and quality scoring approach.
Companion Skills
This skill focuses on repository structure and documentation architecture. For specific topics, load these companion skills:
| Topic | Skill | What It Covers |
|---|---|---|
| Writing AGENTS.md | agents-md |
Structure, sections, anti-patterns for the entry point file |
| Architecture docs | architecture-md |
Codemap with boundaries, invariants, cross-cutting concerns |
| Type-driven boundaries | parse-dont-validate |
Parsing at boundaries, making illegal states unrepresentable |
Quick Reference
| Need | Where to Look |
|---|---|
| Agent entry point file | agents-md skill |
| Architecture codemap | architecture-md skill |
| Layer docs for agents to drill into | progressive-disclosure.md |
| Enforcing rules via linters/CI | mechanical-enforcement.md |
| Preventing codebase drift | entropy-management.md |
| Typing boundaries | parse-dont-validate skill |
Workflow: Setting Up an Agent-First Repo
For a new project:
- Create
AGENTS.md— dev commands, repo structure, boundaries (~100 lines) - Create
ARCHITECTURE.md— bird's eye view, codemap, cross-cutting concerns - Set up
docs/with at minimumdesign-docs/index.mdandcore-beliefs.md - Add mechanical enforcement — linter for dependency directions, structural tests
- Establish quality scoring — grade each domain/layer, track in
QUALITY_SCORE.md - Set up entropy management — recurring cleanup cadence, golden principles doc
For an existing project:
- Write
AGENTS.mdstarting from what's in CI config and CONTRIBUTING.md - Write
ARCHITECTURE.mdby exploring the codebase (use thearchitecture-mdskill) - Identify the top 3 architectural rules agents violate — encode as lints or tests
- Move critical design context from Slack/docs/wikis into versioned repo files
- Start a
QUALITY_SCORE.mdto track known gaps per module
Anti-Patterns
- Knowledge lives outside the repo — Slack threads, Google Docs, wikis, Notion pages. If the agent can't
catit, it doesn't exist. - One giant AGENTS.md — Monolithic instruction files crowd out actual task context. Use progressive disclosure.
- Rules as prose only — "Don't import from the API layer in workers" is ignored. A lint that fails CI is not.
- No quality tracking — Without explicit grades, drift is invisible until it's painful.
- Manual cleanup Fridays — Doesn't scale. Encode golden principles and automate scanning.
- Opaque dependencies — Libraries the agent can't read, test, or modify reduce leverage. Prefer transparent, in-repo code for critical paths.