docs-agent-audit

Audit a docs site the way an AI agent actually consumes it, narrate the process so the user learns how agents work, and produce a ranked list of PR-sized improvements.

Core principles

  • Teach while working. The user wants to understand how agents consume docs. After each logical step (not each tool call), give a one-line "here's what I did and why" so they can follow the reasoning without drowning in raw output.
  • Agent-consumer POV, not stylistic review. Findings must be things that make an agent more or less likely to succeed at a real task. Grammar nits, tone, visual polish — out of scope.
  • PR-sized. Every finding should map to a change one person could ship in <1 day. Rank by (impact for agents) × (ease to ship).

Workflow

1. Frame the task

Ask the user (once) for:

  • The docs site URL (or confirm it from project context)
  • 1–3 realistic agent tasks to evaluate against, e.g. "an agent wants to scrape a JS-rendered page and get structured JSON." If they don't have one, propose 2–3 based on the product and confirm.

Pick tasks that exercise different surfaces (quickstart, API ref, error handling, SDK install).

2. Discovery pass — how agents find things

Run these in roughly this order, narrating briefly after each; a minimal probe sketch follows the list:

  1. llms.txt / llms-full.txt — fetch /llms.txt and /llms-full.txt at the root. Does it exist? Is it curated or autogenerated? Does it cover the surface area? Are the links stable?
  2. robots.txt and sitemap.xml — quick sanity check that agents aren't blocked and can enumerate pages.
  3. Root landing + nav — WebFetch the homepage/docs root. How well does markdown conversion preserve structure and code blocks?
  4. MCP server — does the product ship one? If yes, note it as a shortcut agents should prefer over scraping.
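
A minimal sketch of these discovery probes, assuming a plain HTTP client stands in for WebFetch and that the audited site follows the conventional paths (the base URL here is hypothetical):

```python
# Sketch: probe the conventional agent-discovery endpoints of a docs site.
# BASE is a hypothetical docs root; swap in the site under audit.
from urllib.request import Request, urlopen
from urllib.error import HTTPError, URLError

BASE = "https://docs.example.com"
PROBES = ["/llms.txt", "/llms-full.txt", "/robots.txt", "/sitemap.xml"]

def probe(path: str) -> str:
    """Return a one-line status for one discovery endpoint."""
    try:
        req = Request(BASE + path, headers={"User-Agent": "docs-agent-audit"})
        with urlopen(req, timeout=10) as resp:
            head = resp.read(2048).decode("utf-8", errors="replace")
            return f"{path}: HTTP {resp.status}, starts with {head[:60]!r}"
    except HTTPError as e:
        return f"{path}: HTTP {e.code} (missing or blocked)"
    except URLError as e:
        return f"{path}: unreachable ({e.reason})"

for p in PROBES:
    print(probe(p))
```

A curated llms.txt usually reads like a short annotated index; an autogenerated one tends to dump every URL, which is worth calling out in the narration.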

3. Task pass — walk each agent task end-to-end

For each task from step 1:

  1. Start from what an agent actually has: either a WebSearch query a user would type, or an llms.txt entry.
  2. WebFetch the page(s) an agent would land on.
  3. Evaluate against the agent-consumer rubric (see below); a rough fetchability sketch follows this list.
  4. Narrate: "An agent looking for X would land here via Y. It would succeed/fail because Z."
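
After fetching, a couple of rough fetchability heuristics can be applied mechanically before the qualitative rubric pass. A minimal sketch, assuming the fetched page is available as converted markdown text; the patterns and thresholds are illustrative, not a standard:

```python
# Sketch: rough fetchability heuristics over a fetched page's markdown text.
# Thresholds and patterns are illustrative; treat hits as prompts to look closer.
import re

def fetchability_notes(text: str) -> list[str]:
    notes = []
    # Opening fences usually carry a language tag; closing fences never do, so
    # `tagged` approximates blocks with a language hint (assumes well-formed fences).
    fence_lines = len(re.findall(r"^```", text, re.M))
    tagged = len(re.findall(r"^```[A-Za-z]", text, re.M))
    blocks = fence_lines // 2
    if blocks:
        notes.append(f"{blocks} code blocks, {tagged} with a language tag")
    else:
        notes.append("no fenced code blocks survived conversion")
    if re.search(r"enable javascript", text, re.I) or len(text.strip()) < 500:
        notes.append("little or no server-rendered content; an agent may see an empty shell")
    return notes
```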

4. Synthesize findings

Group observations into PR-sized items. For each:

  • Title (imperative, PR-ready)
  • Why it matters for agents (1 sentence)
  • Rough scope (files/sections touched)
  • Impact: H/M/L — how much it improves agent success rate
  • Effort: H/M/L — how much work to ship

Sort by impact desc, then effort asc. Present as a table in chat.
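
A minimal sketch of the ranking step, assuming each finding is recorded with H/M/L ratings; the numeric mapping and sample findings are illustrative:

```python
# Sketch: rank findings by impact (descending), then effort (ascending).
# The H/M/L-to-number mapping and the sample findings are illustrative only.
RANK = {"H": 3, "M": 2, "L": 1}

findings = [
    {"title": "Tag code fences with languages", "impact": "M", "effort": "L"},
    {"title": "Add a curated llms.txt", "impact": "H", "effort": "L"},
    {"title": "Enumerate error codes on one page", "impact": "H", "effort": "M"},
]

findings.sort(key=lambda f: (-RANK[f["impact"]], RANK[f["effort"]]))

print(f"{'#':<4}{'Title':<40}{'Impact':<8}{'Effort':<8}")
for i, f in enumerate(findings, 1):
    print(f"{i:<4}{f['title']:<40}{f['impact']:<8}{f['effort']:<8}")
```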

5. Offer next step

End with: "Want me to open a PR for #1?" — don't batch multiple PRs unless asked.

Agent-consumer rubric

When evaluating a page, check each of the following; a scorecard sketch follows the list:

  • Discoverability: Is it in llms.txt? Does a realistic search query surface it? Is the URL stable and guessable?
  • Fetchability: Does WebFetch return clean markdown? Code blocks preserved with language tags? Tables survive? Is content server-rendered or does it require JS?
  • Canonicalization: One page per concept, or duplicates splitting signal?
  • Code examples: Present above prose? Copy-pasteable? Multiple languages where relevant? Show full request AND response?
  • Parameters/schemas: Every field documented with type, required/optional, example value?
  • Errors: Error codes enumerated in one place? Each with cause and remedy?
  • Versioning: Clear which version the page applies to? Deprecated paths flagged?
  • Examples end-to-end: Can an agent go from zero to a working call using only this page?
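
One way to keep the rubric actionable during a run is to carry it as a per-page scorecard and note a finding against each dimension. A minimal sketch; the structure and field names are my own, not part of any tool:

```python
# Sketch: the rubric as a per-page scorecard. Dimension names mirror the list
# above; the PageScore structure and example URL are illustrative.
from dataclasses import dataclass, field

RUBRIC = [
    "discoverability", "fetchability", "canonicalization", "code_examples",
    "parameters_schemas", "errors", "versioning", "end_to_end_examples",
]

@dataclass
class PageScore:
    url: str
    notes: dict[str, str] = field(default_factory=dict)  # dimension -> finding

    def record(self, dimension: str, finding: str) -> None:
        assert dimension in RUBRIC, f"unknown rubric dimension: {dimension}"
        self.notes[dimension] = finding

    def gaps(self) -> list[str]:
        """Dimensions not yet checked for this page."""
        return [d for d in RUBRIC if d not in self.notes]

page = PageScore("https://docs.example.com/quickstart")
page.record("fetchability", "code fences lose their language tags after conversion")
print(page.gaps())
```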

Narration style

  • One line after each logical step, not each tool call.
  • Format: "Fetched X → [key finding]. [What this means for agents]."
  • Skip narration for routine success ("page loaded fine"); narrate surprises, gaps, and teaching moments.
  • When introducing a new concept (llms.txt, MCP, etc.) on first use, give a one-sentence definition inline.

Scope boundaries

In scope: content structure, URL design, llms.txt, code examples, error docs, schema completeness, MCP availability, fetchability.

Out of scope unless user asks: visual design, marketing copy, i18n/translations (this repo explicitly excludes localized files), SEO for humans, analytics.

Cross-session continuity

The user runs this across multiple sessions. At the start of a run, if the user references prior findings, check auto-memory for docs-agent-audit project entries. At the end of a run, if findings are worth carrying forward (e.g. "user decided to defer #3 until Q3"), offer to save a project memory — don't save unilaterally.
