# Documentation Accuracy Skill (`doc-accuracy`)

Verify documentation claims against actual code behavior. Code is truth; docs are the subject under test.
## Triggers

| Trigger Phrase | Operation |
|---|---|
| `check documentation accuracy` | Full audit (Phases 1-6) |
| `verify code examples` | Compilability check (Phases 1-3) |
| `audit docs vs code` | Behavioral verification (Phases 1-4) |
| `check doc consistency` | Cross-document consistency (Phases 1-2, 5) |
| `run doc-accuracy` | Full audit (Phases 1-6) |
## When to Use
Use this skill when:
- Documentation may contain non-compilable code examples
- Behavioral claims in docs may contradict implementation
- Quantitative claims (performance, limits) appear in multiple files with different values
- Preparing for a release and need a documentation accuracy audit
- API reference may be missing public members
Use direct code review instead when:
- Investigating a single known documentation error
- The inaccuracy is already identified and needs a fix
## Replaces

| Skill | Reason |
|---|---|
| `incoherence` | 15.8% recall on critical issues; Haiku agents too shallow |
| `doc-coverage` | 0% recall on actionable issues; checks presence, not correctness |
| `doc-sync` | No scripts, purely manual LLM workflow |
| `comment-analyzer` | Advisory only, single-file scope |
## Architecture

### Asymmetric Verification

Code compiles and runs. Documentation describes what the code does. When they disagree, the code is right. This skill reads code first, builds a verified model, then checks documentation claims against that model.

### Phase Overview

- Phase 1: Assessment (script-only, <30s) -> `assessment.json`
- Phase 2: Claim Extraction (script-only, <15s) -> `claims.json`
- Phase 3: Compilability (script-only, <60s) -> `compilability-findings.json`
- Phase 4: Behavioral (Sonnet agents, 3-7 min) -> `behavioral-findings.json`
- Phase 5: Cross-Document (script + Sonnet, 1-2 min) -> `consistency-findings.json`
- Phase 6: Structure (Sonnet agent, 1-2 min) -> `structure-findings.json`
## Scripts

| Script | Purpose |
|---|---|
| `scripts/doc_accuracy.py` | Phases 1-3: Assessment, claim extraction, compilability check |
### Invocation

```bash
# Full deterministic scan (Phases 1-3)
python3 scripts/doc_accuracy.py --target /path/to/repo

# Compilability only
python3 scripts/doc_accuracy.py --target /path/to/repo --phases 3

# Incremental (changed files only)
python3 scripts/doc_accuracy.py --target /path/to/repo --diff-base main

# JSON output to a specific directory
python3 scripts/doc_accuracy.py --target /path/to/repo --output-dir .doc-accuracy

# Set severity threshold for exit code
python3 scripts/doc_accuracy.py --target /path/to/repo --severity-threshold critical

# Markdown report output
python3 scripts/doc_accuracy.py --target /path/to/repo --format markdown

# Text summary to stdout
python3 scripts/doc_accuracy.py --target /path/to/repo --format summary
```
## Output Artifacts

| File | Description |
|---|---|
| `assessment.json` | Phase 1: doc/source inventory with symbol index |
| `claims.json` | Phase 2: verifiable claims extracted from docs |
| `compilability-findings.json` | Phase 3: symbol resolution findings |
| `gate-result.json` | Gate verdict with severity counts |
| `report.md` | Markdown summary (when `--format markdown`) |
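To consume these artifacts programmatically, a sketch along these lines may help. The `severity_counts` key is an assumption; inspect the actual `gate-result.json` for the real schema:

```python
import json
from pathlib import Path

out = Path(".doc-accuracy")  # matches the --output-dir example above

# Gate verdict with severity counts (see table above)
gate = json.loads((out / "gate-result.json").read_text())

# "severity_counts" is an assumed key, shown for illustration only
for severity, count in gate.get("severity_counts", {}).items():
    print(f"{severity}: {count}")
```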
## Process

### Phases 1-3: Deterministic (Script)

Run `doc_accuracy.py` to produce the JSON artifacts. No LLM calls.
- Phase 1 (Assessment): Enumerate documentation and source files. Extract public symbols via regex. Build doc-to-source mapping.
- Phase 2 (Claim Extraction): Parse markdown files. Extract verifiable claims with file path, line number, claim type, and referenced symbols.
- Phase 3 (Compilability): Verify type names, method names, parameter names in code examples exist in the codebase via symbol index lookup.
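As a concrete illustration of the claim records these phases produce, here is a minimal sketch. The field names mirror the list above (file path, line number, claim type, referenced symbols) but are hypothetical; the actual `claims.json` schema may differ:

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    """One verifiable claim extracted from a markdown file (Phase 2)."""
    doc_path: str    # documentation file containing the claim
    line: int        # line number of the claim in that file
    claim_type: str  # e.g. "behavioral", "quantitative", "code-example"
    text: str        # the claim as written in the docs
    symbols: list[str] = field(default_factory=list)  # referenced code symbols

# Hypothetical example of an extracted claim
claim = Claim(
    doc_path="docs/retry.md",
    line=42,
    claim_type="quantitative",
    text="Retries default to 3 attempts.",
    symbols=["RetryPolicy.max_attempts"],
)
```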
### Phase 4: Behavioral Verification (Agent)

Dispatch one Sonnet agent per file group. Each agent receives:

- Full source file content (read code FIRST)
- Documentation file content
- Filtered claims from `claims.json`
- Compilability findings (to avoid re-checking)
Agent prompt:

```text
You are verifying documentation accuracy. Code is the source of truth.

SOURCE FILES (read these first):
[full content of mapped source files]

DOCUMENTATION FILE:
[full content of the documentation file]

CLAIMS TO VERIFY:
[filtered claims from claims.json]

For each claim:
1. Find the relevant code in the source files
2. Determine if the documentation claim accurately describes the code behavior
3. If inaccurate: severity, description, evidence (with line numbers), suggested fix
4. If accurate: mark as PASS

Additionally check for:
- Public API members in source absent from documentation
- Behavioral nuances the documentation omits or misrepresents
- Default values that differ between docs and code
```
Launch agents in parallel (one per file group).
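The dispatch pattern could look like the following sketch. The `run_agent` hook and the group dictionary keys are illustrative, not part of the script:

```python
from concurrent.futures import ThreadPoolExecutor

def build_prompt(group: dict) -> str:
    """Assemble the Phase 4 prompt: source files first, then docs, then claims."""
    return "\n\n".join([
        "SOURCE FILES (read these first):", group["source_content"],
        "DOCUMENTATION FILE:", group["doc_content"],
        "CLAIMS TO VERIFY:", group["claims_json"],
    ])

def run_agent(prompt: str) -> dict:
    """Hypothetical hook: dispatch one Sonnet agent, return its findings."""
    raise NotImplementedError  # wire this to your agent runner

def verify_groups(groups: list[dict]) -> list[dict]:
    # One agent per file group, launched in parallel
    with ThreadPoolExecutor(max_workers=max(1, len(groups))) as pool:
        return list(pool.map(run_agent, map(build_prompt, groups)))
```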
### Phase 5: Cross-Document Consistency (Agent)

From `claims.json`, filter to quantitative and behavioral claims. Group by topic. For groups with conflicting values across files, dispatch a Sonnet agent with benchmark data to determine the correct value.
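A minimal sketch of the conflict-grouping step, assuming each claim record carries hypothetical `topic` and `value` fields:

```python
from collections import defaultdict

def find_conflicts(claims: list[dict]) -> dict:
    """Group quantitative/behavioral claims by topic; keep topics that disagree."""
    by_topic = defaultdict(list)
    for claim in claims:
        if claim["claim_type"] in ("quantitative", "behavioral"):
            by_topic[claim["topic"]].append(claim)
    # A topic conflicts when its claims state more than one distinct value
    return {
        topic: group
        for topic, group in by_topic.items()
        if len({claim["value"] for claim in group}) > 1
    }
```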
### Phase 6: Structure and Quality (Agent)

Validate documentation structure (indexes, navigation, completeness). Apply the comment quality framework (accuracy, completeness, long-term value, misleading elements, improvements) to a 20% sample of source comments.
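The 20% sample could be drawn deterministically, for example (comment extraction itself is left abstract):

```python
import random

def sample_comments(comments: list[str], fraction: float = 0.2, seed: int = 0) -> list[str]:
    """Deterministically sample ~20% of source comments for quality review."""
    if not comments:
        return []
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    k = max(1, round(len(comments) * fraction))
    return rng.sample(comments, k)
```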
### Reconciliation (Interactive)
Present proposed fixes for user approval before modifying any file. Categories:
- Documentation prose fixes
- Code example corrections
- Consistency resolution (single source of truth)
- Structural updates (indexes, navigation)
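A minimal sketch of the approval loop, with a hypothetical fix record shape:

```python
def reconcile(fixes: list[dict], apply_fix) -> None:
    """Present each proposed fix; modify files only on explicit approval."""
    for fix in fixes:
        print(f"[{fix['category']}] {fix['file']}:{fix['line']}")
        print(f"  proposed: {fix['suggestion']}")
        if input("Apply? [y/N] ").strip().lower() == "y":
            apply_fix(fix)  # caller-supplied writer; nothing changes without a yes
```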
## Issue Taxonomy
| Class | Description | Detection Phase |
|---|---|---|
| 1: Spec vs Behavior | Docs say X, code does Y | Phase 4 |
| 2: Non-Compilable Code | Code examples reference nonexistent symbols | Phase 3 |
| 3: Cross-Doc Inconsistency | Same fact, different values across files | Phase 5 |
| 4: Domain Violations | Technology convention violations (OTel, Prometheus) | Phase 4 + Plugins |
| 5: API Surface Gaps | Public API exists but is undocumented | Phase 3 + Phase 4 |
## Severity Levels
| Level | Definition |
|---|---|
| Critical | Code will not compile, or behavior is silently wrong |
| High | Materially misleading but no immediate failure |
| Medium | Inconsistent or confusing but correct in at least one location |
| Low | Cosmetic, improvement opportunity, or minor omission |
## Exit Codes
| Code | Meaning |
|---|---|
| 0 | No findings at or above severity threshold |
| 1 | Error (file not found, parse error) |
| 10 | Findings at or above severity threshold |
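In CI, these exit codes can gate a pipeline. A sketch using Python's subprocess (flags are from the invocation examples above; the target path is a placeholder):

```python
import subprocess
import sys

result = subprocess.run([
    "python3", "scripts/doc_accuracy.py",
    "--target", ".", "--severity-threshold", "critical",
])

if result.returncode == 10:
    sys.exit("doc-accuracy: findings at or above the severity threshold")
if result.returncode == 1:
    sys.exit("doc-accuracy: error (file not found or parse error)")
# returncode 0: no findings at or above the threshold
```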
## Anti-Patterns

| Avoid | Why | Instead |
|---|---|---|
| Trusting documentation as a peer of code | Docs and code are not equal; code is truth | Always read the implementation before checking docs |
| Using Haiku for behavioral verification | 15.8% recall vs 100% with Sonnet | Use Sonnet agents for Phase 4 |
| One agent per dimension | Loses cross-cutting context | One agent per file group |
| Skipping Phase 1 before Phase 4 | Agents need the symbol index for precise verification | Always run Phases 1-2 first |
| Running all phases on unchanged files | Wastes time and tokens | Use `--diff-base` for incremental checks |
## Verification

After running, confirm:
- All JSON artifacts created in output directory
- Every finding has file path, line number, severity, and evidence
- Code examples verified against actual method signatures
- Behavioral claims verified by reading implementation source
- Cross-document conflicts identified with all locations listed
- Exit code reflects severity threshold
## Related Skills

| Skill | Relationship |
|---|---|
| `incoherence` | Replaced: detection logic superseded by Phases 3-5 |
| `doc-coverage` | Replaced: symbol extraction logic preserved in Phase 1 |
| `doc-sync` | Replaced: structural audit absorbed into Phase 6 |
| `analyze` | Complementary: broader codebase analysis |
| `style-enforcement` | Complementary: code style checks |