audit-context

Diagnose a nao context. Find gaps, MECE violations, failure root causes, and bloat. Output is a short in-conversation report ending in a prioritized plan. Diagnose only — never fix. Route fixes to write-context-rules / add-semantic-layer / create-context-tests.

Run any time: right after setup-context, mid-build, before a release, or when the agent's behavior gets surprising.

Six checks (run in order)

1. Synced context

Read nao_config.yaml. What's wired in (warehouse, repos, Notion, semantic layer, MCPs)? What's missing (dbt repo, ETL configs, BI repo, internal docs)? Has nao sync run — are databases/, repos/, docs/, semantics/ populated?

Scope check: fewer than 100 tables is the hard ceiling; ≤20 is the target. Twelve well-documented tables beat 80 half-documented ones. Flag oversized scope explicitly: it is the biggest predictor of reliability failure.
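The sync and scope questions above can be sketched as two small helpers. This is a minimal illustration, not nao's own tooling: the function names are hypothetical, "populated" is taken to mean "the folder exists and is not empty", and a count of 100 or more is treated as over the ceiling.

```python
from pathlib import Path

# Folder names come from the check above; treating "non-empty" as "populated"
# is an assumption of this sketch.
SYNC_DIRS = ("databases", "repos", "docs", "semantics")


def sync_state(root: str) -> dict[str, bool]:
    """Map each expected sync folder to True if it exists and holds at least one entry."""
    base = Path(root)
    return {
        name: (base / name).is_dir() and any((base / name).iterdir())
        for name in SYNC_DIRS
    }


def scope_verdict(n_tables: int, target: int = 20, ceiling: int = 100) -> str:
    """Classify scope size against the <=20 target and the <100 hard ceiling."""
    if n_tables >= ceiling:
        return "oversized"
    if n_tables > target:
        return "above target"
    return "on target"
```

How the table count itself is obtained depends on how `nao sync` lays out databases/, so it is left as an input here.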

2. RULES.md vs target structure

Six standard sections (from write-context-rules): Business overview, Data architecture, Core data models (Most Used + Tables detail), Key Metrics Reference, Date filtering, Analysis Process. Per section, mark present / missing / thin. Flag placeholders, TODO: markers, and metric entries with no source-of-truth pointer.
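A rough way to grade the six sections mechanically is sketched below. It assumes the section titles appear as markdown headings at any depth, and the 200-character "thin" threshold is an arbitrary placeholder rather than a value from write-context-rules.

```python
import re

# Section titles as listed above.
STANDARD_SECTIONS = [
    "Business overview", "Data architecture", "Core data models",
    "Key Metrics Reference", "Date filtering", "Analysis Process",
]
THIN_CHARS = 200  # assumption: bodies shorter than this count as thin


def grade_sections(rules_text: str) -> dict[str, str]:
    """Return present / thin / missing for each standard section of RULES.md."""
    # Locate each standard heading at any '#' depth, case-insensitively.
    starts = {}
    for title in STANDARD_SECTIONS:
        pattern = rf"^#+[ \t]*{re.escape(title)}[ \t]*$"
        m = re.search(pattern, rules_text, re.I | re.M)
        if m:
            starts[title] = (m.start(), m.end())
    # A section body runs until the next standard heading (or end of file),
    # so sub-headings such as "Most Used Tables" count toward their parent.
    heading_offsets = sorted(start for start, _ in starts.values())
    grades = {}
    for title in STANDARD_SECTIONS:
        if title not in starts:
            grades[title] = "missing"
            continue
        begin, body_start = starts[title]
        later = [s for s in heading_offsets if s > begin]
        body = rules_text[body_start:later[0] if later else len(rules_text)]
        grades[title] = "thin" if len(body.strip()) < THIN_CHARS else "present"
    return grades


def placeholder_lines(rules_text: str) -> list[int]:
    """1-based line numbers that contain a TODO: marker."""
    return [i for i, line in enumerate(rules_text.splitlines(), 1) if "TODO:" in line]
```

Length is only a proxy for substance; a section graded "present" still needs a read for metric entries that lack a source-of-truth pointer.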

3. Context coverage (per table)

For every table in databases/: is it in ## Most Used Tables? Does it have a ## Tables detail block? Is there dbt context (repos/<dbt>/models/**/schema.yml)? Any extra .md?

Then per-table gaps: undocumented columns the agent will reference, calculated fields with no explanation, foreign keys with no documented relationship, common WHERE filters not mentioned anywhere. A table with no docs anywhere is a high-priority finding.
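The per-table questions reduce to set membership once the four sources have been read. The sketch below assumes those sets are already collected (how to parse RULES.md or schema.yml is out of scope here); the function names and the gap wording are illustrative.

```python
def coverage_rows(tables, most_used, detailed, dbt_documented, extra_md):
    """Build one coverage row per table; every argument is a set of table names."""
    rows = []
    for t in sorted(tables):
        in_rules = t in most_used and t in detailed
        has_dbt = t in dbt_documented
        has_md = t in extra_md
        if not (t in most_used or t in detailed or has_dbt or has_md):
            gap = "no docs anywhere (high priority)"
        elif not in_rules:
            gap = "incomplete in RULES.md"
        else:
            gap = ""
        rows.append((t, in_rules, has_dbt, has_md, gap))
    return rows


def render(rows):
    """Render rows in a Table | RULES.md | dbt docs | Extra .md | Gap layout."""
    def mark(ok):
        return "yes" if ok else "no"

    lines = ["Table | RULES.md | dbt docs | Extra .md | Gap"]
    for t, r, d, m, gap in rows:
        lines.append(f"{t} | {mark(r)} | {mark(d)} | {mark(m)} | {gap or '-'}")
    return "\n".join(lines)
```

Column-level gaps (undocumented columns, unexplained calculated fields) still need a manual pass; this only answers whether each table is documented at all.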

4. Data model consistency (MECE)

  • Mutually exclusive? Two tables computing the same metric differently (worst issue — the agent picks one unpredictably).
  • Collectively exhaustive? Asked metrics that no in-scope table can answer.
  • Duplicated columns? Same logical field under different names (user_id / customer_id / account_id).
  • Ambiguous columns? amount without unit, status without enum values.
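Two of these checks, duplicated and ambiguous columns, lend themselves to a name-based first pass. The sketch below is a heuristic under stated assumptions: the synonym and ambiguous-name sets are seeded from the examples above and are meant to be extended per project, and `schema` and `descriptions` are hypothetical inputs built from whatever metadata is available.

```python
# Seeded from the examples in the list above; extend per project.
ID_SYNONYMS = {"user_id", "customer_id", "account_id"}
AMBIGUOUS = {"amount", "status"}


def duplicated_id_names(schema: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each synonym in use to its tables, but only if several synonyms appear."""
    used: dict[str, list[str]] = {}
    for table, cols in schema.items():
        for col in cols:
            if col in ID_SYNONYMS:
                used.setdefault(col, []).append(table)
    return used if len(used) > 1 else {}


def ambiguous_columns(schema, descriptions):
    """List (table, column) pairs with an ambiguous name and no description."""
    return [
        (table, col)
        for table, cols in schema.items()
        for col in cols
        if col in AMBIGUOUS and not descriptions.get((table, col))
    ]
```

A hit only marks a candidate: whether user_id and customer_id really name the same logical field is a judgment the report should state, not assume.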

5. Test coverage

If tests/ is empty → recommend create-context-tests. Otherwise read tests/outputs/ (most recent run) and categorize each failure:

| Category | Looks like | Fix |
|---|---|---|
| Data model | Wrong column / wrong table | Add column descriptions; clarify granularity |
| Date selection | Wrong period / week start | Add DO/DON'T SQL in ## Date filtering |
| Test issue | Test SQL itself is wrong | Fix the test, not the context |
| Interpretation | Reasonable but different reading | Add to naming conventions or ## Key Metrics Reference |
| Metric definition | Wrong formula / source | Tighten ## Key Metrics Reference or add a semantic layer |

Propose the smallest rule change per failure. Sort by impact (tests affected).
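Sorting by impact can be sketched as grouping failures by the fix that would resolve them and counting the tests each fix covers. The tuple shape `(test, category, proposed_fix)` is an assumption of this example, not a format produced by tests/outputs/.

```python
from collections import defaultdict


def rank_fixes(failures):
    """Group failing tests by proposed fix and sort by number of tests affected."""
    by_fix = defaultdict(list)
    for test_name, category, fix in failures:
        by_fix[(category, fix)].append(test_name)
    # Most tests affected first; ties keep first-seen order because sorted() is stable.
    return sorted(by_fix.items(), key=lambda kv: -len(kv[1]))
```

For example, two failures that both trace back to a missing week-start rule collapse into one proposed change that ranks above a fix affecting a single test.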

6. Token optimization

  • Files >40KB (flag).
  • ## Tables detail blocks exceeding the 10-column cap.
  • Duplication between RULES.md and databases/<table>.md.
  • In-scope tables with no mention in any test or recent question (trim candidates).
  • Raw / staging tables that snuck into scope.

If RULES.md is bloated, suggest moving per-table detail to databases/<table>.md and keeping only the one-line pointer in ## Most Used Tables. For multi-domain bloat, propose a per-domain file map referenced from RULES.md. Show the proposed structure before moving anything.
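The file-size check in the list above is easy to automate. A minimal sketch follows, assuming the context lives in markdown files and reading "40KB" as 40 × 1024 bytes; the function name and glob pattern are illustrative.

```python
from pathlib import Path

MAX_BYTES = 40 * 1024  # the 40KB flag threshold, read here as 40 * 1024 bytes


def oversized_files(root: str, pattern: str = "**/*.md") -> list[tuple[str, int]]:
    """Return (relative path, size in bytes) for files above the threshold, largest first."""
    base = Path(root)
    hits = [
        (str(p.relative_to(base)), p.stat().st_size)
        for p in base.glob(pattern)
        if p.is_file() and p.stat().st_size > MAX_BYTES
    ]
    return sorted(hits, key=lambda h: -h[1])
```

The result is a list of candidates to flag in the report; nothing is moved or split at this stage.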

Output (in conversation, not a file)

Lead with a one-paragraph summary: sync state | scope size (N tables vs the <100 ceiling) | rules quality (N/6 sections substantive) | test coverage (N tests, X% passing).
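The four headline figures can be assembled in one place so the summary always carries the same fields in the same order. The function below is a hypothetical formatter; its labels are one possible wording, not a required output format.

```python
def summary_line(synced: bool, n_tables: int, substantive_sections: int,
                 n_tests: int, n_passing: int) -> str:
    """Format the four headline figures in the order given above."""
    if n_tests:
        tests = f"{n_tests} tests, {round(100 * n_passing / n_tests)}% passing"
    else:
        tests = "no tests"
    return " | ".join([
        f"sync: {'ok' if synced else 'incomplete'}",
        f"scope: {n_tables} tables vs 100 ceiling",
        f"rules: {substantive_sections}/6 sections substantive",
        f"tests: {tests}",
    ])
```

Handling the zero-test case separately avoids a division by zero and makes the missing coverage visible in the summary itself.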

Then deep-dive only the sections with findings. Skip clean ones. Format hints:

  • Synced / RULES.md / token bloat → bulleted gaps.
  • Context coverage → table: Table | RULES.md | dbt docs | Extra .md | Gap.
  • MECE → bullets.
  • Test failures → table: Test | Category | Proposed fix.

End with a prioritized plan (easiest-win → biggest-work), each item naming the skill that does the work:

## Plan
1. (easy / 5 min) ... → write-context-rules
2. (small / 30 min) ... → create-context-tests
3. (medium / 1-2 hr) ... → audit-context (rerun after)
4. (large / multi-session) ... → add-semantic-layer

Guardrails

  • Apply one change at a time. Re-run tests between fixes.
  • Tests are the source of truth. If the user says "it's working," ask for the latest pass rate first.
  • Don't move or split files without confirmation. Show the file map first.
  • Don't fix in this skill — diagnose only.

Repository: getnao/nao