Memory Quality Auditor

Audit the memory system as a unified retrieval layer (STM/MTM/LTM files + index + spawn citation outcomes).

Scope

ALWAYS establish a baseline metric snapshot before auditing — drift is only meaningful relative to a prior measurement; auditing without a baseline produces absolute numbers that cannot identify regression.
NEVER close a memory finding without re-running the affected retrieval query — closing without verification creates false improvement metrics and masks persistent degradation.
ALWAYS include citation-groundedness checks in every audit run — uncited memory injections are the primary source of hallucination in agent spawns; skipping this check leaves the highest-risk failure mode undetected.
NEVER audit only the STM tier: degradation often originates in entries corrupted during promotion to MTM/LTM, so all three tiers must be sampled in every full audit cycle.
ALWAYS emit TDD-ready remediation items with a failing-test condition and expected metric threshold — vague findings ("memory quality is low") cannot be actioned by any agent.
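
The baseline and remediation rules above can be sketched roughly as follows. This is a minimal illustration, not the auditor's actual implementation: the `RemediationItem` fields, the `find_regressions` helper, and the 0.05 tolerance are assumptions made for the example, and metric snapshots are assumed to be plain name-to-value dictionaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemediationItem:
    """Hypothetical TDD-ready finding: a failing condition plus a target threshold."""
    metric: str
    observed: float
    threshold: float
    failing_test: str  # condition that fails now and must pass after the fix


def compute_delta(baseline: dict, current: dict) -> dict:
    """Return current minus baseline for every metric present in both snapshots."""
    return {name: current[name] - baseline[name] for name in baseline if name in current}


def find_regressions(baseline: dict, current: dict, thresholds: dict,
                     tolerance: float = 0.05) -> list:
    """Emit one item per metric that dropped by more than `tolerance`
    relative to the baseline, or that sits below its threshold."""
    items = []
    for name, delta in sorted(compute_delta(baseline, current).items()):
        threshold = thresholds[name]
        if delta < -tolerance or current[name] < threshold:
            items.append(RemediationItem(
                metric=name,
                observed=current[name],
                threshold=threshold,
                failing_test=f"assert {name} >= {threshold}",
            ))
    return items


if __name__ == "__main__":
    # Illustrative numbers only; they are not measurements from any real audit.
    baseline = {"citation_coverage": 0.95, "grounded_ratio": 0.88}
    current = {"citation_coverage": 0.80, "grounded_ratio": 0.90}
    thresholds = {"citation_coverage": 0.90, "grounded_ratio": 0.85}
    for item in find_regressions(baseline, current, thresholds):
        print(item)
```

Each emitted item carries a condition that fails today and a threshold that defines done, so a finding can be closed only by re-measuring the metric rather than by assertion.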

| Anti-Pattern | Why It Fails | Correct Approach |
| --- | --- | --- |
| Auditing without a baseline | Cannot distinguish regression from steady state; all findings are ambiguous | Snapshot current metrics at session start; compute the delta against the previous run |
| Closing findings without re-check | Produces false-positive resolution; degradation persists silently behind green metrics | Re-run the specific retrieval query after each remediation; close only once the metric is confirmed green |
| Skipping citation groundedness | Citation failures are the leading cause of agent hallucination; skipping this check leaves the highest-severity defect class unexamined | Include `citation_coverage` and `grounded_ratio` metrics in every audit report |
| Full-mode audit on every spawn | Full audit is expensive; running it unconditionally inflates cost and slows workflows | Use `--mode summary` for routine checks; reserve `--mode full` for scheduled or triggered audits |
| Auditing STM only | MTM/LTM corruption is invisible in STM-only scans; stale LTM entries contaminate future sessions | Sample all three tiers: STM (current session), MTM (last 10 sessions), LTM (permanent summaries) |
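
To make the citation-groundedness and three-tier rows concrete, here is a minimal sketch. The definitions are assumptions, since this document names `citation_coverage` and `grounded_ratio` without defining them: coverage is taken to be the share of injected memories that carry a citation, and grounded ratio the share of cited memories whose citation resolves to a known entry. The `Injection` record and both helper functions are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

TIERS = ("STM", "MTM", "LTM")


@dataclass(frozen=True)
class Injection:
    """Hypothetical record of one memory injected into an agent spawn."""
    tier: str
    citation: Optional[str]  # id of the cited memory entry, or None if uncited


def audit_tier(injections: list, known_ids: set) -> dict:
    """Compute both metrics for one tier's sample (assumed definitions)."""
    cited = [i for i in injections if i.citation is not None]
    grounded = [i for i in cited if i.citation in known_ids]
    return {
        "citation_coverage": len(cited) / len(injections) if injections else 0.0,
        "grounded_ratio": len(grounded) / len(cited) if cited else 0.0,
    }


def audit_all_tiers(injections: list, known_ids: set) -> dict:
    """Report every tier, even an empty one, so no tier is silently skipped."""
    return {
        tier: audit_tier([i for i in injections if i.tier == tier], known_ids)
        for tier in TIERS
    }


if __name__ == "__main__":
    # Illustrative sample only.
    sample = [
        Injection("STM", "a"),
        Injection("STM", None),
        Injection("MTM", "b"),
        Injection("LTM", "missing-entry"),
    ]
    for tier, metrics in audit_all_tiers(sample, known_ids={"a", "b"}).items():
        print(tier, metrics)
```

Iterating over a fixed `TIERS` tuple, rather than over whichever tiers happen to appear in the data, means an unsampled tier still shows up in the report with zeroed metrics instead of vanishing.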

Before starting: Read .claude/context/memory/learnings.md

After completing:

ASSUME INTERRUPTION: If it's not in memory, it didn't happen.