1. Familiarizes itself with the project (investigation phase)
2. Audits using parallel agents across codebase divisions
3. Creates GitHub issues for each discovered problem
3.5. Validates against recent PRs (prevents false positives)
4. Generates PRs in parallel worktrees for each remaining issue
5. Reviews PRs with PM architect for prioritization
6. Reports consolidated recommendations in master issue

Quick Start

User: "Run a quality audit on this codebase"
Skill: *activates automatically*
       "Beginning quality audit workflow..."

The 7 Phases

Phase 1: Project Familiarization

Run investigation workflow on project structure
Map modules, dependencies, and entry points
Understand existing patterns and architecture

Phase 2: Parallel Quality Audit

Divide codebase into logical sections
Deploy multiple agent types per section (analyzer, reviewer, security, optimizer)
Apply PHILOSOPHY.md standards ruthlessly
Check module size, complexity, single responsibility

Phase 3: Issue Assembly

Create GitHub issue for each finding
Include severity, location, recommendation
Tag with appropriate labels
Add unique IDs, keywords, and file metadata
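
A finding that carries the fields listed above could be modeled roughly as follows. This is only a sketch: the `Finding` class, the `QA-` ID scheme (a truncated SHA-256 of category, file, and line), and the label names are illustrative assumptions, not the skill's actual format.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical record for one audit finding."""
    category: str
    severity: str
    file: str
    line: int
    summary: str
    recommendation: str

    @property
    def finding_id(self) -> str:
        # Stable ID: the same category/file/line always hashes to the same value.
        key = f"{self.category}:{self.file}:{self.line}"
        return "QA-" + hashlib.sha256(key.encode()).hexdigest()[:8]


def to_issue(finding: Finding) -> dict:
    """Build an issue payload (title, body, labels) from a finding."""
    body = "\n".join([
        f"ID: {finding.finding_id}",
        f"Severity: {finding.severity}",
        f"Location: {finding.file}:{finding.line}",
        f"Recommendation: {finding.recommendation}",
    ])
    return {
        "title": f"[{finding.severity}] {finding.summary}",
        "body": body,
        "labels": ["quality-audit", finding.category, f"severity-{finding.severity}"],
    }
```

Deriving the ID from the location rather than from the wording means a re-run of the audit produces the same ID for the same spot, which makes duplicate issues easier to spot.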

Phase 3.5: Post-Audit Validation

Scan merged PRs from the last 30 days (configurable)
Calculate confidence scores for PR-issue matches
Auto-close high-confidence matches (≥90%)
Tag medium-confidence matches (70-89%) for verification
Add bidirectional cross-references between issues and PRs
Target: <5% false positive rate
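
The threshold logic above can be sketched as a small classifier. The function name and the action labels are hypothetical; only the 90% and 70% cut-offs come from this document, and how the confidence score itself is computed is not specified here.

```python
def classify_match(confidence: float,
                   auto_close_threshold: float = 90.0,
                   tag_threshold: float = 70.0) -> str:
    """Map a PR-issue match confidence (0-100) to a Phase 3.5 action."""
    if not 0.0 <= confidence <= 100.0:
        raise ValueError(f"confidence must be in [0, 100], got {confidence}")
    if confidence >= auto_close_threshold:
        return "auto_close"            # high confidence: a merged PR already fixed it
    if confidence >= tag_threshold:
        return "tag_for_verification"  # medium confidence: a human should check
    return "keep_open"                 # low confidence: treat the issue as still open
```

Out-of-range scores raise instead of being clamped, in keeping with the anti-fallback rules this workflow enforces.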

Phase 4: Parallel PR Generation

Create worktree per remaining open issue (worktrees/fix-issue-XXX)
Run DEFAULT_WORKFLOW.md in each worktree
Generate fix PR for each confirmed open issue

Phase 5: PM Review

Invoke pm-architect skill
Group PRs by category and priority
Identify dependencies between fixes

Phase 6: Master Report

Create master GitHub issue
Link all related issues and PRs
Provide a prioritized action plan with recommendations

Philosophy Enforcement

This workflow ruthlessly applies:

Ruthless Simplicity: Flag over-engineered modules
Module Size Limits: Target <300 LOC per module
Single Responsibility: One purpose per brick
Zero-BS: No stubs, no TODOs, no dead code
Anti-Fallback (#2805, #2810): Detect silent degradation and error swallowing patterns
Structural Analysis (#2809): Flag oversized files, deeply nested code, and tangled dependencies

Detection Categories

Standard Categories

Category	What It Detects
Security	Hardcoded secrets, missing input validation, string interpolation in queries
Reliability	Missing timeouts, bare except clauses, unhandled async
Dead Code	Unused imports, unreachable branches, stale TODOs
Test Gaps	Files without tests, tests without assertions
Doc Gaps	Public functions without docstrings, outdated docs
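
As an illustration of the Doc Gaps row, a check for public functions without docstrings can be written with Python's standard `ast` module. This is a minimal sketch, assuming "public" means the name does not start with an underscore; the function name is hypothetical and the audit agents may work quite differently.

```python
import ast


def functions_missing_docstrings(source: str) -> list[str]:
    """Return names of public functions in `source` that lack a docstring."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            is_public = not node.name.startswith("_")
            if is_public and ast.get_docstring(node) is None:
                missing.append(node.name)
    return missing
```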

Extended Categories

Category	What It Detects
Silent Fallbacks	except: pass, broad catches that return defaults silently, fallback chains that mask failures, ?? defaultValue hiding missing config, dict.get(key, default) on required values, || fallback in shell scripts
Error Swallowing	Catch blocks with no re-raise/re-throw, error-to-None/null transforms, catch-all discarding exceptions, log-only catch blocks, empty catch blocks, catch (Exception) returning false/default/empty collection
Result Dropping	Fire-and-forget async (_ = Task(), asyncio.create_task() without error handling), unchecked HTTP response status, discarded return values, Task.WhenAll/Promise.all/asyncio.gather without individual failure checks, unchecked subprocess.run()
Shell Anti-Patterns	|| true, >/dev/null 2>&1, 2>/dev/null, set +e, || fallback_command, missing set -euo pipefail
Silent Truncation	Take(N)/[:N]/.slice(0,N) without logging, .Where()/list comprehensions that silently drop items that should be processed, string substring without bounds logging
Async Anti-Patterns	async void (C#), .Result/.Wait() sync-over-async, unawaited coroutines/promises, shared mutable state without synchronization, CancellationToken not propagated, Timer/CancellationTokenSource not disposed
Config Divergence	Env vars defined in deploy configs but read with silent fallbacks in code, IsDevelopment() guards that could leak to staging/prod, services expecting config that infrastructure doesn't provide
Validation Gaps	API endpoints without input validation, string interpolation in SQL/GraphQL/Cypher, missing pagination limits, missing request size limits, enum parsing from user input without validation, trusting deserialized external data without null checks
Health & Observability	Degraded reported when Unhealthy is appropriate, background worker failures not surfaced to /health, log-only error handling without metrics, permanent errors treated as transient (retried instead of dead-lettered), partial success marked as full success
Retry Anti-Patterns	Retry loops that fall through silently after exhaustion, circuit breakers that open without alerting, retry logic that eventually gives up without raising the last error
Structural Issues	Files >500 LOC, functions >50 lines, nesting >4 levels, >5 parameters, circular imports
Documentation	Point-in-time content, unprofessional tone (pirate speak, chatbot artifacts), quality/correctness gaps
Hardcoded Limits	Non-configurable numeric caps ([:N], max_X = N), silent truncation without logging, data loss from processing limits
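
To make the Silent Fallbacks category concrete, the simplest pattern in it (an except handler whose body is only `pass`) can be located with the standard `ast` module. This is a sketch under the assumption that the target is Python source; the function name is hypothetical, and the other patterns in the table would need richer analysis.

```python
import ast


def find_silent_excepts(source: str) -> list[int]:
    """Return line numbers of except handlers whose body is only `pass`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ExceptHandler):
            # A handler made up solely of `pass` discards the error entirely.
            if all(isinstance(stmt, ast.Pass) for stmt in node.body):
                hits.append(node.lineno)
    return sorted(hits)
```

A handler that re-raises, logs and re-raises, or converts the error to a typed failure is not reported by this check.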

Multi-Agent Validation (v3.0)

Every finding is validated by 3 independent agents (analyzer, reviewer, architect). A finding is confirmed only if ≥2 agents agree. This filters out most false positives before any fixes are attempted.
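
The 2-of-3 rule amounts to a simple vote count. In this sketch the function name and the dictionary shape are assumptions; the threshold default mirrors the `validation_threshold` input described under Configuration.

```python
def is_confirmed(votes: dict[str, bool], validation_threshold: int = 2) -> bool:
    """Confirm a finding when enough validators agree that it is real."""
    # True counts as 1 and False as 0, so sum() yields the number of agreeing agents.
    return sum(votes.values()) >= validation_threshold


votes = {"analyzer": True, "reviewer": True, "architect": False}
print(is_confirmed(votes))  # two of three agree, so the finding is confirmed
```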

Iterative Loop with Escalating Depth (v3.0)

Cycle 1: SEEK → VALIDATE (3 agents) → FIX → decision
Cycle 2: SEEK (deeper) → VALIDATE → FIX → decision
Cycle 3: SEEK (deepest) → VALIDATE → FIX → decision
...continues if thresholds not met

Loop rules:

Minimum 3 cycles always run
Continue past cycle 3 if the latest cycle produced any NEW high/critical findings or more than 3 NEW medium findings
Maximum 6 cycles (safety valve)
Each cycle: fresh eyes, dig deeper, challenge prior findings
Fixes use the full DEFAULT_WORKFLOW approach (understand → test → implement → verify)
Fix-all-per-cycle rule (#2842): Every confirmed finding in a cycle MUST be fixed before the cycle is complete. No partial cycles. No deferring findings to "follow-up issues" or "next cycle". If SEEK finds issues, FIX must address ALL of them.
Loop decision based on NEW findings (#2842): The decision to continue is based on whether the current cycle discovered NEW issues, not whether old issues remain unfixed (they shouldn't — the fix-all rule prevents that).
Fix verification step: After fixes, a verification step compares confirmed findings against fix results to ensure nothing was skipped.
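
The continue/stop decision described by these rules can be sketched as a pure function. The function name and the representation of findings as a list of severity strings are assumptions made for illustration; the recipe's real decision step may be structured differently.

```python
def should_continue(cycle: int, new_findings: list[str],
                    min_cycles: int = 3, max_cycles: int = 6) -> bool:
    """Decide whether to run another cycle after `cycle` (1-based) finishes.

    `new_findings` holds the severity of each finding first seen in this cycle.
    """
    if cycle >= max_cycles:
        return False  # safety valve: never exceed the maximum
    if cycle < min_cycles:
        return True   # the minimum number of cycles always runs
    high = sum(s in ("high", "critical") for s in new_findings)
    medium = sum(s == "medium" for s in new_findings)
    return high > 0 or medium > 3
```

Only findings that are new in the current cycle are passed in, so leftover issues from earlier cycles never influence the decision.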

Run via recipe:

amplihack recipe execute quality-audit-cycle.yaml --context '{"target_path": "src/amplihack", "min_cycles": "3", "max_cycles": "6"}'

Configuration

Override defaults via recipe context or environment:

Structured Inputs (recipe context, per #2843):

Input	Default	Description
`target_path`	`src/amplihack`	Directory to audit
`min_cycles`	`3`	Minimum audit cycles
`max_cycles`	`6`	Maximum cycles (safety valve)
`validation_threshold`	`2`	Min validators that must agree (out of 3)
`severity_threshold`	`medium`	Minimum severity to report
`module_loc_limit`	`300`	Flag modules exceeding this LOC
`fix_all_per_cycle`	`true`	Must fix ALL findings before next cycle (#2842)
`categories`	(all)	Comma-separated list of categories to check

Available Categories: security, reliability, dead_code, silent_fallbacks, error_swallowing, result_dropping, shell_anti_patterns, silent_truncation, async_anti_patterns, config_divergence, validation_gaps, health_observability, retry_anti_patterns, structural, hardcoded_limits, test_gaps, doc_gaps, documentation

Example invocation:

amplihack recipe execute quality-audit-cycle.yaml --context '{
  "target_path": "src/amplihack/fleet",
  "min_cycles": "3",
  "max_cycles": "6",
  "severity_threshold": "medium",
  "module_loc_limit": "300",
  "fix_all_per_cycle": "true",
  "categories": "security,reliability,dead_code,silent_fallbacks,error_swallowing"
}'
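
When the context is assembled programmatically, serializing it with `json.dumps` avoids hand-quoting mistakes. The helper below is a hypothetical convenience wrapper, not part of amplihack; it only builds the same command line shown above.

```python
import json
import shlex


def build_command(recipe: str, context: dict[str, str]) -> list[str]:
    """Return the argv list for the recipe command shown above."""
    return ["amplihack", "recipe", "execute", recipe,
            "--context", json.dumps(context)]


context = {
    "target_path": "src/amplihack/fleet",
    "min_cycles": "3",
    "max_cycles": "6",
    "categories": ",".join(["security", "reliability", "dead_code"]),
}
argv = build_command("quality-audit-cycle.yaml", context)
print(shlex.join(argv))  # shell-quoted form, safe to paste into a terminal
```

The resulting list can be passed directly to `subprocess.run(argv, check=True)`, which raises on a non-zero exit status instead of ignoring it.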

Core Settings (environment):

AUDIT_PARALLEL_LIMIT: Max concurrent worktrees (default: 8)

Phase 3.5 Validation Settings:

AUDIT_PR_SCAN_DAYS: Days to scan for recent PRs (default: 30)
AUDIT_AUTO_CLOSE_THRESHOLD: Confidence % for auto-close (default: 90)
AUDIT_TAG_THRESHOLD: Confidence % for tagging (default: 70)
AUDIT_ENABLE_VALIDATION: Enable Phase 3.5 (default: true)
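
A loader for these variables might look like the sketch below. Only the variable names and defaults come from this document; the `AuditSettings` class, the `load_settings` function, the accepted boolean spellings, and the threshold ordering check are assumptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditSettings:
    parallel_limit: int
    pr_scan_days: int
    auto_close_threshold: int
    tag_threshold: int
    enable_validation: bool


def load_settings(env=None) -> AuditSettings:
    """Read the AUDIT_* variables, applying the documented defaults."""
    env = os.environ if env is None else env

    def as_int(name: str, default: int) -> int:
        raw = env.get(name)
        # int() raises ValueError on malformed input rather than hiding it.
        return default if raw is None else int(raw)

    def as_bool(name: str, default: bool) -> bool:
        raw = env.get(name)
        if raw is None:
            return default
        value = raw.strip().lower()
        if value in ("true", "1", "yes"):
            return True
        if value in ("false", "0", "no"):
            return False
        raise ValueError(f"{name} must be a boolean, got {raw!r}")

    settings = AuditSettings(
        parallel_limit=as_int("AUDIT_PARALLEL_LIMIT", 8),
        pr_scan_days=as_int("AUDIT_PR_SCAN_DAYS", 30),
        auto_close_threshold=as_int("AUDIT_AUTO_CLOSE_THRESHOLD", 90),
        tag_threshold=as_int("AUDIT_TAG_THRESHOLD", 70),
        enable_validation=as_bool("AUDIT_ENABLE_VALIDATION", True),
    )
    if settings.tag_threshold > settings.auto_close_threshold:
        raise ValueError("AUDIT_TAG_THRESHOLD must not exceed AUDIT_AUTO_CLOSE_THRESHOLD")
    return settings
```

Unset variables fall back to the documented defaults, while set-but-malformed values fail loudly, which keeps the loader consistent with the Config Divergence checks above.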
