Overnight Repo Auditor

A production-grade autonomous codebase auditor designed for Anthropic's Managed Agents runtime (14.5-hour task horizon). You invoke this skill before leaving work. By morning, you have a comprehensive, severity-rated report covering security vulnerabilities, performance bottlenecks, accessibility violations, dependency risks, and code quality issues across your entire repository.

This skill is built for unattended, long-running execution. It does not ask questions. It does not pause for confirmation. It runs to completion autonomously, writing structured findings to disk as it goes, so that even if execution is interrupted, partial results are preserved.

Architecture

Overnight Repo Auditor (Commander)
 |
 |-- Phase 1: Reconnaissance (sequential, ~5-10 minutes)
 |    |  Scan repo structure, identify languages, frameworks, config files
 |    |  Build a complete inventory of what exists
 |    |  Determine which audit modules are relevant
 |
 |-- Phase 2: Parallel Audit Deployment (5 specialist agents, simultaneous)
 |    |
 |    |-- Agent 1: Security Auditor
 |    |    Vulnerabilities, secrets, injection points, auth flaws, OWASP Top 10
 |    |
 |    |-- Agent 2: Performance Auditor
 |    |    N+1 queries, memory leaks, bundle size, render blocking, algorithmic complexity
 |    |
 |    |-- Agent 3: Accessibility Auditor
 |    |    WCAG 2.1 AA/AAA, ARIA, keyboard navigation, color contrast, screen reader
 |    |
 |    |-- Agent 4: Dependency Auditor
 |    |    CVEs, outdated packages, license compliance, supply chain risk, unused deps
 |    |
 |    |-- Agent 5: Code Quality Auditor
 |    |    Dead code, duplication, complexity, naming, error handling, test coverage gaps
 |    |
 |-- Phase 3: Report Compilation (sequential, ~5-10 minutes)
 |    |  Merge all agent reports into a single overnight-audit-report.md
 |    |  Deduplicate cross-agent findings
 |    |  Assign final severity ratings
 |    |  Generate executive summary with top-10 priority items
 |    |  Write the report to the repository root

Runtime Context

This skill is designed for Anthropic's Managed Agents infrastructure, which provides:

- 14.5-hour maximum task duration -- enough to audit codebases of 500K+ lines
- Autonomous execution -- no user interaction required after launch
- Background agent spawning -- parallel sub-agents run concurrently
- Persistent file I/O -- agents write incremental results to disk throughout execution

The 14.5-hour window means this skill can be thorough in ways that interactive sessions cannot. It reads every file, checks every dependency, traces every import chain. It does not sample or skip. For very large repos (1M+ lines), individual audit agents may spawn their own sub-agents to parallelize within their domain.

Execution Protocol

Follow these steps exactly. Do not deviate. Do not ask the user for input at any point. If a decision is ambiguous, choose the more thorough option and document the choice in the report.

Phase 1: Reconnaissance

Before deploying any audit agents, build a complete picture of the repository. This phase runs sequentially in the Commander context.

Step 1.1: Repository Structure Scan

Scan the top-level directory structure and build a manifest.

Actions:
1. Run `ls -la` at the repo root to get top-level contents
2. Run `find . -type f | head -5000` to get a file listing (cap at 5000 for initial scan)
3. Run `find . -type f | wc -l` to get total file count
4. Run `find . -type d | wc -l` to get total directory count
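5. Run `find . -type f \( -name "*.js" -o -name "*.ts" -o -name "*.tsx" -o -name "*.jsx" -o -name "*.py" -o -name "*.go" -o -name "*.rs" -o -name "*.java" -o -name "*.rb" -o -name "*.php" -o -name "*.cs" -o -name "*.swift" -o -name "*.kt" -o -name "*.c" -o -name "*.cpp" -o -name "*.h" \) -print0 | xargs -0 cat 2>/dev/null | wc -l` to estimate total lines of code (`find -name` does not expand `{a,b}` brace lists, so each extension gets its own test)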
6. Use Glob to find all config files: package.json, Cargo.toml, go.mod, pyproject.toml, Gemfile, composer.json, pom.xml, build.gradle, Makefile, Dockerfile, docker-compose.yml, .github/workflows/*, tsconfig.json, webpack.config.*, vite.config.*, next.config.*, .eslintrc*, .prettierrc*, tailwind.config.*
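The counting steps above can also be sketched in Python. This is a minimal illustration, not part of the skill itself: `scan_repo`, `count_lines`, and the `skip_dirs` default are hypothetical names, and skipping `.git` is an assumption that the shell commands above do not make.

```python
import os
from collections import Counter
from pathlib import Path

# Extensions taken from step 5 of the action list.
SOURCE_EXTENSIONS = {
    ".js", ".ts", ".tsx", ".jsx", ".py", ".go", ".rs", ".java",
    ".rb", ".php", ".cs", ".swift", ".kt", ".c", ".cpp", ".h",
}


def count_lines(path: Path) -> int:
    """Count newline bytes in a file, like `wc -l`; unreadable files count as 0."""
    try:
        with path.open("rb") as handle:
            return sum(chunk.count(b"\n") for chunk in iter(lambda: handle.read(1 << 20), b""))
    except OSError:
        return 0


def scan_repo(root: str, skip_dirs: frozenset = frozenset({".git"})) -> dict:
    """Walk `root` once and collect file, directory, and line counts."""
    total_files = 0
    total_directories = 0
    lines_by_extension = Counter()
    for current, dirnames, filenames in os.walk(root):
        # Assumption: prune skipped directories in place so os.walk never enters them.
        dirnames[:] = [name for name in dirnames if name not in skip_dirs]
        total_directories += 1  # includes the root itself, like `find . -type d`
        for name in filenames:
            total_files += 1
            path = Path(current) / name
            extension = path.suffix.lower()
            if extension in SOURCE_EXTENSIONS:
                lines_by_extension[extension] += count_lines(path)
    return {
        "total_files": total_files,
        "total_directories": total_directories,
        "lines_by_extension": dict(lines_by_extension),
        "estimated_lines_of_code": sum(lines_by_extension.values()),
    }
```

A single walk avoids the repeated `find` invocations and produces the per-extension line counts that Step 1.2 ranks languages by.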

Step 1.2: Technology Stack Identification

From the config files and file extensions found, determine:

- Primary languages -- Ranked by line count (e.g., TypeScript 45K lines, Python 12K lines)
- Frameworks -- React, Next.js, Django, Rails, Express, FastAPI, Spring, etc.
- Package managers -- npm, yarn, pnpm, pip, cargo, go modules, bundler, composer
- Build tools -- webpack, vite, esbuild, turbopack, make, gradle, maven
- CI/CD -- GitHub Actions, GitLab CI, CircleCI, Jenkins, etc.
- Infrastructure -- Docker, Kubernetes, Terraform, CloudFormation, serverless configs
- Database -- Prisma schema, migrations folders, SQL files, ORM configs
- Testing -- Jest, Pytest, Go test, RSpec, PHPUnit, test directories

Record all findings in a structured inventory object that will be passed to each audit agent.
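One way to build such an inventory object is to match file names against a table of known markers. The sketch below is illustrative only: `MARKERS` and `detect_stack` are hypothetical names, the table covers just a few of the tools listed above, and a real inventory would also need to inspect file contents (for example, the dependencies declared in package.json) to identify frameworks reliably.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# (file name pattern, inventory category, label). A deliberately small, assumed subset.
MARKERS = [
    ("package-lock.json", "package_managers", "npm"),
    ("yarn.lock", "package_managers", "yarn"),
    ("pnpm-lock.yaml", "package_managers", "pnpm"),
    ("Cargo.toml", "package_managers", "cargo"),
    ("go.mod", "package_managers", "go modules"),
    ("Gemfile", "package_managers", "bundler"),
    ("composer.json", "package_managers", "composer"),
    ("webpack.config.*", "build_tools", "webpack"),
    ("vite.config.*", "build_tools", "vite"),
    ("Makefile", "build_tools", "make"),
    ("build.gradle", "build_tools", "gradle"),
    ("pom.xml", "build_tools", "maven"),
    ("next.config.*", "frameworks", "Next.js"),
    ("Dockerfile", "infrastructure", "Docker"),
]


def detect_stack(paths: list[str]) -> dict[str, list[str]]:
    """Map repo-relative paths to a {category: [labels]} inventory."""
    inventory: dict[str, set[str]] = {}
    for path in paths:
        name = PurePosixPath(path).name  # match on the file name, not the directory
        for pattern, category, label in MARKERS:
            if fnmatchcase(name, pattern):
                inventory.setdefault(category, set()).add(label)
    # Sorted lists keep the inventory stable when it is copied into agent prompts.
    return {category: sorted(labels) for category, labels in inventory.items()}
```

For example, `detect_stack(["package-lock.json", "web/vite.config.ts"])` yields one entry under `package_managers` and one under `build_tools`.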

Step 1.3: Audit Module Relevance Check

Not every audit applies to every repo. Determine which modules are relevant:

| Module | Required When |
| --- | --- |
| Security | Always |
| Performance | Always |
| Accessibility | Repo contains HTML, JSX, TSX, Vue, Svelte, or template files |
| Dependency | Repo has any package manager lockfile or dependency manifest |
| Code Quality | Always |

If accessibility is not relevant (e.g., a pure backend API or CLI tool), skip that agent and record it in the final report as "NOT APPLICABLE (no frontend components detected)".
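The relevance rules can be expressed as a small pure function. This is a sketch under assumptions: `plan_audit_modules` is a hypothetical name, the extension set covers only the file types named in the table (other template formats are left out), and the manifest list is an assumed, incomplete selection.

```python
from pathlib import PurePosixPath

# File types named in the relevance rules; other template extensions are omitted here.
FRONTEND_EXTENSIONS = {".html", ".jsx", ".tsx", ".vue", ".svelte"}

# Assumed, non-exhaustive set of dependency manifests and lockfiles.
DEPENDENCY_FILES = {
    "package.json", "package-lock.json", "yarn.lock", "pnpm-lock.yaml",
    "Cargo.toml", "Cargo.lock", "go.mod", "go.sum", "pyproject.toml",
    "requirements.txt", "Gemfile", "Gemfile.lock", "composer.json",
    "composer.lock", "pom.xml", "build.gradle",
}


def plan_audit_modules(paths: list[str]) -> dict[str, str]:
    """Return the audit plan: module name mapped to ACTIVE or NOT APPLICABLE (reason)."""
    parsed = [PurePosixPath(path) for path in paths]
    has_frontend = any(p.suffix.lower() in FRONTEND_EXTENSIONS for p in parsed)
    has_dependencies = any(p.name in DEPENDENCY_FILES for p in parsed)
    return {
        "Security": "ACTIVE",
        "Performance": "ACTIVE",
        "Accessibility": "ACTIVE" if has_frontend
        else "NOT APPLICABLE (no frontend components detected)",
        "Dependency": "ACTIVE" if has_dependencies
        else "NOT APPLICABLE (no dependency manifest or lockfile detected)",
        "Code Quality": "ACTIVE",
    }
```

Returning the status strings directly means the result can be written into the Audit Plan section of the reconnaissance report without further translation.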

Step 1.4: Write Reconnaissance Report

Write the file audit-workspace/00-reconnaissance.md (path relative to the repo root) with the following contents:

# Reconnaissance Report
Generated: {timestamp}

## Repository Overview
- Total files: {count}
- Total directories: {count}
- Estimated lines of code: {count}
- Primary languages: {list with line counts}
- Frameworks: {list}
- Package managers: {list}

## Technology Stack
{detailed breakdown}

## Audit Plan
- Security: ACTIVE
- Performance: ACTIVE
- Accessibility: ACTIVE / NOT APPLICABLE (reason)
- Dependency: ACTIVE / NOT APPLICABLE (reason)
- Code Quality: ACTIVE

## File Inventory
{top-level directory tree}

This file serves as the shared context document for all audit agents.
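Filling the template is plain string assembly. The sketch below assumes the caller has already gathered the values; `render_recon_report` and its argument names are hypothetical, and the dictionary keys are an assumed shape rather than anything the skill prescribes.

```python
from datetime import datetime, timezone


def render_recon_report(overview, stack_details, audit_plan, tree, generated=None):
    """Fill the reconnaissance template; every argument is caller-supplied data."""
    if generated is None:
        # Assumption: an ISO 8601 UTC timestamp is an acceptable {timestamp} value.
        generated = datetime.now(timezone.utc).isoformat(timespec="seconds")
    lines = [
        "# Reconnaissance Report",
        f"Generated: {generated}",
        "",
        "## Repository Overview",
        f"- Total files: {overview['total_files']}",
        f"- Total directories: {overview['total_directories']}",
        f"- Estimated lines of code: {overview['estimated_lines_of_code']}",
        f"- Primary languages: {', '.join(overview['primary_languages'])}",
        f"- Frameworks: {', '.join(overview['frameworks'])}",
        f"- Package managers: {', '.join(overview['package_managers'])}",
        "",
        "## Technology Stack",
        stack_details,
        "",
        "## Audit Plan",
    ]
    # One bullet per module, in the order the plan dictionary was built.
    lines += [f"- {module}: {status}" for module, status in audit_plan.items()]
    lines += ["", "## File Inventory", tree, ""]
    return "\n".join(lines)
```

The returned string can be written to audit-workspace/00-reconnaissance.md and reused verbatim when briefing each agent.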

Phase 2: Parallel Audit Deployment

Deploy all relevant audit agents simultaneously using the Agent tool. Every agent call MUST use run_in_background: true to enable parallel execution. Send ALL agent calls in a single message.

Each agent receives:

- The full reconnaissance report (copy the content inline -- agents do not share filesystem context automatically)
- Their specific audit instructions (below)
- The output file path where they must write their findings
- The severity rating rubric
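The four inputs listed above can be combined into a single prompt string per agent. The sketch below only assembles text; it does not call the Agent tool, whose invocation format is not shown here. `build_agent_brief` and the section headings it emits are hypothetical.

```python
def build_agent_brief(agent_name, recon_report, instructions, output_path, rubric):
    """Assemble one self-contained prompt for a background audit agent."""
    sections = [
        f"You are the {agent_name}.",
        # The report is pasted inline because agents do not share context automatically.
        "## Reconnaissance Report\n" + recon_report.strip(),
        "## Audit Instructions\n" + instructions.strip(),
        "## Severity Rating Rubric\n" + rubric.strip(),
        f"## Output\nWrite all findings to: {output_path}",
    ]
    return "\n\n".join(sections)
```

Building every brief before dispatching any agent makes it straightforward to send all agent calls in a single message, as the protocol requires.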

Severity Rating Rubric (shared across all agents)

Every finding must be rated using this rubric. Agents must use these exact labels.

CRITICAL  -- Exploitable security vulnerability, data loss risk, production crash risk,
             or compliance violation that could result in legal/financial consequences.
             Requires immediate remediation before next deploy.
             Examples: SQL injection, exposed secrets, missing auth on sensitive endpoints,
             unpatched CVE with known exploit, GDPR violation.

HIGH      -- Significant issue that degrades security, performance, or user experience
             materially, but is not immediately exploitable or catastrophic.
             Should be addressed within the current sprint.
             Examples: Missing rate limiting, N+1 queries on high-traffic endpoints,
             missing alt text on primary content images, outdated dependency with
             high-severity CVE (no known exploit), functions over 200 lines.

MEDIUM    -- Issue that represents technical debt or best-practice violation.
             Does not cause immediate harm but will compound over time.
             Should be addressed within the current quarter.
             Examples: Missing error boundaries, console.log in production code,
             decorative images not hidden from assistive technology, dependencies
             one minor version behind, moderate cyclomatic complexity.

LOW       -- Minor improvement opportunity. Code smell, style inconsistency,
             or optimization that would improve maintainability.
             Address when touching the relevant code.
             Examples: Unused imports, inconsistent naming conventions, missing
             JSDoc on internal utilities, dependencies that could be lighter alternatives.
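Because agents must use these exact labels, the compiler can validate and order findings mechanically. A minimal sketch, assuming labels arrive as plain strings; `Severity` and `parse_severity` are hypothetical names.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Rubric labels, ordered so that lower values sort first (most urgent)."""
    CRITICAL = 0
    HIGH = 1
    MEDIUM = 2
    LOW = 3


def parse_severity(label: str) -> Severity:
    """Accept only the exact rubric labels; anything else is rejected."""
    try:
        return Severity[label]
    except KeyError:
        allowed = ", ".join(member.name for member in Severity)
        raise ValueError(f"unknown severity {label!r}; expected one of: {allowed}") from None


# Sorting labels by severity puts CRITICAL findings at the top of a report.
ordered = sorted(["LOW", "CRITICAL", "MEDIUM", "HIGH"], key=parse_severity)
```

Rejecting near misses such as "Critical" or "SEVERE" early keeps the merged report consistent across agents.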

Structured Finding Format (shared across all agents)

Every individual finding must follow this format:

### [SEVERITY] Finding Title
- **File**: path/to/file.ts (lines 45-67)
- **Category**: {agent-specific category, e.g., "Injection", "Memory Leak", "Missing Alt Text"}
- **Description**: Clear explanation of what the issue is and why it matters.
- **Evidence**: The specific code, configuration, or pattern that constitutes the issue. Include relevant code snippets (keep under 10 lines; reference line numbers for longer blocks).
- **Impact**: What happens if this is not fixed. Be specific -- "could allow unauthorized access to user PII" not "security risk."
- **Recommendation**: Specific, actionable fix. Include code examples when helpful.
- **References**: Links to relevant documentation, CVE numbers, WCAG criteria, etc.
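The finding format maps naturally onto a small record type that renders itself and is appended to disk as soon as it is produced, so partial results survive an interrupted run. This is a sketch under assumptions: `Finding` and `append_finding` are hypothetical names, and forcing each write to disk with `os.fsync` is one possible durability choice, not something the skill mandates.

```python
import os
from dataclasses import dataclass, field

ALLOWED_SEVERITIES = ("CRITICAL", "HIGH", "MEDIUM", "LOW")


@dataclass
class Finding:
    severity: str
    title: str
    file: str  # e.g. "path/to/file.ts (lines 45-67)"
    category: str
    description: str
    evidence: str
    impact: str
    recommendation: str
    references: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.severity not in ALLOWED_SEVERITIES:
            raise ValueError(f"severity must be one of {ALLOWED_SEVERITIES}")

    def to_markdown(self) -> str:
        """Render the finding in the shared structured format."""
        references = ", ".join(self.references) if self.references else "None"
        return "\n".join([
            f"### [{self.severity}] {self.title}",
            f"- **File**: {self.file}",
            f"- **Category**: {self.category}",
            f"- **Description**: {self.description}",
            f"- **Evidence**: {self.evidence}",
            f"- **Impact**: {self.impact}",
            f"- **Recommendation**: {self.recommendation}",
            f"- **References**: {references}",
        ]) + "\n"


def append_finding(report_path: str, finding: Finding) -> None:
    """Append one finding and flush it to disk before returning."""
    with open(report_path, "a", encoding="utf-8") as handle:
        handle.write(finding.to_markdown() + "\n")
        handle.flush()
        os.fsync(handle.fileno())
```

Appending one finding at a time trades a little I/O overhead for the guarantee that everything found before an interruption is already in the agent's output file.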