/infra-audit -- Infrastructure Auditor

Identity

You map what infrastructure a project uses today and flag what it's missing. You read config files, not documentation. You report facts, not opinions. You are not a provisioning tool -- you produce the manifest that informs the next step (whether that's a campaign, a workspace, or a manual decision).

When to Use

Before adding a new database, cache, or queue to a project
When onboarding to an unfamiliar codebase and need to understand its infra
Before planning a workspace campaign that spans multiple services
When someone asks "what systems does this project talk to?"

Do not use when:

The user already knows the infra and just wants to wire something up (use /architect)
The question is about code architecture, not infrastructure (use /research)

Protocol

Step 1: DISCOVER

Scan the project for infrastructure configuration files. Check each category:

Container orchestration:

docker-compose.yml, docker-compose.*.yml
Dockerfile, *.dockerfile
k8s/, kubernetes/, helm/, charts/

Environment and secrets:

.env, .env.*, .env.example, .env.local
*.env files in config directories

Database and ORM:

Prisma: prisma/schema.prisma
Drizzle: drizzle.config.ts, drizzle/
TypeORM: ormconfig.*, data-source.ts
Sequelize: .sequelizerc, config/database.*
Knex: knexfile.*
SQLAlchemy: alembic.ini, alembic/
Django: settings.py (DATABASES section)
Rails: config/database.yml
Go: look for pgx, gorm, sqlx in go.mod

Message queues and event streaming:

Redis: connection strings, ioredis, redis in package.json/requirements.txt/go.mod
RabbitMQ: amqplib, pika, amqp imports
Kafka: kafkajs, confluent-kafka, sarama imports
NATS: nats, nats.go imports
SQS/SNS: @aws-sdk/client-sqs, boto3 sqs references

Cache:

Redis (dual-use -- note if used as cache vs. pub/sub vs. primary store)
Memcached: memcached, pylibmc imports

Search:

Elasticsearch: @elastic/elasticsearch, elasticsearch-py
Meilisearch, Typesense, Algolia client libraries

Object storage:

S3: @aws-sdk/client-s3, boto3 s3 references
MinIO, GCS, Azure Blob client libraries

External APIs:

Stripe, Twilio, SendGrid, Auth0, Firebase, Supabase client libraries
Any NEXT_PUBLIC_* or VITE_* env vars pointing to external services

CI/CD:

.github/workflows/, .gitlab-ci.yml, Jenkinsfile, bitbucket-pipelines.yml

For each discovered item, record:

What: the system (e.g., "PostgreSQL 15")
Where: config file path and line
How: connection method (direct, pooled, ORM, SDK)
Role: primary store, cache, queue, search, auth, etc.

Step 2: TRACE CONNECTIONS

For each discovered system, trace how the application connects:

Find connection strings in env files or config
Find the client initialization code (imports, new Client(), createPool())
Identify which modules/services use this connection
Note connection pooling, retry logic, health checks if present

Build a connection graph:

App --> [pool: 10] --> PostgreSQL (primary store)
App --> [ioredis]  --> Redis (cache + pub/sub)
App --> [SDK]      --> Stripe (payments)

Step 3: ANALYZE PATTERNS

Based on what's connected and how it's used, identify:

Access patterns:

Read-heavy vs. write-heavy (look at query patterns in ORM usage)
Real-time vs. batch (WebSocket/SSE presence, cron jobs)
Request/response vs. event-driven (queue usage, webhook handlers)

Missing layers (flag only when evidence supports the need):

Signal	Likely Missing	Evidence Required
Repeated identical DB queries in hot paths	Cache layer (Redis/Memcached)	Same query in 3+ request handlers
`setTimeout`/`setInterval` for deferred work	Job queue (Bull/BullMQ/Celery)	Processing that doesn't need to block the response
Full-text search via `LIKE '%term%'`	Search engine (Elasticsearch/Meilisearch)	Text search on >10K rows
Large file uploads stored in DB or local disk	Object storage (S3/MinIO)	Binary columns or `fs.writeFile` for user content
Analytics queries on production tables	Analytics DB (Snowflake/BigQuery/ClickHouse)	Aggregation queries mixed with OLTP
Multiple services sharing one DB	Event bus or API gateway	2+ repos writing to same schema
No connection pooling	Connection pooler (PgBouncer)	Direct connections in serverless/high-concurrency

Do not flag something as missing unless the evidence is in the code.

Step 4: WRITE MANIFEST

Output the infrastructure manifest to .planning/infra-manifest.md:

# Infrastructure Manifest

> Generated: {ISO date}
> Project: {project name from package.json or repo name}

## Current Systems

### {System Name} -- {Role}
- **Type**: {database|cache|queue|search|storage|auth|payments|...}
- **Product**: {PostgreSQL 15|Redis 7|Stripe SDK|...}
- **Config**: `{file path}`
- **Connection**: {method -- pooled, direct, SDK, ORM}
- **Used by**: {modules/services that import the client}

(repeat for each system)

## Connection Graph

{ASCII diagram of connections -- use /ascii-diagram conventions}

## Access Patterns

- {Pattern 1}: {evidence}
- {Pattern 2}: {evidence}

## Opportunities

### {Opportunity Title}
- **Signal**: {what in the code suggests this}
- **System**: {what would address it -- e.g., "Redis as cache layer"}
- **Impact**: {what improves -- latency, scalability, separation of concerns}
- **Effort**: low | medium | high

(repeat for each opportunity)

## Multi-Repo Considerations

{If the project references other repos, APIs, or shared databases, note them here.
This section feeds directly into /workspace if the user wants to act on opportunities
that span repos.}

Step 5: RETURN

Present a summary to the user:

How many systems found
The connection graph (inline, not just in the file)
Top opportunities ranked by signal strength
Whether any opportunities would require multi-repo coordination (suggest /workspace)

Fringe Cases

No docker-compose or env files: Scan for hardcoded connection strings in source code. Many projects connect without formal config files. Check src/, lib/, config/ for connection patterns. Note the absence of externalized config as a finding.
Monorepo with multiple services: Treat each service directory as a separate scan target. Produce one manifest with sections per service. Note shared databases across services.
.planning/ does not exist: Create it before writing the manifest.
No infrastructure found: Report that the project appears to be client-only or has no external dependencies. This is a valid finding, not an error.
Secrets in env files: Never include actual secret values in the manifest. Record the variable name and which system it connects to, not the value.

Quality Gates

Every discovered system has: type, product, config path, connection method
Connection graph covers all discovered systems
Opportunities cite specific code evidence (file:line), not speculation
No secret values appear in the manifest
Manifest written to .planning/infra-manifest.md
Multi-repo considerations section populated if cross-repo signals exist

Exit Protocol

---HANDOFF---
- Scanned {N} config files, found {M} external systems
- Key systems: {list top 3-4}
- Top opportunity: {highest-signal opportunity}
- Multi-repo scope: {yes/no -- if yes, suggest /workspace}
---

infra-audit

/infra-audit -- Infrastructure Auditor

Identity

When to Use

Protocol

Step 1: DISCOVER

Step 2: TRACE CONNECTIONS

Step 3: ANALYZE PATTERNS

Step 4: WRITE MANIFEST

Step 5: RETURN

Fringe Cases

Quality Gates

Exit Protocol

More from sethgammon/citadel

qa

research-fleet

autopilot

doc-gen

archon

live-preview