Architecture Exploration

Reason deeply about a system, generate real architectural options, and help the user choose the best direction before any migration begins.

This skill is intentionally pre-commit. Its job is to help pick the right architecture, not to execute the migration. Once a direction is chosen, hand the work off to audit-and-migrate.

Why This Exists

Large engineering efforts fail long before code is written:

Premature commitment — a promising idea becomes the default architecture before alternatives are seriously examined.
Local reasoning — the agent optimizes one subsystem without understanding system-wide constraints, ownership, and failure modes.
Strawman comparison — one favored option is compared against weak alternatives, creating false confidence.
Migration contamination — the discussion quietly shifts from "what should we build?" to "how should we implement it?" before the target architecture is actually chosen.

This skill exists to force disciplined exploration before implementation.

Hard Boundary

This skill does not:

create migration slices
create ratchet budgets for execution
start refactoring
write scaffolding or implementation code
turn the leading option into the assumed winner before comparison is complete

This skill does:

map the current system
define the decision frame
generate multiple serious options
compare them against the same constraints
stress-test them
recommend a direction
produce a handoff package for audit-and-migrate

Perform all of this work directly. Do not rely on other skills to do the shaping, stress-testing, or architecture comparison for you.

If the user has already chosen a direction and wants to land it safely, stop using this skill and use audit-and-migrate.

Core Principle

First excavate reality, then compare options, then recommend.

Do not start with architecture taste. Start with the actual system, the actual constraints, and the actual problem.

Evidence Discipline

Before you recommend an architecture, you must earn the recommendation.

For the relevant system, actively inspect:

current code paths
ownership boundaries
state and data flow
external contracts and integrations
docs and operational surfaces
obvious history signals when needed (recent churn, known incidents, partial prior refactors)

Do not recommend major boundary changes based only on the user's summary if the local codebase can answer the question.

When a claim is uncertain, mark it as uncertain. A high-integrity architectural recommendation is one that is well-calibrated, not one that sounds maximally confident.

Workflow

Phase 1: Define the Decision Frame

Before exploring options, pin down the decision you are actually making.

Capture:

Goal — what outcome the user wants
Problem — what pain or failure mode motivates the change
Invariants — what must remain true
Non-goals — what should stay out of scope
Constraints — team, tooling, performance, compliance, time, compatibility, ops, or organizational limits
External surfaces — APIs, env vars, queues, dashboards, CLI entrypoints, jobs, data contracts, webhooks, partner integrations
Decision horizon — are we optimizing for the next 3 months, 1 year, or 3 years?

If the user already has a candidate solution in mind, treat it as Option A, not as the conclusion.

Phase 2: Map the Current System

Understand the current system before proposing alternatives.

For the affected system or systems, map:

Primary workflows
Current module boundaries
Data flow
Ownership of logic and state
External dependencies
Operational surfaces
Known pain points
Hotspots — fragile areas, high-churn files, flaky tests, unclear ownership, duplicated logic, stale docs

Produce a concise current-state map:

## Current System
| Area | Current Owner | Inputs | Outputs | Dependencies | Pain |
|------|---------------|--------|---------|--------------|------|
| ...  | ...           | ...    | ...     | ...          | ...  |

Do not skip this because the user "already knows the system." A rigorous option comparison depends on a shared map of reality.

Phase 3: Generate the Option Set

Generate 2-4 serious architectural options.

Do not generate fake alternatives. Every option must be plausible enough that a strong team could reasonably choose it.

When feasible, include:

Option 1: Evolutionary path — improves the current system with lower disruption
Option 2: Simplifying path — removes concepts and indirection aggressively
Option 3: Structural path — introduces stronger boundaries or a new architecture shape
Option 4: Do less — if the problem may be solvable with narrower change than expected

Each option must describe:

architecture shape
boundary changes
ownership model
data flow
operational model
what stays
what changes
what gets simpler
what gets harder

Phase 4: Analyze Each Option

For every option, run the same analysis.

A. Fit

How well does the option satisfy:

the goal
invariants
constraints
external surface obligations

B. Assumptions

List the must-be-true assumptions:

## Must-Be-True Assumptions
| Assumption | Why It Matters | How to Verify | Fastest Disproof |
|------------|----------------|---------------|------------------|
| ...        | ...            | ...           | ...              |

C. Failure Modes

Imagine the option failed one year later. Work backward.

## Pre-Mortem
| Failure Mode | Warning Signal | Prevention |
|--------------|----------------|------------|
| ...          | ...            | ...        |

D. Tradeoffs

Evaluate each option on:

Concept count — how many new ideas someone must carry
Boundary clarity — is ownership sharper or blurrier?
Migration difficulty — how hard this will be to land later
Cleanup burden — how likely it is to leave vestigial code, docs, configs, or adapters
Rollback story — how hard it is to back out
Operability — monitoring, debugging, failure handling, support burden
Testability — can the system be verified deterministically?
Extensibility — what future changes become easier?
Lock-in — what new constraints does this create?

Use relative ratings with justification. Avoid fake precision.

## Tradeoff Matrix
| Dimension | Option A | Option B | Option C |
|-----------|----------|----------|----------|
| Simplicity | Medium — ... | High — ... | Low — ... |
| Migration Difficulty | ... | ... | ... |
| Cleanup Burden | ... | ... | ... |
| Operability | ... | ... | ... |
| Testability | ... | ... | ... |
| Long-Term Flexibility | ... | ... | ... |

E. Disqualifiers

For each option, state what would make it the wrong choice.

Examples:

requires compatibility the project does not need
increases operational burden beyond the team's capacity
creates too much cleanup debt during migration
depends on an external contract the team does not control
adds concepts without enough payoff

F. Unknowns

What remains uncertain? Distinguish:

architectural unknowns — boundary or ownership uncertainty
runtime unknowns — load, latency, concurrency, failure behavior
organizational unknowns — team readiness, external consumers, ops constraints

Phase 5: Recommend

Do not stop at listing options. Make a recommendation.

The recommendation must include:

Recommended option
Why it wins
Why the runner-up loses
Why the other options were rejected
What could change the recommendation
What must be validated before committing

If the recommendation is genuinely unclear, say so and explain exactly which uncertainty blocks a good decision.

Phase 6: Define Validation Spikes

Before committing, propose the cheapest high-value spikes to kill uncertainty.

Good spikes are:

narrow
fast
evidence-producing
architecture-relevant

Examples:

compile-time module skeleton
thin path through one critical workflow
event/queue boundary proof
fake adapter proving data contract shape
load experiment for one suspected bottleneck

For each spike:

## Validation Spikes
| Spike | Question Answered | Cost | Success Signal | Failure Signal |
|-------|-------------------|------|----------------|----------------|
| ...   | ...               | ...  | ...            | ...            |

Phase 7: Prepare the Handoff to `audit-and-migrate`

Once the user chooses a direction, hand off the decision package cleanly.

The handoff section must include:

Chosen architecture
Decision rationale
Invariants
Non-goals
Critical workflows
External surfaces
Known hotspots
Leading migration risks
Expected deletion zones — code, docs, scripts, config, env, adapters likely to become vestigial
Validation spikes already run
What still needs proof

This section should make it easy to start audit-and-migrate without reopening the architecture debate.

Output Contract

For non-trivial explorations, produce a decision-grade artifact at .claude/architecture/ARCHITECTURE_OPTIONS.md or the project’s equivalent docs location.

Use this structure:

# Architecture Exploration: [System Name]

## Goal

## Problem

## Invariants

## Non-Goals

## Constraints

## External Surfaces

## Current System
| Area | Current Owner | Inputs | Outputs | Dependencies | Pain |

## Option 1: [Name]
### Architecture Shape
### Why It Might Work
### Tradeoffs
### Failure Modes
### Disqualifiers
### Cleanup / Migration Implications
### Unknowns

## Option 2: [Name]
...

## Tradeoff Matrix
| Dimension | Option 1 | Option 2 | Option 3 |

## Assumptions
| Assumption | Why It Matters | How to Verify | Fastest Disproof |

## Risk Register
| Risk | Option(s) Affected | Likelihood | Impact | Mitigation |

## Validation Spikes
| Spike | Question Answered | Cost | Success Signal | Failure Signal |

## Recommendation

## Runner-Up

## Why The Other Options Lose

## Decision Needed

## Handoff to audit-and-migrate

For smaller requests, present the same structure inline in the conversation.

Reasoning Standards

Avoid False Precision

Do not invent numerical scoring that implies rigor you do not actually have. Relative ratings with explanations are better than fake weighted totals.

Prefer Evidence to Taste

Architecture preferences are not evidence. Ground recommendations in:

codebase structure
observed constraints
operational surfaces
migration burden
cleanup burden
failure modes

Always Include a Simpler Option

If every option increases conceptual complexity, you probably missed a better path.

Do Not Default to Rewrite

A full rewrite or major structural reset must earn its place. It is rarely the default winner.

Make the Tradeoffs Symmetrical

Every option gets the same scrutiny. Do not stress-test the user's preferred option less just because it sounds elegant.

When to Stop

Exploration is complete when:

the current system is mapped clearly enough to compare alternatives
the option set includes at least two serious choices
each option has explicit tradeoffs and failure modes
the recommendation is grounded in evidence
the next step is clear: either run a validation spike or start audit-and-migrate

If you reach that point, stop exploring and recommend the decision.

Relationship to `audit-and-migrate`

Use this skill to answer:

"What architecture should we choose?"
"What are our options?"
"What is the best design tradeoff here?"

Use audit-and-migrate after the answer becomes:

"We are committing to Option B"
"Now plan the migration"
"Now land this safely"

architecture-exploration