# architecture-audit
Surface architectural friction in a codebase and propose deepening opportunities — refactors that turn shallow modules into deep ones by relocating their seams. The aim is testability, locality, and AI/human navigability.
## When to use
- The user asked to audit a codebase, find refactor opportunities, consolidate tightly-coupled modules, or identify shallow abstractions.
- A codebase has grown organically and now feels harder to navigate or change than it should.
- Pure functions were extracted for testability, but bugs still hide in how they're called — the unit tests pass and the real behaviour breaks.
- A test suite has 80% coverage but tells you nothing about whether the real flow works.
- The user wants to know where modules leak across seams that aren't real seams.
Skip this skill for: pure code review of a single PR (use `qa` or a code-review pattern), greenfield architecture design (use `deep-discovery` or `information-architecture`), or formatting/linting (use `pre-commit-setup`).
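The pure-function failure mode above can be sketched in a few lines. All names and pricing rules here are hypothetical; the point is only that the unit-tested pieces are correct in isolation while their composition is not:

```python
def apply_discount(total: float, rate: float) -> float:
    """Pure, unit-tested, correct in isolation."""
    return total * (1 - rate)

def add_tax(total: float, tax_rate: float) -> float:
    """Pure, unit-tested, correct in isolation."""
    return total * (1 + tax_rate)

def checkout_total(subtotal: float, discount: float, tax_rate: float) -> float:
    # The bug lives in the composition: tax is computed on the pre-discount
    # subtotal. The correct wiring would be
    #   add_tax(apply_discount(subtotal, discount), tax_rate)
    return apply_discount(subtotal, discount) + subtotal * tax_rate

# Unit tests on the extracted pure functions pass:
assert apply_discount(100.0, 0.1) == 90.0
assert add_tax(100.0, 0.2) == 120.0

# The real flow is still wrong: 100 with 10% off, then 20% tax, should be 108.
assert checkout_total(100.0, 0.1, 0.2) == 110.0  # not 108.0
```

The unit tests on `apply_discount` and `add_tax` never see the wrong tax base; only a test at the caller's interface would catch it.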
## The vocabulary is the point
Use the terms from `references/language.md` exactly. Do not substitute "component", "service", "API", "boundary". Consistent vocabulary is what makes audits comparable across modules and across sessions.
The eight core terms: Module, Interface, Implementation, Depth, Seam, Adapter, Leverage, Locality. The three core principles: the deletion test, the interface-as-test-surface, and the one-adapter rule (one adapter = hypothetical seam, two = real).
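A rough illustration of the deletion test, with invented module contents:

```python
import json

# Hypothetical pass-through module: it fails the deletion test. Delete it,
# call json.loads directly, and no complexity reappears anywhere.
def parse_payload(raw: str) -> dict:
    return json.loads(raw)

# Hypothetical deep module: it passes the deletion test. Delete it and the
# defaulting, validation, and normalisation reappear at every call site.
def load_order(raw: str) -> dict:
    order = json.loads(raw)
    order.setdefault("currency", "EUR")       # defaulting
    if order.get("quantity", 0) <= 0:         # validation
        raise ValueError("quantity must be positive")
    order["sku"] = str(order["sku"]).upper()  # normalisation
    return order

assert load_order('{"sku": "ab-1", "quantity": 2}') == {
    "sku": "AB-1", "quantity": 2, "currency": "EUR"
}
```

Deleting `parse_payload` costs nothing; deleting `load_order` would scatter three responsibilities across every caller.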
## Workflow
1. Read the optional context. If the project has a domain glossary (commonly `CONTEXT.md`, `GLOSSARY.md`, or similar) and/or architectural decision records (`docs/adr/`, `architecture/decisions/`), read them first. These name the good seams and record decisions that the audit should not re-litigate. Load `references/optional-context.md` for the heuristic. If no such files exist, continue without them — do not stop the audit.
2. Explore the codebase. Walk it with whatever exploration mechanism the harness offers (sub-agent / search tools / direct reads). Don't follow a rigid checklist — feel for friction. Load `references/exploration.md` for the friction signals to look for.
3. Apply the deletion test. For every candidate shallow module, imagine deleting it. If complexity vanishes, it was a pass-through. If complexity reappears across N callers, it was earning its keep. Load `references/language.md` for the full principle.
4. Present a numbered list of candidates. Each candidate ships with: which files / modules are involved, what friction the current architecture is causing, a plain-English description of what would change, and the benefits in terms of locality, leverage, and how tests would improve. Use the project's domain vocabulary for the what and `language.md` vocabulary for the architecture shape. See `assets/templates/deepening-candidate.md` for the format. Do NOT propose interfaces yet. Ask: "Which of these would you like to explore?"
5. Walk the design tree on the picked candidate. Drop into an interactive grilling conversation. Constraints, dependencies, the shape of the deepened module, what sits behind the seam, what tests survive. (The `plan-interview` skill is the natural complement here.) Load `references/deepening.md` for dependency categories and testing strategy.
6. Optionally: design the interface twice (or three or four times). When the user wants to compare alternatives for the deepened module, fan out to parallel sub-agents — each producing a radically different interface. Load `references/interface-design.md`.
7. Update the project's documentation as decisions crystallize. If the deepened module is named after a concept missing from the domain glossary, add it. If the user rejects a candidate with a load-bearing reason, offer to record an ADR so future audits don't re-suggest it. See `references/optional-context.md`.
## Available resources
- `references/language.md` — the eight-term vocabulary (Module, Interface, Implementation, Depth, Seam, Adapter, Leverage, Locality), the three core principles (deletion test, interface-as-test-surface, one-adapter rule), relationships between the terms, and the framings this skill rejects.
- `references/exploration.md` — friction signals to look for when walking a codebase: shallow modules, scattered concepts, leaks across seams, untestable interfaces, pure functions extracted-for-testability with no locality.
- `references/deepening.md` — four dependency categories (in-process, local-substitutable, remote-but-owned, true-external), how each gets tested across the seam, seam discipline, and the "replace, don't layer" testing strategy.
- `references/interface-design.md` — the parallel sub-agent fanout pattern for "Design It Twice" interface exploration: how to frame the problem space, how to brief each sub-agent, how to compare and synthesize.
- `references/optional-context.md` — how to use existing domain glossaries (`CONTEXT.md`, etc.) and ADRs if they exist, when to add new terms during a grilling session, when to offer to record an ADR after a rejection.
- `assets/templates/deepening-candidate.md` — copy-paste format for each candidate in the numbered list (Files / Problem / Solution / Benefits / Tests).
- `assets/examples/full-example.md` — full worked audit on a sample codebase: explore → present candidates → grilling → interface fanout → resolved refactor.
## Top gotchas (always inline — do not skip)
- Vocabulary discipline matters. Don't drift into "component", "service", "API", "boundary" — those are vague-by-default and invite re-litigation. Use Module / Interface / Seam / Adapter consistently.
- The deletion test is the load-bearing check. "This module looks shallow" is a hunch. "If I deleted this module, complexity would reappear at the call sites" is the test. Run the test before proposing anything.
- Interface is more than the type signature. It includes invariants, ordering constraints, error modes, required configuration, performance characteristics. A "small" interface that hides a 12-step ordering requirement is not actually small.
- One adapter = hypothetical seam. If your proposal introduces a port and only one thing implements it, you've added indirection, not a seam. Real seams have at least two adapters (typically production + test).
- Don't expose internal seams through the external interface. A deep module can have internal seams (private to its implementation, used by its own tests). Tests are not callers — internal-seam-for-tests is fine, internal-seam-leaking-into-the-public-interface is not.
- The interface is the test surface. If you want to test past the interface, the module is probably the wrong shape. Don't propose deepening that requires reaching past the seam to verify behaviour.
- Replace tests, don't layer them. When deepening, the old unit tests on the shallow modules become waste. Delete them. Write new tests at the deepened interface. Otherwise you have layers of redundant tests that all change when implementation changes.
- Don't propose interfaces in step 4. The candidate list is a menu, not a design. Interface design happens after the user picks. Proposing interfaces too early collapses the menu into one branch.
- Domain vocabulary for the what; architecture vocabulary for the shape. "The Order intake module" — not "FooBarHandler" (no domain), and not "the Order service" (the term `service` is on the avoid list). The deliverable should sound like both the project's glossary and this skill's vocabulary.
- ADR conflicts: only surface when the friction is real. If a candidate contradicts an existing ADR, only mention it when the current pain is real enough to warrant reopening the decision. Don't enumerate every theoretical refactor an ADR forbids.
- Pure-function extraction is often an anti-pattern in disguise. "We extracted the calculation into a pure function so it's testable" sounds good, but if the bugs live in how callers compose the calculations, the unit tests on the pure function don't catch them. Look for this pattern explicitly during exploration.
- Don't re-suggest something the user already rejected. If a previous session rejected a candidate, the rejection should be in an ADR. Read those before exploring.
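A minimal Python sketch of the one-adapter rule, using invented names (`OrderStore`, `SqlOrderStore`, `InMemoryOrderStore`); the SQL adapter's body is illustrative and never executed here. The port only counts as a real seam once a second adapter implements it:

```python
from typing import Protocol

class OrderStore(Protocol):
    """The seam: the interface the deep module owns."""
    def save(self, order_id: str, payload: dict) -> None: ...
    def load(self, order_id: str) -> dict: ...

class SqlOrderStore:
    """Adapter 1: production (sketched, not called in this example)."""
    def __init__(self, conn):
        self.conn = conn
    def save(self, order_id: str, payload: dict) -> None:
        self.conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, payload))
    def load(self, order_id: str) -> dict:
        return self.conn.query_one("SELECT * FROM orders WHERE id = ?", (order_id,))

class InMemoryOrderStore:
    """Adapter 2: the test double. Its existence is what makes the seam real."""
    def __init__(self):
        self._rows: dict[str, dict] = {}
    def save(self, order_id: str, payload: dict) -> None:
        self._rows[order_id] = payload
    def load(self, order_id: str) -> dict:
        return self._rows[order_id]

def archive_order(store: OrderStore, order_id: str) -> None:
    """Module logic, written against the seam and testable through it."""
    payload = store.load(order_id)
    store.save(order_id, {**payload, "archived": True})

store = InMemoryOrderStore()
store.save("o1", {"sku": "AB-1"})
archive_order(store, "o1")
assert store.load("o1") == {"sku": "AB-1", "archived": True}
```

With only `SqlOrderStore`, the `Protocol` would be pure indirection; the in-memory adapter is what earns it the name seam.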
## What you DO
- Read the project's domain glossary and ADRs first if they exist; carry that vocabulary into the audit.
- Explore organically — feel for friction, don't march a checklist.
- Apply the deletion test to every shallow-looking module before proposing.
- Use `language.md` vocabulary exactly. Use the project's domain vocabulary for what the module is.
- Present a numbered list of candidates with Files / Problem / Solution / Benefits / Tests.
- Wait for the user to pick before proposing any interface.
- Walk the picked candidate's design tree interactively, one decision at a time.
- Use parallel sub-agent fanout for interface design when comparison would help.
- Recommend "replace, don't layer" testing — delete old shallow-module tests when the deepened tests exist.
- Update the project's domain glossary inline when new concepts emerge during grilling.
- Offer to record an ADR when the user rejects a candidate with a load-bearing reason.
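"Replace, don't layer" can be sketched as follows. The `price_order` interface and its pricing rules are invented for the example; the point is that the old shallow-helper tests get deleted and the new tests exercise only the deepened interface:

```python
def price_order(items: list[dict], discount_rate: float, tax_rate: float) -> float:
    """The deepened interface: the only surface the new tests touch."""
    subtotal = sum(i["unit_price"] * i["qty"] for i in items)
    discounted = subtotal * (1 - discount_rate)
    return round(discounted * (1 + tax_rate), 2)

# New tests live at the interface, so they survive implementation changes.
def test_discount_applied_before_tax():
    items = [{"unit_price": 50.0, "qty": 2}]
    assert price_order(items, discount_rate=0.1, tax_rate=0.2) == 108.0

def test_empty_order_is_free():
    assert price_order([], discount_rate=0.1, tax_rate=0.2) == 0.0

test_discount_applied_before_tax()
test_empty_order_is_free()
```

Any old unit tests on the internal helpers (subtotalling, discounting, tax) would change whenever the implementation does; these two do not.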
## What you do NOT do
- Do not use "component", "service", "API", "boundary" when you mean Module, Interface, Seam.
- Do not propose interfaces in the candidate list — wait for the user to pick.
- Do not propose a port-and-adapter when only one adapter exists.
- Do not propose tests that reach past the interface.
- Do not layer new tests on top of old shallow-module tests — replace them.
- Do not list every refactor a passing ADR would forbid; only surface when the pain warrants reopening.
- Do not march the audit through the codebase in path order; sequence by friction signal.
- Do not invent depth-as-line-count metrics ("long implementation behind short signature" is the popular but wrong framing — use depth-as-leverage instead).
- Do not skip the deletion test because the module "looks" shallow. Run it.
- Do not write code during the audit — the deliverable is a refactor proposal, not a partial implementation.
- Do not invoke this skill for greenfield work — it audits existing code; for new work use `deep-discovery` or `information-architecture`.