refactor
<tool_restrictions>
MANDATORY Tool Restrictions
BANNED TOOLS — calling these is a skill violation:
- EnterPlanMode — BANNED. Do NOT call this tool. This skill has its own structured process.
- ExitPlanMode — BANNED. You are never in plan mode.
</tool_restrictions>
<arc_runtime>
This workflow requires the full Arc bundle, not a prompts-only install.
Resolve the Arc install root from this skill's location and refer to it as ${ARC_ROOT}.
Use ${ARC_ROOT}/... for Arc-owned files.
Use project-local paths such as .ruler/ or rules/ for the user's repository.
</arc_runtime>
<required_reading> Before starting, read these references:
- ${ARC_ROOT}/references/architecture-patterns.md — import depth rules, boundary violations
- ${ARC_ROOT}/references/component-design.md — compound vs simple component patterns
</required_reading>
Architectural Refactoring
Discover structural friction, propose deep-module refactors, and create RFC issues.
Core Concept: Deep Modules
From John Ousterhout's A Philosophy of Software Design:
A deep module has a small interface hiding a large implementation. Deep modules are:
- More testable (test at the boundary, not inside)
- More navigable (fewer files to understand a concept)
- More maintainable (changes stay internal)
A shallow module has an interface nearly as complex as its implementation. Shallow modules:
- Force callers to understand implementation details
- Create coupling between files that should be independent
- Make testing harder (you test internals, not behaviour)
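A minimal TypeScript sketch of the contrast, with every name invented for illustration: the shallow version makes callers drive three steps themselves, while the deep version exposes one entry point and keeps those steps internal.

```typescript
type Totals = { subtotal: number; count: number };

// Shallow: the interface mirrors the implementation. Every caller must know the
// three steps, call them in the right order, and carry the intermediate values.
export function splitRows(csv: string): string[] {
  return csv.split("\n").filter((line) => line.length > 0);
}
export function parseAmounts(rows: string[]): number[] {
  return rows.map((row) => Number(row.split(",")[1] ?? 0));
}
export function sumAmounts(amounts: number[]): Totals {
  return { subtotal: amounts.reduce((a, b) => a + b, 0), count: amounts.length };
}

// Deep: a single small entry point hides the same work. Callers see "CSV in,
// Totals out"; the helpers above become internal and can change freely.
export function summariseInvoice(csv: string): Totals {
  return sumAmounts(parseAmounts(splitRows(csv)));
}
```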
Process
Step 1 — Explore for friction
Use the Agent tool with subagent_type=Explore to navigate the codebase. If the user provided a
path or focus area, start there. Otherwise, explore broadly.
Do NOT follow rigid heuristics. Explore organically and note where you experience friction:
- Where does understanding one concept require bouncing between many small files?
- Where are modules so shallow that the interface is nearly as complex as the implementation?
- Where have pure functions been extracted just for testability, while the real bugs hide in how they're called?
- Where do tightly-coupled modules create integration risk in the seams between them?
- Where are there deep relative imports (5+ levels) indicating boundary violations?
- Which parts of the codebase are untested, or hard to test?
- Where do barrel files re-export everything, hiding the real dependency graph?
The friction you encounter IS the signal.
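Two of these signals in concrete form, with hypothetical paths and names: the first is the kind of import that appears when code reaches across a boundary, the second is a barrel file that makes the dependency graph look flatter than it is.

```typescript
// Boundary violation: a feature climbing five levels up to use another feature's internals.
import { normaliseAddress } from "../../../../../checkout/internal/address/normalise";

// Barrel file (index.ts) re-exporting everything: importers see one tidy module,
// but the real dependency graph fans out behind it.
export * from "./parser";
export * from "./validator";
export * from "./totals";
export * from "./formatting";
```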
Step 2 — Present candidates
Present a numbered list of deepening opportunities. For each candidate:
| Field | Description |
|---|---|
| Cluster | Which modules/concepts are involved |
| Why they're coupled | Shared types, call patterns, co-ownership of a concept |
| Dependency category | See categories below |
| Import depth | Max relative import depth between coupled modules |
| Test impact | What existing tests would be replaced by boundary tests |
| Severity | How much this coupling costs day-to-day |
Ask the user: "Which of these would you like to explore?"
Step 3 — Frame the problem space
Before spawning design agents, write a user-facing explanation of the chosen candidate:
- The constraints any new interface would need to satisfy
- The dependencies it would need to rely on
- A rough illustrative code sketch to make the constraints concrete — this is NOT a proposal, just grounding
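For example, a constraint-grounding sketch for a hypothetical pricing cluster might be nothing more than a signature with the constraints spelled out as comments; the names and constraints here are invented.

```typescript
// Constraints only, deliberately not a design:
// - must accept the existing Cart type unchanged; dozens of callers already construct it
// - must not call the catalogue service itself; prices arrive as an input
// - must stay synchronous, because checkout calls it inside a render path
type Cart = { items: { sku: string; qty: number }[] };
type PriceTable = Record<string, number>;

declare function priceCart(cart: Cart, prices: PriceTable): number;
```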
Show this to the user, then immediately proceed to Step 4.
Step 4 — Design competing interfaces
Spawn 3+ sub-agents in parallel using the Agent tool. Each must produce a radically different interface for the deepened module.
Give each agent a technical brief (file paths, coupling details, dependency category, what's being hidden) plus a different design constraint:
| Agent | Constraint |
|---|---|
| Agent 1 | "Minimise the interface — aim for 1-3 entry points max" |
| Agent 2 | "Maximise flexibility — support many use cases and extension" |
| Agent 3 | "Optimise for the most common caller — make the default case trivial" |
| Agent 4 (if applicable) | "Design around ports & adapters for cross-boundary dependencies" |
Each sub-agent outputs:
- Interface signature — types, methods, params
- Usage example — how callers use it
- What complexity it hides — what's internal
- Dependency strategy — how deps are handled (see categories below)
- Trade-offs — what you gain and what you lose
Present all designs, then compare them in prose. Give your own recommendation — which design is strongest and why. If elements from different designs combine well, propose a hybrid. Be opinionated.
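As a hedged illustration of how the constraints pull in different directions, here is what Agents 1 and 2 might return for a hypothetical report-export module (all names invented):

```typescript
// Agent 1, "minimise the interface": one entry point, opinionated defaults.
export interface ReportExporter {
  // Always the default layout, always PDF; everything else is internal.
  exportReport(reportId: string): Promise<Uint8Array>;
}

// Agent 2, "maximise flexibility": more surface area, explicit extension points.
export interface Report { id: string; rows: unknown[] }
export interface ExportOptions { format: "pdf" | "csv" | "html"; locale?: string }
export interface ConfigurableReportExporter {
  exportReport(reportId: string, options: ExportOptions): Promise<Uint8Array>;
  registerFormat(name: string, render: (report: Report) => Uint8Array): void;
}
```

The prose comparison would then weigh the tiny surface of the first against the extensibility, and extra caller burden, of the second.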
Step 5 — User picks an interface
Step 6 — Create RFC issue
Create a refactor RFC as a GitHub issue using gh issue create:
## Problem
[Describe the architectural friction — which modules are shallow and coupled,
what integration risk exists, why this makes the codebase harder to navigate]
## Proposed Interface
[The chosen interface design — signature, usage example, what it hides]
## Dependency Strategy
[Which category applies and how dependencies are handled]
## Testing Strategy
- **New boundary tests to write**: [behaviours to verify at the interface]
- **Old tests to delete**: [shallow module tests that become redundant]
- **Test environment needs**: [local stand-ins or adapters required]
## Implementation Recommendations
[Durable guidance NOT coupled to current file paths:
- What the module should own (responsibilities)
- What it should hide (implementation details)
- What it should expose (the interface contract)
- How callers should migrate]
Do NOT ask the user to review before creating — just create it and share the URL.
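A typical invocation, assuming the issue body above has been written to a temporary file first; the title and path are placeholders.

```bash
gh issue create \
  --title "RFC: deepen the <module> interface" \
  --body-file /tmp/refactor-rfc.md
```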
Dependency Categories
When assessing a candidate, classify its dependencies:
1. In-process
Pure computation, in-memory state, no I/O. Always deepenable — merge the modules and test directly.
2. Local-substitutable
Dependencies with local test stand-ins (PGLite for Postgres, in-memory filesystem). Deepenable if the stand-in exists. Test with the local stand-in running in the test suite.
3. Remote but owned (Ports & Adapters)
Your own services across a network boundary. Define a port (interface) at the module boundary. The deep module owns the logic; the transport is injected. Tests use an in-memory adapter.
4. True external (Mock)
Third-party services (Stripe, Twilio) you don't control. Mock at the boundary. The deepened module takes the external dependency as an injected port; tests provide a mock.
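A minimal TypeScript sketch of categories 3 and 4, with every name hypothetical: the deep module defines the port it needs, production injects the real transport, and tests inject an in-memory or mock adapter.

```typescript
// Port defined by the deep module: the only thing it knows about the outside world.
export interface PaymentPort {
  charge(customerId: string, amountCents: number): Promise<{ ok: boolean }>;
}

// The deep module owns the logic; the transport is injected.
export async function settleOrder(port: PaymentPort, customerId: string, totalCents: number) {
  if (totalCents <= 0) return { settled: true, charged: false };
  const result = await port.charge(customerId, totalCents);
  return { settled: result.ok, charged: result.ok };
}

// Adapter used in tests: an in-memory stand-in for an owned service (category 3)
// or a mock of a third-party API such as a payment provider (category 4).
export class FakePaymentPort implements PaymentPort {
  calls: { customerId: string; amountCents: number }[] = [];
  async charge(customerId: string, amountCents: number) {
    this.calls.push({ customerId, amountCents });
    return { ok: true };
  }
}
```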
Testing Strategy
The core principle: replace, don't layer.
- Old unit tests on shallow modules are waste once boundary tests exist — delete them
- Write new tests at the deepened module's interface boundary
- Tests assert on observable outcomes through the public interface, not internal state
- Tests should survive internal refactors — they describe behaviour, not implementation
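A hedged sketch of what that looks like in practice, reusing the hypothetical PaymentPort/settleOrder module from the dependency categories above; the test runner (vitest here) and file paths are assumptions.

```typescript
import { describe, it, expect } from "vitest";
import { settleOrder, FakePaymentPort } from "./settle-order";

describe("settleOrder", () => {
  it("charges the customer exactly once for a positive total", async () => {
    const port = new FakePaymentPort();
    const result = await settleOrder(port, "cust_1", 4200);
    expect(result.settled).toBe(true);
    expect(port.calls).toEqual([{ customerId: "cust_1", amountCents: 4200 }]);
  });

  it("never touches the payment port for a zero total", async () => {
    const port = new FakePaymentPort();
    const result = await settleOrder(port, "cust_1", 0);
    expect(result.charged).toBe(false);
    expect(port.calls).toHaveLength(0);
  });
});
```

Both tests go through the public interface only, so they survive any internal restructuring of the module.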
Signals That Indicate Deepening Opportunities
From the architecture patterns reference:
| Signal | What it means |
|---|---|
| 5+ levels of ../ imports | Code is reaching across boundaries |
| Barrel file re-exporting everything | Hiding the real dependency graph |
| Test file longer than source file | Testing internals, not behaviour |
| "Utils" folder with 20+ files | Shallow modules masquerading as shared code |
| Type file imported by 10+ modules | Hidden coupling through shared types |
| Feature spread across 8+ files | Over-decomposition, shallow modules |
| Mock setup longer than test body | Integration seams are in the wrong place |