Elevate
Explore the current project deeply, then answer:
What are the highest-leverage architectural, framework, tooling, or DX improvements you could make to this project right now?
Produce a ranked list of 5–10 opportunities with a scoring matrix. The technical twin of `innovate` (which asks for product moves).
Focus argument
The skill accepts an optional free-form focus string (e.g., feedback-loop, developer experience, build speed, testing, observability, CI pipeline).
- No focus: survey all four dimensions (architecture, libraries, tooling, DX/infra).
- With focus: interpret the string, map it to relevant dimensions, and return only opportunities that match. Do not include off-focus candidates, even if they look higher-leverage.
Process
- Read the project: `CLAUDE.md`, README, key config (`package.json`, `tsconfig`, build configs, lockfiles, CI workflows).
- Discover ADRs and architectural rules. Check in order: `docs/adr/`, `docs/adrs/`, `docs/architecture/decisions/`, `adr/`, `adrs/`, `.claude/rules/` (architectural guidelines stored as rules). If none are found, grep for `ADR-` or `# Architecture Decision` to catch non-standard layouts (see the command sketch after this list). Parse each ADR/rule to extract what it prescribes or forbids.
- Skim for friction: recent git history, hot-path code, repeated workarounds, slow builds, awkward patterns, stale deps, TODOs.
- Generate candidates across four dimensions:
  - Architecture — layering, module boundaries, data flow, state shape
  - Libraries & frameworks — outdated/deprecated deps, better alternatives, version bumps
  - Build & tooling — bundler, compiler, package manager, monorepo structure, CI/CD, type system
  - DX & infrastructure — testing strategy, observability, logging, error tracking, feature flags, local dev
- Apply the focus filter if a focus argument was given. Drop non-matching candidates.
- Check each candidate against ADRs/rules. If a candidate contradicts an ADR, keep it in the matrix but flag it as ADR-blocked and de-rank it (it is not actionable without revisiting the ADR first).
- Score each surviving candidate on four axes (see below).
- Surface load-bearing assumptions. For each candidate, list the 1–3 assumptions the Impact and Effort scores depend on. Classify each:
  - `[verified]` — the assumption is self-evident from code/config already read (cite `file:line` or a specific snippet). Treat as verified.
  - `[probe]` — specify a concrete, cheap command or measurement the user can run in under ~2 min to confirm. Examples: `turbo run <task> --dry=json | jq '...'` to tell real work from no-ops, `hyperfine` for wallclock, `du -sh node_modules/.cache` for cache plausibility, `rg -c <pattern>` for call-site counts, or reading one file the skill hasn't yet read (see the probe sketch after this list).
  - `[unverifiable]` — needs infra or prod data the skill can't touch (prod error rate, CI minutes). Keep, but mark.

  Do not auto-run probes. The skill suggests; the user decides whether to execute. Any candidate with an unverified `[probe]` or `[unverifiable]` assumption driving an L/XL Impact or S Effort score must be scored Confidence = Low, regardless of how strong the structural signal looks. Keep it in the matrix; don't cap its Impact — let Confidence do the demotion.
- Rank by composite ROI signal. Trim to the 5–10 strongest. Fewer is fine for small/pristine codebases; more is fine for messy ones — let signal decide.
- Output the matrix and per-opportunity details.
- Ask which row(s) the user wants expanded into implementation steps. Do not auto-expand.
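The discovery, friction-skim, and probe steps above lend themselves to concrete commands. A minimal sketch, assuming `rg`, `jq`, and a Turborepo setup are available; the task name and jq field names are placeholders to adapt, not prescribed by the skill:

```sh
# ADR discovery fallback: catch non-standard ADR layouts.
rg -l 'ADR-|# Architecture Decision'

# Friction skim: churn hotspots from recent history, plus TODO density.
git log --since='3 months ago' --name-only --pretty=format: \
  | grep -v '^$' | sort | uniq -c | sort -rn | head -20
rg -c 'TODO|FIXME' | sort -t: -k2 -rn | head -20

# Probe sketch: real work vs. cache no-ops. `build` is a placeholder task
# name, and the jq field names assume a recent turbo dry-run JSON shape.
turbo run build --dry=json | jq '.tasks[] | {task: .taskId, cache: .cache.status}'
```

Each command is read-only and cheap, which keeps it inside the under-two-minute probe budget.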
Scoring scale
Use T-shirt sizes — honest about the estimation involved, no false precision.
| Axis | Scale | Meaning |
|---|---|---|
| Impact | S / M / L / XL | Pain removed or velocity unlocked, measured as user-observable outcome (wallclock, error rate, contributor time saved). Structural proxies alone — DAG edges removed, files touched, lines deleted, tasks in a graph — are not Impact; if only a proxy is available, Confidence caps at Low. |
| Effort | S / M / L / XL | Rough engineering cost (hours → weeks) |
| Risk | Low / Med / High | Blast radius + reversibility |
| Confidence | Low / Med / High | How sure the skill is about Impact and Effort given codebase signals. Any unverified [probe] or [unverifiable] assumption behind an L/XL Impact or S Effort forces this to Low. |
Ranking heuristic: favor high Impact + low Effort + low Risk + high Confidence. ADR-blocked candidates sink to the bottom. Candidates with unverified load-bearing assumptions sink via Confidence=Low, even if their structural signal looks XL — this is intentional, and prevents proxy-driven overestimates from topping the list.
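If it helps to sanity-check the ordering by hand, one purely illustrative encoding maps the T-shirt sizes onto small ordinals and sorts. The skill itself never computes this (the scale is deliberately imprecise); the weights, field names, and `candidates.json` file are all assumptions for the sketch:

```sh
# Illustrative only: expects candidates.json to hold a JSON array of rows like
# {"name":"...","impact":"XL","effort":"M","risk":"Med","confidence":"Low","adr_blocked":false}
jq -r '{S:1, M:2, L:3, XL:4} as $size
     | {Low:2, Med:1, High:0} as $risk          # lower risk scores higher
     | {Low:0, Med:1, High:2} as $conf          # Confidence=Low drags the row down
     | map(.score = (if .adr_blocked then -99   # ADR-blocked sinks to the bottom
                     else $size[.impact] - $size[.effort] + $risk[.risk] + 2 * $conf[.confidence]
                     end))
     | sort_by(-.score) | .[] | "\(.score)\t\(.name)"' candidates.json
```

The exact weights are beside the point; what matters is that Confidence and ADR status act as gates, not tie-breakers.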
Scope — explicitly refuse to propose
- Pure code cleanup — dead code, renames, file splits belong to `simplify` / `code-slop` / `quality-gate`
- Single-bug fixes — if it's scoped to one bug, it isn't an architectural leap
- Feature additions — product capabilities are `innovate`'s turf
- Speculative rewrites — no "rewrite in Rust" or "migrate to microservices" unless the codebase genuinely demands it; bias toward proven, reversible changes
Output shape
# Elevate — <focus area or "full audit">
ADRs consulted: <list of ADR files / rules found, or "none found">
## Ranking matrix
| # | Opportunity | Impact | Effort | Risk | Confidence | Probes | Notes |
|---|---|---|---|---|---|---|---|
| 1 | <name> | L | S | Low | High | ok | |
| 2 | <name> | XL | M | Med | Low | 2 pending | unverified `[probe]` gates XL |
| … |
| N | <name> | M | S | Low | Low | ok | ⚠️ ADR-blocked (ADR-0012) |
`Probes` column: `ok` when every load-bearing assumption is `[verified]`; otherwise `<N> pending` where N counts outstanding `[probe]` / `[unverifiable]` entries.
## Opportunities
### 1. <Opportunity name>
**Problem:** <real friction visible in the code/stack>
**Change:** <what to do, concretely>
**Impact:** <what this unlocks, in user-observable terms>
**Risk:** <what could go wrong; reversibility>
**Migration path:** <incremental steps, not big-bang>
**Evidence:** <file paths, git signals, hot-paths that motivated this>
**Assumptions & probes:**
- `[verified]` <assumption> — <file:line or reasoning already in context>
- `[probe]` <assumption> — run: `<command>` → expect `<signal>` to confirm
- `[unverifiable]` <assumption> — <why it can't be checked now>
### 2. …
### N. <ADR-blocked opportunity>
**Problem:** <…>
**Change:** <…>
**ADR conflict:** Contradicts `<ADR path / title>` which states "<quoted directive>". Not actionable without revisiting that decision.
**Evidence:** <…>
After the matrix and details, ask:
Which opportunities would you like expanded into implementation steps?
Do not auto-generate implementation steps. Wait for the user to pick.