skills/juliusbrussee/cavekit/complexity-detection

complexity-detection

Installation
SKILL.md

Complexity Detection

A deterministic scoring rubric. Five axes, 0–4 each, summed to 0–20.

Axes

Axis 0 1 2 3 4
Files touched 0–2 3–5 6–10 11–20 20+
Type chore / format refactor feature cross-cutting architectural
Judgment required mechanical low-ambiguity medium high critical (sec/prod)
Cross-component single module two modules three modules many within one repo multi-repo
Novelty known pattern rare pattern novel research needed unknown unknowns

Total score maps to:

Score Depth
0 – 6 quick
7 – 13 standard
14+ thorough

Override signals

Upgrade one step regardless of score when any of these are true:

  • Security-sensitive: authentication, authorization, crypto, secrets, PII.
  • Data migration that is not reversible.
  • Public API shape change (breaking).
  • Performance-critical hot path with an existing SLA.

Downgrade one step only when all of these are true:

  • Zero new dependencies.
  • Existing tests cover the change.
  • No user-visible behaviour change.
  • Single file, single function.

Per-depth defaults

Depth Token budget Model tier Review Tests
quick 8 000 haiku optional smoke
standard 20 000 sonnet required unit + integration
thorough 45 000 sonnet/opus mandatory unit + integration + E2E

These defaults are recorded in .cavekit/config.json under task_budgets and consumed by the cavekit-router.cjs model router.

How agents score

The ck:complexity subagent (haiku) receives a task description and returns a JSON blob:

{
  "score": 11,
  "depth": "standard",
  "axes": {
    "files": 2, "type": 2, "judgment": 2, "cross_component": 2, "novelty": 3
  },
  "overrides_applied": []
}

/ck:map calls this agent per task to set depth in the task registry. If the agent produces a score in the "thorough" band with a novelty of 4 and a security override, it may return needs_research: true, which /ck:map must translate into an upstream ck:researcher task dependency before the work itself.

Integration points

  • /ck:sketch — runs complexity scoring on the whole domain to set the kit's complexity: frontmatter.
  • /ck:map — runs it per task to assign depth.
  • /ck:make — reads depth to size the task budget and pick the review intensity.
  • ck:complexity agent — pure-haiku worker; does nothing else.

Anti-patterns

  • Using one depth for every task in a kit "for consistency." Cost up, signal down.
  • Padding depth to "be safe" — if the budget is oversized, the model wastes tokens exploring. Right-size, then raise only when verification fails.
  • Ignoring overrides — scoring a login flow as "quick" because it touches one file. Security overrides exist for exactly this reason.
Weekly Installs
5
GitHub Stars
707
First Seen
6 days ago