Domain Research Skill

Operator Context

This skill is the operator for domain decomposition, configuring Claude's behavior for discovering, classifying, and mapping subdomains within a target domain before pipeline generation begins. It implements a Research-Classify-Map-Produce pattern — broad parallel discovery narrows into structured classification, which is mapped to pipeline chains, which in turn produce a Component Manifest.

This skill is the first step in the self-improving pipeline generator (see adr/self-improving-pipeline-generator.md). It answers: "What subdomains exist in this domain, and what kind of pipeline does each one need?"

Hardcoded Behaviors (Always Apply)

  • CLAUDE.md Compliance: Read and follow repository CLAUDE.md files before execution. Project instructions override default skill behaviors.
  • Parallel Research Enforcement (Rule 12): Phase 1 MUST dispatch 4 parallel research agents. Sequential research is BANNED. WHY: A/B testing proved parallel research eliminates a 1.40-point gap in Examples quality (adr/pipeline-creator-ab-test.md). Sequential grep-based research produces shallower, less diverse findings.
  • Dual-Layer Artifacts: Every phase produces both manifest.json (machine envelope) and content.md (human-readable output). WHY: The Pipeline Architect needs machine-readable metadata to validate chain composition, while agents need readable content for reasoning. See ADR "Artifact Format: Dual-Layer Output Standard".
  • Discovery Over Invention: The skill discovers subdomains through research — it does NOT hardcode or assume subdomain lists. WHY: Hardcoded lists miss domain-specific nuances and become stale. The whole point of this skill is that it adapts to any domain.
  • Reuse Before Create (Rule 9): When classifying subdomains, always check if existing agents/skills cover 70%+ of the subdomain before marking it as "needs new component". WHY: Agents are expensive context. Skills are cheap. The generator biases toward binding new skills to existing agents.

Default Behaviors (ON unless disabled)

  • Communication Style: Report findings without self-congratulation. Show the subdomain list and classifications directly rather than describing the process.
  • Temporary File Cleanup: Phase artifacts live in /tmp/pipeline-{run-id}/. The PRODUCE phase copies final artifacts to permanent location. Intermediate files remain for debugging until the pipeline-orchestrator-engineer cleans up.
  • Operator Profile Detection: Read the detected operator profile from pipeline context but do NOT gate any research steps on it. WHY: Research itself is read-only and harmless across all profiles. The profile information is passed through to the Component Manifest so downstream skills (chain-composer, scaffolder) can apply the correct safety gates.

Optional Behaviors (OFF unless enabled)

  • Deep Reference Research: Agent 4 (Reference Research) fetches external documentation URLs. OFF by default because it requires network access and increases latency.
  • Verbose Classification: Show detailed rationale for every task type assignment. Enable when debugging classification disagreements.

What This Skill CAN Do

  • Discover subdomains within any target domain by dispatching parallel research agents
  • Scan the existing repository for agents, skills, hooks, and scripts that overlap with the target domain
  • Classify each subdomain by task type, complexity, and reuse potential
  • Map each subdomain to a preliminary pipeline chain from the step menu
  • Produce a Component Manifest listing everything the scaffolder needs to build
  • Detect which existing agents can be reused as executors for new subdomain skills

What This Skill CANNOT Do

  • Scaffold pipeline components: That is handled by pipeline-scaffolder after this skill completes
  • Compose final pipeline chains: Preliminary chains are draft proposals; the chain-composer skill finalizes them with type compatibility validation
  • Create or modify routing entries: That is handled by routing-table-updater
  • Validate pipeline chains against the type compatibility matrix: This skill maps chains; validation is the chain-composer's responsibility

Instructions

Inputs

| Input | Source | Required |
|-------|--------|----------|
| Domain name | User prompt (e.g., "Prometheus", "RabbitMQ", "Terraform") | Yes |
| Run ID | Generated by pipeline-orchestrator-engineer or uuidgen | Yes |
| Operator profile | Detected by pipeline-context-detector hook or pipeline-orchestrator Phase 0 | No (default: Personal) |
| Environmental state JSON | From pipeline-context-detector hook | No (Phase 1 can scan directly) |

Phase 1: DISCOVER (Parallel Multi-Agent — Rule 12 Mandatory)

Goal: Build a broad, multi-perspective understanding of the target domain. Breadth of research directly determines the quality of subdomain discovery — this is why parallel agents are mandatory, not optional.

Default N = 4 agents. Override with --research-agents N (minimum 2, maximum 6).

Step 1: Prepare research context

Create the run directory and shared context block:

mkdir -p /tmp/pipeline-{run-id}/phase-1-research

Assemble shared context from:

  • The domain name and any user-provided context about what they need
  • The ADR from pipeline-orchestrator Phase 0 (if it exists)
  • Environmental state JSON (if available from pipeline-context-detector)
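Assembled, the shared context might look like the following sketch (a Python dict with illustrative field names; the skill does not mandate a context schema):

```python
# Illustrative only: field names are not a fixed schema.
shared_context = {
    "domain": "Prometheus",                    # from the user prompt
    "user_context": "team needs alerting and dashboard pipelines",
    "run_id": "3f2a9c1e",                      # from pipeline-orchestrator-engineer
    "adr_path": "adr/pipeline-prometheus.md",  # Phase 0 ADR, if it exists
    "environment": None,                       # env-state JSON from the hook, if available
}
```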

Step 2: Dispatch 4 parallel research agents

Launch all 4 agents simultaneously using the Task tool. Each agent receives the shared context block and saves findings to a separate artifact file. Each has a 5-minute timeout.

Agent 1: Domain Expert

  • Investigate: What are the subdomains of this domain? What tasks do practitioners commonly perform? What are the natural workflow boundaries?
  • Method: Use domain knowledge, web search if available, and reasoning about the domain's structure
  • Output: List of candidate subdomains with descriptions and common tasks for each
  • Save to: /tmp/pipeline-{run-id}/phase-1-research/agent-1-domain-expert.md

Agent 2: Existing Inventory

  • Investigate: What agents, skills, hooks, and scripts already exist in this repository for this domain or closely related domains?
  • Method: Search agents/INDEX.json for matching agent names and triggers. Search skills/do/references/routing-tables.md for matching skill routes. Glob for file patterns matching the domain name. Grep agent descriptions for domain keywords.
  • Output: Inventory of existing components with relevance assessment (full match, partial match, tangential)
  • Save to: /tmp/pipeline-{run-id}/phase-1-research/agent-2-existing-inventory.md

Agent 3: Workflow Patterns

  • Investigate: For each candidate subdomain (or domain-level if subdomains aren't yet known), what pipeline patterns from the step menu fit?
  • Method: Read skills/pipeline-scaffolder/references/step-menu.md. For each step family, assess whether the domain's tasks match the "When To Use" criteria. Identify which step families are most relevant.
  • Output: Mapping of domain task patterns to step menu families, with reasoning
  • Save to: /tmp/pipeline-{run-id}/phase-1-research/agent-3-workflow-patterns.md

Agent 4: Reference Research

  • Investigate: What reference documentation would subdomain skills need? What external specs, APIs, or tools define this domain? What domain-specific validators exist or could be built?
  • Method: Identify key domain concepts that would need reference files. List external documentation sources. Identify domain artifacts that could be validated deterministically (e.g., PromQL syntax, HCL validation, YAML schema).
  • Output: Reference file recommendations and deterministic validation opportunities
  • Save to: /tmp/pipeline-{run-id}/phase-1-research/agent-4-reference-research.md

Step 3: Collect and merge research artifacts

After all agents complete (at least 3 of 4 must succeed), merge findings into a single research compilation at /tmp/pipeline-{run-id}/phase-1-research/content.md.
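The control flow of Steps 2 and 3 is easy to get wrong, so here is a minimal Python sketch of the pattern: simultaneous launch, a shared 5-minute deadline, and a proceed-on-partial-failure gate. `run_agent` is a hypothetical stand-in for the actual Task-tool dispatch, which is not shown here.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed, TimeoutError

AGENTS = ["domain-expert", "existing-inventory", "workflow-patterns", "reference-research"]
TIMEOUT_SECONDS = 5 * 60  # the 5-minute budget shared by the parallel batch

def run_agent(name: str, context: dict) -> str:
    """Hypothetical stand-in for a Task-tool dispatch; returns the artifact path."""
    raise NotImplementedError

def dispatch_and_collect(context: dict) -> dict:
    results = {name: None for name in AGENTS}
    pool = ThreadPoolExecutor(max_workers=len(AGENTS))
    # Rule 12: every agent is submitted before any result is awaited.
    futures = {pool.submit(run_agent, name, context): name for name in AGENTS}
    try:
        for future in as_completed(futures, timeout=TIMEOUT_SECONDS):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception:
                pass  # a failed agent stays None; the gate below decides
    except TimeoutError:
        pass  # agents still pending at the deadline stay None
    finally:
        pool.shutdown(wait=False, cancel_futures=True)  # don't block on stragglers
    completed = [name for name, path in results.items() if path]
    if len(completed) < 3:
        raise RuntimeError(f"only {len(completed)}/4 research agents completed")
    return results
```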

Create the Phase 1 dual-layer artifact:

/tmp/pipeline-{run-id}/phase-1-research/manifest.json:

{
  "schema": "research-artifact",
  "step": "RESEARCH",
  "phase": 1,
  "status": "complete",
  "metrics": {
    "agents_dispatched": 4,
    "agents_completed": 4,
    "candidate_subdomains": 0,
    "existing_components_found": 0
  },
  "inputs": ["adr/pipeline-{name}.md"],
  "outputs": ["content.md", "agent-1-domain-expert.md", "agent-2-existing-inventory.md", "agent-3-workflow-patterns.md", "agent-4-reference-research.md"],
  "timestamp": "",
  "tags": ["{domain-name}", "research", "domain-decomposition"]
}

Update metrics.candidate_subdomains and metrics.existing_components_found with actual counts.
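A minimal sketch of that update using the standard library, assuming `run_id`, `candidate_subdomains`, and `existing_components` are already in scope:

```python
import json
from pathlib import Path

manifest_path = Path(f"/tmp/pipeline-{run_id}/phase-1-research/manifest.json")
manifest = json.loads(manifest_path.read_text())

# Replace the placeholder zeros with counts taken from the merged research.
manifest["metrics"]["candidate_subdomains"] = len(candidate_subdomains)
manifest["metrics"]["existing_components_found"] = len(existing_components)

manifest_path.write_text(json.dumps(manifest, indent=2) + "\n")
```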

GATE: All of the following must be true before proceeding:

  • At least 3 of 4 research agents completed successfully
  • Research compilation file exists at /tmp/pipeline-{run-id}/phase-1-research/content.md
  • At least 1 candidate subdomain identified
  • manifest.json written with status: "complete"

If fewer than 3 agents completed: set status: "partial", report which agents failed and why, and proceed with available data only if at least 1 subdomain was identified. Otherwise STOP and report failure to pipeline-orchestrator.

Phase 2: CLASSIFY

Goal: For each candidate subdomain from Phase 1, assign a task type, complexity tier, reuse assessment, and required references. This phase transforms raw research findings into structured classifications that Phase 3 can map to pipeline chains.

Step 1: Load classification reference

Read skills/domain-research/references/task-type-guide.md for task type definitions and canonical chain patterns. This file defines the 8 task types and provides examples of how each maps to step menu families.

Step 2: Classify each subdomain

For each candidate subdomain discovered in Phase 1, determine:

Task Type (exactly one primary, optionally one secondary):

| Task Type | Description | Key Indicator |
|-----------|-------------|---------------|
| generation | Produces new artifacts (code, config, docs) | "Write me a...", "Create a...", "Generate..." |
| review | Evaluates existing artifacts against criteria | "Review this...", "Check if...", "Audit..." |
| debugging | Diagnoses and resolves failures | "Why is X broken?", "Fix this...", "Troubleshoot..." |
| operations | Manages running systems (deploy, scale, monitor) | "Deploy...", "Scale...", "Restart...", "Monitor..." |
| configuration | Produces or validates configuration artifacts | "Configure...", "Set up...", "Add rule for..." |
| analysis | Investigates data/systems to produce insights | "Analyze...", "What's the impact of...", "Compare..." |
| migration | Transforms between versions, formats, or systems | "Migrate from...", "Upgrade...", "Convert..." |
| monitoring | Ongoing observation with threshold-based alerting | "Alert when...", "Watch for...", "Track..." |

When a subdomain's tasks span two types (e.g., alerting is configuration + monitoring), assign the primary type based on the most common practitioner workflow and note the secondary type.

Complexity (based on expected pipeline chain length):

| Complexity | Chain Length | Characteristics |
|------------|--------------|-----------------|
| Simple | 3-5 steps | Linear flow, single output type, no cross-domain dependencies |
| Medium | 5-8 steps | May include review or validation loops, 1-2 output types |
| Complex | 8+ steps | Cross-domain delegation, safety gates, multiple output types, experimentation |

Reuse Potential — For each subdomain, check the existing inventory (from Agent 2):

  • Full reuse (80%+): Existing agent AND skill cover this subdomain. Skip it or note for enhancement only.
  • Partial reuse (40-79%): Existing agent covers the domain but no skill for this specific subdomain. Create new skill, bind to existing agent.
  • New (<40%): No existing components meaningfully cover this. Needs new skill and possibly new agent.

Required References — What domain knowledge files would the subdomain skill need?

  • Pattern libraries (e.g., promql-patterns.md, terraform-module-patterns.md)
  • Configuration schemas (e.g., alertmanager-routing.md, helm-values-schema.md)
  • Best practices guides (e.g., metric-naming-conventions.md)

Required Scripts — What deterministic validators would the subdomain skill need?

  • Syntax validators (e.g., promql-validator.py, hcl-lint.py)
  • Schema validators (e.g., yaml-schema-check.py)
  • Convention checkers (e.g., naming-convention-check.py)
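Putting the Step 2 fields together, a classification record might be held like this (a sketch; the dataclass and example values are illustrative, with the reuse tier derived from the coverage thresholds above):

```python
from dataclasses import dataclass, field

def reuse_tier(coverage_pct: float) -> str:
    """Map Agent 2's coverage estimate onto the three reuse tiers."""
    if coverage_pct >= 80:
        return "Full"
    if coverage_pct >= 40:
        return "Partial"
    return "New"

@dataclass
class SubdomainClassification:
    name: str
    task_type: str                    # exactly one primary type
    secondary_type: str | None = None
    complexity: str = "Simple"        # Simple | Medium | Complex
    coverage_pct: float = 0.0         # from Phase 1's existing-inventory agent
    existing_agent: str | None = None
    references: list[str] = field(default_factory=list)
    scripts: list[str] = field(default_factory=list)

    @property
    def reuse(self) -> str:
        return reuse_tier(self.coverage_pct)

# Example: alerting is configuration first, monitoring second.
alerting = SubdomainClassification(
    name="alerting", task_type="configuration", secondary_type="monitoring",
    complexity="Medium", coverage_pct=55,
    existing_agent="prometheus-pipeline-engineer",  # hypothetical agent name
    references=["alertmanager-routing.md"], scripts=["promql-validator.py"],
)
assert alerting.reuse == "Partial"
```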

Step 3: Produce classification artifact

Create the Phase 2 dual-layer artifact:

/tmp/pipeline-{run-id}/phase-2-classify/manifest.json:

{
  "schema": "structured-corpus",
  "step": "CLASSIFY",
  "phase": 2,
  "status": "complete",
  "metrics": {
    "subdomains_classified": 0,
    "full_reuse": 0,
    "partial_reuse": 0,
    "new_components": 0
  },
  "inputs": ["../phase-1-research/content.md"],
  "outputs": ["content.md"],
  "timestamp": "",
  "tags": ["{domain-name}", "classification", "subdomain"]
}

/tmp/pipeline-{run-id}/phase-2-classify/content.md — Structured classification table:

# Subdomain Classification: {Domain}

## Classification Summary

| Subdomain | Task Type | Secondary | Complexity | Reuse | Existing Agent |
|-----------|-----------|-----------|------------|-------|----------------|
| {name} | {type} | {type|none} | {Simple|Medium|Complex} | {Full|Partial|New} | {agent-name|none} |

## Detailed Classifications

### {Subdomain Name}
- **Task Type**: {primary} (secondary: {secondary|none})
- **Complexity**: {tier} — {rationale}
- **Reuse**: {assessment} — {which existing components and what % coverage}
- **Required References**: {list of reference files needed}
- **Required Scripts**: {list of deterministic validators needed}
- **Key Tasks**: {what practitioners do in this subdomain}

GATE: All of the following must be true before proceeding:

  • Every candidate subdomain from Phase 1 has a task_type assigned
  • Every candidate subdomain has a complexity tier
  • Every candidate subdomain has a reuse assessment
  • At least 2 subdomains classified (if only 1 found, reconsider whether the domain is too narrow for decomposition — report to pipeline-orchestrator and ask whether to proceed as single-pipeline)
  • manifest.json written with status: "complete"

Phase 3: MAP (Compose Preliminary Chains)

Goal: For each classified subdomain, select steps from the step menu and compose a preliminary pipeline chain. These are draft chains — the chain-composer skill validates and finalizes them.

Step 1: Load step menu

Read skills/pipeline-scaffolder/references/step-menu.md for the complete step inventory with output schemas and type compatibility.

Step 2: Compose chains

For each classified subdomain, build a preliminary chain by:

  1. Start with ADR (mandatory — every chain starts with ADR per composition rules)

  2. Select research steps based on what the subdomain needs to know before generating output:

    • Domain needing broad investigation: GATHER or RESEARCH (parallel)
    • Domain needing codebase analysis: SCAN or SEARCH
    • Domain needing external data: FETCH
  3. Select structuring steps to organize research into usable form:

    • Most generation tasks: COMPILE (structure findings) or OUTLINE (define output shape)
    • Decision-heavy tasks: ASSESS or BRAINSTORM
    • Architecture tasks: MAP
  4. Select generation/execution steps based on task type:

    • generation: GENERATE (possibly preceded by GROUND for audience-specific content)
    • review: REVIEW (parallel 3+ lenses) then AGGREGATE
    • debugging: SEARCH then PLAN then EXECUTE
    • operations: PROBE then PLAN then EXECUTE
    • configuration: GENERATE then CONFORM
    • analysis: RESEARCH then SYNTHESIZE
    • migration: CHARACTERIZE then PLAN then TRANSFORM then EXECUTE
    • monitoring: GENERATE then MONITOR (or PROBE for health checks)
  5. Select validation steps based on whether output is deterministically checkable:

    • Has a syntax/schema: LINT or CONFORM
    • Has testable behavior: VERIFY
    • Has quality criteria: VALIDATE
    • Add REFINE (max 3 cycles) after any validation step that can fail
  6. Apply profile gates — note which steps are profile-dependent:

    • APPROVE: Work/Production only
    • GUARD + SNAPSHOT: Work/Production only for state changes
    • SIMULATE: Production only (optional elsewhere)
    • NOTIFY: CI/Work/Production (skip in Personal)
    • Record these as annotations on the chain, not hard inclusions
  7. Check for cross-domain dependencies:

    • Does this subdomain need expertise from another domain? Add DELEGATE
    • Does this subdomain's output feed into another subdomain's input? Note the dependency
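Taken together, one composed preliminary chain might be represented like this (an illustrative sketch; step names come from the step menu, and profile gates stay as annotations per step 6 above):

```python
# Preliminary chain for a "configuration"-type subdomain (illustrative).
preliminary_chain = {
    "subdomain": "alerting",
    "steps": ["ADR", "RESEARCH", "COMPILE", "GENERATE", "CONFORM", "REFINE", "OUTPUT"],
    # Profile gates are annotations on the chain, not extra steps in the list.
    "profile_gates": {"APPROVE": "Work/Production", "NOTIFY": "CI/Work/Production"},
    "delegates": [],          # cross-domain DELEGATE targets, if any
    "max_refine_cycles": 3,   # REFINE is bounded per the composition rules
}
```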

Step 3: Validate chains against type compatibility

For each preliminary chain, verify adjacent steps have compatible output-to-input types using the Type Compatibility Matrix from the ADR:

Step A produces → Schema X
Step B consumes → [list of acceptable schemas]
If Schema X is in Step B's consumes list → compatible
Otherwise → flag the incompatibility and adjust
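A minimal sketch of this check, assuming the matrix has been loaded as a produces/consumes mapping (the schema names below mirror this document's manifests but are only an excerpt, not the full matrix):

```python
def check_chain(chain: list[str], produces: dict, consumes: dict) -> list[tuple[str, str]]:
    """Return the adjacent (step_a, step_b) pairs whose schemas don't line up."""
    incompatible = []
    for step_a, step_b in zip(chain, chain[1:]):
        if produces[step_a] not in consumes[step_b]:
            incompatible.append((step_a, step_b))
    return incompatible

# Excerpt of a hypothetical matrix; real entries come from the ADR.
produces = {"RESEARCH": "research-artifact", "COMPILE": "structured-corpus"}
consumes = {"COMPILE": {"research-artifact"}, "GENERATE": {"structured-corpus", "decision-record"}}
produces["COMPILE"] = "structured-corpus"

assert check_chain(["RESEARCH", "COMPILE", "GENERATE"], produces, consumes) == []
```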

Step 4: Produce mapping artifact

/tmp/pipeline-{run-id}/phase-3-map/manifest.json:

{
  "schema": "decision-record",
  "step": "MAP",
  "phase": 3,
  "status": "complete",
  "metrics": {
    "chains_composed": 0,
    "profile_gated_steps": 0,
    "cross_domain_delegates": 0,
    "type_incompatibilities_resolved": 0
  },
  "inputs": ["../phase-2-classify/content.md"],
  "outputs": ["content.md"],
  "timestamp": "",
  "tags": ["{domain-name}", "chain-composition", "step-menu"]
}

/tmp/pipeline-{run-id}/phase-3-map/content.md:

# Preliminary Pipeline Chains: {Domain}

## Chain Summary

| Subdomain | Chain | Profile Gates |
|-----------|-------|---------------|
| {name} | ADR -> STEP -> STEP -> ... -> OUTPUT | {list of profile-dependent steps} |

## Detailed Chains

### {Subdomain Name}
**Task type**: {type} | **Complexity**: {tier}
**Chain**: `ADR -> {step} -> {step} -> ... -> OUTPUT`

| Step | Purpose | Output Schema | Profile Gate |
|------|---------|---------------|-------------|
| ADR | Persistent reference | - | None |
| {step} | {why this step} | {schema} | {None|Work|Production} |

**Cross-domain dependencies**: {none | DELEGATE to {domain} for {reason}}
**Type compatibility**: All adjacent steps validated / {list any resolved incompatibilities}

GATE: All of the following must be true before proceeding:

  • Every classified subdomain has a preliminary chain
  • Every chain starts with ADR and ends with OUTPUT (or REPORT for analysis tasks)
  • No unresolved type compatibility issues remain
  • All profile-gated steps are annotated (not hard-included)
  • manifest.json written with status: "complete"

Phase 4: PRODUCE (Component Manifest)

Goal: Compile all findings into the final Component Manifest — the single document that tells pipeline-scaffolder and chain-composer exactly what to build.

Step 1: Determine agent strategy

Based on the existing inventory (Phase 1, Agent 2) and reuse assessments (Phase 2):

  • If an existing agent covers 70%+ of the domain: Reuse it. Bind all new subdomain skills to this agent. Note the agent name and what gaps it has (if any).
  • If no existing agent covers the domain: Create one new coordinator agent. Define its name ({domain}-pipeline-engineer or {domain}-{function}-engineer), purpose, and which subdomain skills it will execute.
  • NEVER create one agent per subdomain. WHY: Agents are expensive context; skills are cheap. The architecture is "1 agent : N skills" not "N agents : N skills".
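The strategy reduces to a threshold check. A sketch, using the 70% bar and the naming pattern above:

```python
def agent_strategy(best_existing: str | None, coverage_pct: float, domain: str) -> dict:
    """Reuse the best-covering existing agent at 70%+ coverage; otherwise create one coordinator."""
    if best_existing is not None and coverage_pct >= 70:
        return {"action": "reuse", "agent": best_existing}
    # Exactly one new agent for the whole domain -- never one per subdomain.
    return {"action": "create", "agent": f"{domain}-pipeline-engineer"}
```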

Step 2: Compile shared resources

Identify resources that span multiple subdomains:

  • Shared references: Domain knowledge files useful across multiple subdomain skills (e.g., {domain}-domain-knowledge.md)
  • Shared scripts: Validators that multiple subdomain skills invoke (e.g., {domain}-syntax-validator.py)
  • Shared hooks: Detectors that trigger multiple subdomain skills based on context

Step 3: Write the Component Manifest

/tmp/pipeline-{run-id}/phase-4-produce/manifest.json:

{
  "schema": "orchestration-manifest",
  "step": "PRODUCE",
  "phase": 4,
  "status": "complete",
  "metrics": {
    "subdomains": 0,
    "new_skills": 0,
    "new_agents": 0,
    "reused_agents": 0,
    "reference_files": 0,
    "scripts": 0
  },
  "inputs": [
    "../phase-1-research/content.md",
    "../phase-2-classify/content.md",
    "../phase-3-map/content.md"
  ],
  "outputs": ["content.md"],
  "timestamp": "",
  "tags": ["{domain-name}", "component-manifest", "pipeline-generation"]
}

/tmp/pipeline-{run-id}/phase-4-produce/content.md:

# Component Manifest: {Domain}

## Domain
- **Name**: {domain}
- **Description**: {1-2 sentence domain summary}
- **Operator Profile**: {detected profile}
- **Subdomains Discovered**: {count}

## Agent Strategy

### Reused Agent
- **Name**: {agent-name} (from `agents/{agent-name}.md`)
- **Coverage**: {what it covers, what gaps remain}
- **Binding**: All subdomain skills below bind to this agent

### New Agent (if needed)
- **Name**: {domain}-{function}-engineer
- **Purpose**: {single-purpose description}
- **Pairs With**: [{skill-1}, {skill-2}, ...]

## Subdomain Skills

### {Subdomain 1}: {name}
- **Skill name**: {domain}-{subdomain}
- **Task type**: {primary} (secondary: {secondary|none})
- **Complexity**: {tier}
- **Preliminary chain**: `ADR -> {steps} -> OUTPUT`
- **Executor agent**: {agent-name} (reused|new)
- **References needed**:
  - `references/{file}.md` — {purpose}
- **Scripts needed**:
  - `scripts/{file}.py` — {purpose}
- **Profile-gated steps**: {list|none}
- **Cross-domain delegates**: {list|none}

### {Subdomain 2}: {name}
(same structure)

## Shared Resources

### References (span multiple subdomains)
- `{domain}-domain-knowledge.md` — {what it contains}

### Scripts (shared validators)
- `scripts/{domain}-{function}.py` — {what it validates}

### Hooks
- `hooks/{domain}-context-detector.py` — {what it detects, which skills it triggers}

## Routing Entries (to be created by routing-table-updater)
- Agent triggers: [{trigger keywords}]
- Skill triggers per subdomain: [{per-skill triggers}]

GATE: All of the following must be true before proceeding:

  • Component Manifest file exists at /tmp/pipeline-{run-id}/phase-4-produce/content.md
  • At least 2 subdomains listed with complete metadata
  • Every subdomain has: skill name, task type, complexity, preliminary chain, executor agent
  • Agent strategy documented (reuse vs. create, with rationale)
  • Shared resources identified
  • manifest.json written with status: "complete"

If gate passes: Report completion to pipeline-orchestrator-engineer. The Component Manifest is the handoff artifact for the chain-composer skill.

Error Handling

Error: Domain Too Narrow

Cause: Phase 1 discovers only 1 subdomain, or all discovered subdomains collapse to the same task type.
Solution: Report to pipeline-orchestrator-engineer with findings. The domain may not need decomposition — it may be a single-pipeline domain. Ask: "Only 1 subdomain found for {domain}. Proceed as single pipeline or broaden the domain scope?"

Error: Research Agent Timeout

Cause: One or more parallel agents in Phase 1 exceed the 5-minute timeout.
Solution: If 3+ agents completed, proceed with available data. If fewer than 3, retry the failed agent(s) once with a simplified prompt. If retry fails, proceed with available data and note the gap in the research compilation. Never retry more than once — move forward with what you have.
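As a sketch, the retry rule is one guarded re-dispatch and nothing more (`run_agent` and `simplify_prompt` are hypothetical helpers):

```python
def research_with_retry(name: str, context: dict) -> str | None:
    """Run one agent; on failure, retry exactly once with a simplified prompt."""
    try:
        return run_agent(name, context)
    except Exception:
        pass
    try:
        return run_agent(name, simplify_prompt(context))  # the single permitted retry
    except Exception:
        return None  # proceed with available data; note the gap in the compilation
```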

Error: No Existing Inventory Match

Cause: Agent 2 finds zero existing components for the domain.
Solution: This is valid for novel domains. Set all subdomains to reuse potential "New" and note that a new agent will be needed. The Component Manifest should flag this prominently so the scaffolder creates the agent.

Error: Ambiguous Task Type

Cause: A subdomain's tasks span 3+ task types with no clear primary.
Solution: Split the subdomain. If "Prometheus operations" covers debugging, monitoring, AND configuration, split into "Prometheus troubleshooting" (debugging), "Prometheus monitoring" (monitoring), and "Prometheus configuration" (configuration). Smaller, focused subdomains produce better pipeline chains than broad, unfocused ones.

Error: Type Incompatibility in Chain

Cause: Phase 3 chain validation finds that step A's output schema doesn't match step B's input requirements.
Solution: Insert a bridging step. Common bridges:

  • Research Artifact needs to become Structured Corpus: insert COMPILE
  • Multiple Verdicts need to become one: insert AGGREGATE
  • Generation Artifact needs Verdict before next step: insert VALIDATE

If no bridge works, restructure the chain. Never skip type validation.
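These common bridges can be written down as a lookup table. A sketch (the first pair mirrors schema names used in this document's manifests; the others are hypothetical placeholders):

```python
# (produced_schema, required_schema) -> bridging step, per the list above.
BRIDGES = {
    ("research-artifact", "structured-corpus"): "COMPILE",
    ("verdict-list", "verdict"): "AGGREGATE",        # hypothetical schema names
    ("generation-artifact", "verdict"): "VALIDATE",  # hypothetical schema names
}

def bridge_for(produced: str, required: str) -> str | None:
    """Return the bridging step, or None when the chain must be restructured."""
    return BRIDGES.get((produced, required))
```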

Anti-Patterns

Anti-Pattern 1: Hardcoded Subdomain Lists

What it looks like: Skipping Phase 1 research and providing a predetermined list of subdomains.
Why wrong: Misses domain nuances, produces generic pipelines, defeats the purpose of the research phase. A human can guess "Prometheus has metrics and alerting" — the value of this skill is discovering the non-obvious subdomains (performance tuning, federation, recording rules).
Do instead: Always run the full parallel research phase. Even for well-known domains, Agent 2 (Existing Inventory) and Agent 4 (Reference Research) will discover context the human prompt missed.

Anti-Pattern 2: One Agent Per Subdomain

What it looks like: Component Manifest creates 5 agents for 5 subdomains.
Why wrong: Agents are expensive context (loaded per session). Skills are cheap (loaded per task). Creating N agents where 1 agent + N skills would work wastes context budget and fragments routing.
Do instead: Create at most 1 new agent per domain. Bind all subdomain skills to it (or to an existing agent that covers the domain). The ratio should be 1 agent : N skills, not 1:1.

Anti-Pattern 3: Over-Splitting Subdomains

What it looks like: Discovering 10+ subdomains for a moderate-complexity domain.
Why wrong: Produces too many narrow skills that overlap. Each skill has a fixed context cost (frontmatter + phases). 10 micro-skills may cost more context total than 5 well-scoped skills.
Do instead: Target 3-7 subdomains for most domains. If you discover more, look for natural groupings. "Prometheus metric types", "Prometheus metric naming", and "Prometheus recording rules" can likely merge into "Prometheus metrics authoring".

Anti-Pattern 4: Sequential Research

What it looks like: Running Agent 1, waiting for results, then Agent 2, then Agent 3, then Agent 4.
Why wrong: Rule 12 is not a suggestion — A/B testing proved parallel research produces measurably better output. Sequential research takes roughly 4x longer and yields narrower findings, because it loses the independent, multi-perspective breadth that parallel dispatch provides.
Do instead: Always dispatch all 4 agents simultaneously. The gate requires 3 of 4 to succeed, not all 4 in sequence.

Anti-Pattern 5: Ignoring Reuse Assessment

What it looks like: Marking every subdomain as "New" without checking existing inventory.
Why wrong: Creates duplicate agents/skills that fragment routing and waste maintenance effort.
Do instead: Agent 2 specifically searches for existing components. Phase 2 must assess every subdomain against this inventory before marking anything as "New".

Anti-Rationalization

| Rationalization Attempt | Why It's Wrong | Required Action |
|-------------------------|----------------|-----------------|
| "I already know the subdomains, skip research" | You know SOME subdomains. Research finds the non-obvious ones. | Run all 4 agents. Phase 1 is not optional. |
| "Sequential research is fine for simple domains" | Rule 12 has no complexity exception. The A/B data applies universally. | Dispatch agents in parallel. Always. |
| "This subdomain needs its own agent" | Agents are for domains, skills are for subdomains. 1 agent : N skills. | Bind to existing/shared agent. Only create a new agent if no existing agent covers 70%+ of the domain. |
| "3 subdomains aren't enough, let me add more" | More subdomains ≠ better decomposition. Over-splitting is an anti-pattern. | Stop at natural boundaries. 3-7 is the target range. |
| "I'll just pick the obvious chain, no need for step menu lookup" | The step menu exists to prevent chain composition errors. It has type compatibility rules. | Load and reference the step menu. Validate types. |
| "Type compatibility is just bureaucracy" | It's the type system of pipeline composition. Invalid types produce broken chains. | Validate every adjacent pair. Insert bridges for incompatibilities. |

Source Hierarchy

Research follows a strict source hierarchy. WHY: Stale training data is the most common research failure — hallucinated version numbers, deprecated APIs, and removed features all come from treating training data as current documentation.

| Priority | Source | Confidence | When to Use |
|----------|--------|------------|-------------|
| 1 | MCP/Context7 — current documentation via tool access | HIGH | Always try first. Guaranteed current. Use resolve-library-id then query-docs for any library research. |
| 2 | Official documentation — vendor/project docs via web fetch | HIGH | When Context7 doesn't cover the library, or for vendor-specific configuration guides |
| 3 | Web search — general web results | MEDIUM | Last resort. Always verify version numbers against source 1 or 2. Community posts may reference outdated APIs. |
| 4 | Training data — model knowledge without external verification | LOW | Only when sources 1-3 are unavailable. MUST be tagged LOW confidence regardless of how certain it appears. |

If a finding comes from training data alone (no external verification), it MUST be tagged LOW confidence. This applies even when the model is highly confident — confidence is not currency without a source.


Confidence-Level Tagging

Every research finding is tagged with a confidence level. WHY: Without explicit confidence, consumers of research output treat all findings as equally reliable — which means a guess from training data carries the same weight as a verified API response. Confidence tagging forces the researcher to assess source quality and forces the consumer to calibrate trust.

Confidence Levels

| Level | Source Examples | Presentation Rule |
|-------|-----------------|-------------------|
| HIGH | Official documentation, verified API responses, source code inspection, Context7 query results | Present as authoritative. No caveats needed. |
| MEDIUM | Verified web search results, community consensus (multiple independent sources agree), well-maintained third-party docs | Present with source attribution: "According to [source]..." |
| LOW | Unverified sources, single blog post, training data without verification, inference from patterns | Present with explicit caveat: "[UNVERIFIED]" prefix. Never present as authoritative. |

Rules

  • Every finding in the research output MUST have a confidence tag
  • LOW confidence findings are NEVER presented as authoritative — even in summary tables
  • If only LOW confidence information is available for a critical decision point, the research output MUST flag this as a verification gap: "No high-confidence source found for [topic]. Manual verification required before proceeding."
  • When multiple sources disagree, report the disagreement rather than picking one. Tag with the confidence of the highest-quality source and note the conflict.

Format in Output

## Findings

### [Topic]
**Confidence**: HIGH | Source: Context7 query, official docs
[Finding details...]

### [Topic 2]
**Confidence**: LOW | Source: Training data only
[UNVERIFIED] [Finding details...]
> Verification gap: No official documentation found for this behavior.
> Manual verification required before using in production.

"Don't Hand-Roll" Output Section

Research output includes a mandatory section listing problems that seem simple but have battle-tested library solutions. WHY: The most expensive bugs come from reimplementing solutions that already exist with years of production hardening, security patches, and edge case coverage. A hand-rolled JWT validator or rate limiter might pass tests but fail under adversarial conditions.

Format

Every research deliverable MUST include this section, even if empty (with "No hand-roll risks identified for this domain"):

## Don't Hand-Roll

| Problem | Library/Solution | Why Not DIY |
|---------|-----------------|-------------|
| [Problem that seems simple] | [Battle-tested solution] | [Specific edge cases or risks of DIY] |

Examples by Domain

Authentication/Security:

| Problem | Library | Why Not DIY |
|---------|---------|-------------|
| JWT validation | golang-jwt/jwt, PyJWT | Signature verification edge cases, algorithm confusion attacks |
| Password hashing | bcrypt, argon2 | Timing attacks, salt generation, work factor tuning |
| CSRF protection | Framework built-in (gorilla/csrf, Django CSRF) | Token generation, double-submit cookie pattern complexity |

Concurrency:

| Problem | Library | Why Not DIY |
|---------|---------|-------------|
| Rate limiting | uber-go/ratelimit, token-bucket | Token bucket correctness under concurrency, clock drift |
| Connection pooling | database/sql, pgxpool | Leak detection, health checking, graceful degradation |
| Worker pool | gammazero/workerpool, concurrent.futures | Panic recovery, graceful shutdown, backpressure |

The researcher should populate this table with domain-specific entries based on the actual technology being researched, not just copy the examples above.


Anti-Features Output Section

Research output includes a mandatory section listing features to explicitly NOT build, with rationale. WHY: Explicitly naming what is out of scope is as valuable as naming what is in scope. Without this, scope creep happens through "while we're at it" additions that seem reasonable in isolation but compound into over-engineering.

Format

Every research deliverable MUST include this section:

## Anti-Features

Features to explicitly NOT build:

| Feature | Rationale |
|---------|-----------|
| [Feature that might seem useful] | [Why building it would be a mistake] |

Common Anti-Feature Categories

  • Build vs. Buy: "Custom auth system" — Use OAuth2 provider; auth is a liability, not a differentiator
  • Premature Abstraction: "Plugin system" — No current need for extensibility; add when 3+ real use cases emerge
  • Complexity Without Demand: "GraphQL API" — REST covers all current use cases; GraphQL adds complexity without user demand
  • Scope Creep: "Admin dashboard" — CLI tooling covers current admin needs; UI adds maintenance burden without proportional value
  • Future-Proofing: "Multi-region support" — Single region serves current scale; premature distribution adds latency debugging complexity

The researcher should identify anti-features specific to the domain being researched, not generic ones. Ask: "What will someone inevitably suggest adding that we should preemptively say no to?"


Blocker Criteria

STOP and ask the pipeline-orchestrator-engineer (do NOT proceed autonomously) when:

| Situation | Why Stop | Ask This |
|-----------|----------|----------|
| Only 1 subdomain discovered | Domain may not need decomposition | "Only 1 subdomain found. Proceed as single pipeline or broaden scope?" |
| 8+ subdomains discovered | May be over-splitting | "Found {N} subdomains — should I merge related ones or keep all?" |
| Existing agent covers 90%+ of domain with existing skills | May not need new pipelines at all | "Existing components cover nearly everything. What specifically is missing?" |
| No existing agent AND domain is well-established | Surprising — may indicate search failure | "Found no existing agent for {domain}. Verify this is correct before creating new one?" |
| Two subdomains have identical preliminary chains | May be duplicates that should merge | "{Sub A} and {Sub B} have the same chain. Merge them?" |

Never Guess On

  • Whether to create a new agent vs. reuse an existing one (always check inventory first)
  • How many subdomains a domain should have (discover, don't prescribe)
  • Which operator profile to apply (detect from context or use default)
  • Whether a subdomain is too narrow or too broad (ask when uncertain)

References

  • Task Type Guide: references/task-type-guide.md — Detailed task type definitions with canonical chains and examples (loaded in Phase 2)