debug-root-cause
Debug Root Cause
Overview
Use this skill to move from symptom to confirmed root cause with minimal guesswork and auditable evidence.
Scope Boundaries
- Use this skill when the task matches the trigger condition described in `description`.
- Do not use this skill when the primary task falls outside this skill's domain.
- If available, prefer `github-fix-ci` when the primary symptom is a GitHub Actions check failure.
- If available, prefer `git-bisect-debugging` when the unresolved question is "which commit introduced the regression?"
- If available, prefer `performance-profiling` when profiler evidence and optimization prioritization are the main objective.
- If available, prefer `security-incident-response` for active compromise triage/containment/eradication.
- If available, prefer `incident-postmortem` when incident stabilization is complete and the task is retrospective analysis.
Shared References
- Strategy matrix: `references/debug-strategy-matrix.md`
Templates And Assets
- Session log template: `assets/debug-session-log-template.md`
- Fix verification checklist: `assets/debug-fix-verification-checklist.md`
Inputs To Gather
- Stable reproduction steps and failure evidence (logs, traces, failing tests, timestamps).
- Expected behavior and actual behavior, with the mismatch stated explicitly.
- Suspected boundary (module/service/config/dependency) and recent changes.
- Runtime constraints (environment, dataset, flags, concurrency/load profile).
- Success condition that proves the issue is fixed.
Deliverables
- Confirmed root cause statement with reproducible evidence.
- Implemented fix tied to the causal chain.
- Verification evidence that the failure no longer reproduces.
- Residual risks and follow-up items outside current scope.
Workflow
- Stabilize reproduction and capture a baseline using `assets/debug-session-log-template.md`.
- Define the failure contract: expected vs actual behavior, first observable break, and affected boundary.
- Build hypotheses and prioritize by likelihood, blast radius, and experiment cost.
- Run controlled experiments by changing one variable at a time; record outcomes.
- Confirm root cause by demonstrating, where safe, that the failure disappears when the suspected cause is removed and returns when it is reintroduced.
- Select a remediation scope that matches project reality: required behavior, current architecture, operational constraints, and delivery risk.
- Implement a minimal root-cause fix; avoid symptom-only guards, speculative hardening, hidden defaults, and unrelated refactors.
- Verify using the checks in `assets/debug-fix-verification-checklist.md`.
- Publish an investigation summary with evidence, implementation rationale, and follow-up actions.
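The controlled-experiment and confirmation steps above can be sketched roughly as follows. This is an illustration only, not part of the skill: the function `one_variable_experiments`, the `fake_reproduces` harness, and the configuration keys are all hypothetical stand-ins for whatever reproduction mechanism a real investigation uses.

```python
from typing import Any, Callable, Dict, List

Config = Dict[str, Any]


def one_variable_experiments(
    baseline: Config,
    candidates: Config,
    reproduces: Callable[[Config], bool],
) -> List[str]:
    """Change one variable at a time, starting from the failing baseline.

    Returns the variables whose change alone makes the failure disappear
    (removal) and whose restoration brings it back (reintroduction).
    """
    if not reproduces(baseline):
        raise ValueError("baseline does not reproduce the failure; stabilize it first")

    confirmed = []
    for name, new_value in candidates.items():
        trial = dict(baseline)
        trial[name] = new_value            # change exactly one variable
        removed = not reproduces(trial)    # is the failure gone with this change?
        trial[name] = baseline[name]       # restore the original value
        reintroduced = reproduces(trial)   # is the failure back after restoring?
        print(f"{name}: removed={removed} reintroduced={reintroduced}")  # record outcome
        if removed and reintroduced:
            confirmed.append(name)
    return confirmed


# Hypothetical harness: the failure occurs only when the cache is enabled
# together with more than one worker.
def fake_reproduces(config: Config) -> bool:
    return bool(config["cache_enabled"]) and config["workers"] > 1


failing_baseline = {"cache_enabled": True, "workers": 4, "log_level": "info"}
alternatives = {"cache_enabled": False, "workers": 1, "log_level": "debug"}

suspects = one_variable_experiments(failing_baseline, alternatives, fake_reproduces)
```

In this made-up scenario two variables are flagged, which suggests an interaction between them rather than a single independent cause. A pass like this narrows the suspects; the causal chain still has to be explained and backed by evidence before the root cause counts as confirmed.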
Remediation Fit Guardrails
- Prefer the smallest change that reliably resolves the observed failure mode.
- Match strictness and complexity to explicit requirements and real operational risk.
- Avoid adding framework-level abstractions or future-proofing not required by current constraints.
- Keep compatibility decisions explicit; do not retain legacy paths unless required by active consumers.
- When multiple fixes are possible, choose the option with the lowest long-term maintenance cost at acceptable risk.
Quality Standard
- Reproduction is stable enough for repeated validation.
- Root cause is demonstrated by evidence, not by inference alone.
- Implemented fix scope is proportional and targets the causal path directly.
- Regression/edge checks are updated for the discovered failure mode.
- Logs/metrics/error surfaces remain actionable after the fix.
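One way to judge whether a reproduction is "stable enough for repeated validation" is to measure how often it actually fails over repeated runs. The sketch below is a minimal illustration under stated assumptions: the helper names, the run count, and the 0.95 threshold are hypothetical choices, not values prescribed by this skill.

```python
from typing import Callable


def reproduction_rate(attempt: Callable[[], bool], runs: int = 20) -> float:
    """Run the reproduction `runs` times and return the fraction that failed.

    `attempt` returns True when the failure was observed on that run.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    failures = sum(1 for _ in range(runs) if attempt())
    return failures / runs


def is_stable(rate: float, threshold: float = 0.95) -> bool:
    """Treat the reproduction as stable when it fails at least `threshold` of the time.

    The threshold is an assumed value; pick one that fits the investigation.
    """
    return rate >= threshold


# Hypothetical flaky reproduction: the failure shows up on 9 of every 10 runs.
calls = {"n": 0}


def flaky_attempt() -> bool:
    calls["n"] += 1
    return calls["n"] % 10 != 0


rate = reproduction_rate(flaky_attempt, runs=20)
stable = is_stable(rate)
print(f"failure rate={rate:.2f} stable={stable}")
```

A rate well below the chosen threshold is a signal to keep stabilizing the reproduction (pinning inputs, load, or timing) before treating any fix verification as trustworthy.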
Failure Conditions
- Stop when reproduction is nondeterministic and cannot be stabilized.
- Stop when required environment or data access is unavailable.
- Escalate when issue ownership crosses teams or requires architectural change.