helpmetest-troubleshoot
Who you are: If `.helpmetest/SOUL.md` exists in this project, read it before starting — it defines your character and shapes how you work.
No MCP? The CLI has full feature parity — use `helpmetest <command>` instead of MCP tools. See the CLI reference.
Troubleshoot
Fast triage when something broke and you don't know what. The goal is to get from "something is wrong" to "here's what's wrong and what to do about it" in one pass — not to fix everything, but to diagnose clearly and hand off to the right tool.
Philosophy
Don't investigate deeply. Investigate broadly and quickly, then narrow. You're looking for the signal that explains the most failures with the fewest assumptions. Once you have a plausible cause, stop gathering data and state your diagnosis.
Step 1: Capture What the User Knows
Before gathering data, extract anything the user already mentioned:
- Specific error messages or symptoms → note them, they're your first leads
- A time when it last worked ("worked an hour ago", "broke after the deploy") → anchor for checking recent changes
- A URL, service, or feature ("the checkout page", "the login flow") → narrows scope
- Whether tests, the live site, or both are affected → determines where to look first
Step 2: Check Current Failure State
Get the full picture of what's failing right now:
```
helpmetest_status({ verbose: true })
```
Look for:
- How many tests are failing? (1-2 suggests a specific breakage; many suggests a systemic change)
- Do the failing tests share a `feature:X` tag? (points to a specific feature regressing)
- Are health checks failing? (points to infrastructure/deployment)
- When did the failures start? (correlates with code changes)
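If you're working through the CLI instead of MCP, the equivalent check is presumably a `status` subcommand. A minimal sketch, assuming the subcommand and flag spellings mirror the MCP tool name (confirm against the CLI reference):

```bash
# Assumed CLI spelling of helpmetest_status({ verbose: true });
# the subcommand and --verbose flag are guesses from the MCP tool name.
helpmetest status --verbose
```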
Step 3: Check Recent Changes
```bash
git log --oneline -10
git diff --stat HEAD~3..HEAD
```
Look for:
- Commits in the last few hours/days (especially deploys, dependency bumps, config changes)
- Which file paths changed → map to feature domains
- Whether the changed domains match the failing test tags
If no git repo or no recent commits, skip this step.
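To map changed paths to feature domains quickly, grouping changed files by top-level directory is usually enough. A minimal sketch using standard git and coreutils (the `HEAD~3` window mirrors the diff above):

```bash
# Count files changed in the last 3 commits per top-level directory,
# most-churned first -- a rough proxy for which feature domains moved.
git diff --name-only HEAD~3..HEAD | cut -d/ -f1 | sort | uniq -c | sort -rn
```

If the busiest directories line up with the failing tests' `feature:X` tags, that's the correlation signal step 4 is looking for.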
Step 4: Correlate and Diagnose
Connect what you found in steps 2 and 3. The most common patterns:
| Observation | Likely cause |
|---|---|
| Many tests failing + recent deploy | Deployment broke something systemic (auth, DB, env config) |
| Tests failing in one feature + matching files changed | Regression from a specific code change |
| Health checks failing + server errors in tests | Infrastructure issue (down service, bad config) |
| Tests alternating pass/fail + no recent changes | Test isolation issue (shared state, ordering) |
| All tests failing + no code changes | Environment issue (proxy down, credentials expired, service outage) |
| One test failing + no pattern to other failures | Likely a selector change or timing issue in that specific test |
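To test the "matching files changed" rows concretely, you can filter recent commits by a pathspec built from the failing feature tag. A sketch, where `checkout` is a hypothetical stand-in for whatever `feature:X` tag the failing tests share:

```bash
# List recent commits that touched any path mentioning the failing feature.
# "checkout" is a hypothetical example; substitute the actual feature tag.
git log --oneline --since="2 days ago" -- '*checkout*'
```

An empty result here weakens the regression hypothesis and points back toward the environment or isolation rows of the table.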
State your diagnosis explicitly:
```
Diagnosis: [one sentence on most likely cause]
Evidence:
- [X] tests failing, all tagged feature:checkout
- Last commit 2h ago changed checkout/OrderSummary.jsx
- Failure started around the same time as the commit
Confidence: high / medium / low
```
Step 5: Propose Next Steps
Based on the diagnosis, hand off to the right skill:
If it's a specific test failure (1-3 tests, clear error):
→ "Let me debug this with /helpmetest-debugger"
If it's many tests failing after a code change (systemic regression):
→ "This looks like a regression from [commit]. Use /helpmetest-self-heal to fix the affected tests, or roll back the change."
If it's infrastructure/health (health checks down, server errors):
→ "The issue looks like it's at the infrastructure level — tests can't reach the app. Check [service/proxy/env]. Once the service is back, run the tests again."
If it's a deployment issue:
→ "The deploy at [time] likely introduced this. Consider rolling back [commit], or use /helpmetest-debugger on [specific failing test] to confirm."
If the cause is unclear (low confidence):
→ "I'm not certain, but the most likely lead is [X]. Want me to dig deeper into [specific area]?"
Output
A triage report:
```
## Triage Report
**What's failing:** [N tests / health checks / both]
**Feature area:** [which features, if identifiable]
**Recent changes:** [key commits / file areas]
**Diagnosis:** [one sentence]
**Confidence:** high / medium / low
**Next step:** [specific action or skill to use]
```
Keep it short. This is a diagnosis, not an investigation report. If the user needs more detail, they can follow up with the appropriate specialist skill.
More from help-me-test/skills
- **helpmetest**: Full site QA — discover, enumerate features, write and run tests, report bugs. Use when user says 'test this site', 'qa this', 'check site', 'find bugs', or provides a URL and wants comprehensive coverage. This is the orchestrator — it covers everything from first visit through final report.
- **helpmetest-self-heal**: Autonomous test maintenance agent. Monitors test failures and fixes them automatically. Always use this when tests start failing after a UI or code change — it's far more systematic than trying to fix tests manually one by one. Use when user mentions 'fix failing tests', 'heal tests', 'auto-fix', 'monitor test health', 'tests broke after deploy', or the test suite has multiple failures needing systematic repair. Distinguishes fixable test issues (selector changes, timing) from real application bugs.
- **helpmetest-debugger**: When a test is broken and you need to know why, use this skill. It handles: a named test failing with an error message (element not found, selector not found, timeout), determining whether the failure is a bug in the app vs. a broken test, a test that passes locally but fails on CI, and multiple tests regressing after a deploy or PR merge. The distinguishing signal: the user has a failing test and wants root cause, not just a re-run. Do not use for: writing new tests, running test suites, exploring features, or reviewing test logic in the abstract.
- **helpmetest-validator**: Invoke this skill when a user shares test code and questions whether it actually works as intended — not to run or fix the test, but to evaluate whether the test has real value. Triggers on: "is this test any good?", "would this catch a real bug?", "this test always passes — is that normal?", "review these tests before I commit", or "does this test verify anything meaningful?". Also triggers when someone suspects a test is useless, wants a pre-commit quality gate, or is unsure if an auto-generated test is worth keeping. The core question this skill answers: "Would this test fail if the feature broke?" If not, the test gets rejected. Do NOT use for generating new tests, fixing failing tests, or exploring application features.
- **helpmetest-visual-check**: Instant visual verification via screenshots. For quick checks like 'does button look blue', 'is layout centered', 'header look right on mobile'. Fast alternative to formal testing — just look and confirm. Use when user wants visual inspection without creating test files.
- **fix-tests**: Everything wrong with your tests — use this skill. Triggers on: 'test is failing', 'fix this test', 'why is this test red', 'test keeps failing', 'tests broke after deploy', 'tests broke after the update', 'something broke', 'it was working before', 'tests are flaky', 'test passes sometimes fails sometimes', 'are these tests any good', 'is this test worth keeping', 'review these tests', 'tests are out of date', 'tests don't match the code anymore', 'tests are stale after refactor', 'I think my tests are wrong', 'tests pass but I don't trust them', 'can you check if my tests actually work'. Picks the right mode automatically: triage (don't know what's wrong), debug (one failure, root cause), heal (bulk failures after deploy), sync (drift audit after refactor), validate (quality scoring — would this test catch a real bug?). You don't need to know the mode. Just say what's wrong.