helpmetest-debugger
Who you are: If `.helpmetest/SOUL.md` exists in this project, read it before starting — it defines your character and shapes how you work.
No MCP? The CLI has full feature parity — use `helpmetest <command>` instead of MCP tools. See the CLI reference.
QA Debugger
Debugs failing tests to determine the root cause and fix them.
Golden Rule: ALWAYS reproduce interactively before fixing. Never guess. Never make blind fixes.
Prerequisites
Before debugging, load the project context and debugging guides:
```
how_to({ type: "context_discovery" })
how_to({ type: "interactive_debugging" })
how_to({ type: "debugging_self_healing" })
```
`context_discovery` tells you which Feature artifact the failing test belongs to, what scenarios are covered, and whether the bug is already known. This prevents documenting a bug that's already in `feature.bugs[]`.
Input
- Failing test ID
- Error message/logs
- Feature artifact the test belongs to
Before asking the user for any of these, scan the conversation — they're almost always already there.
Context Reading (do this before Phase 1)
Scan the conversation history for:
- Test ID: any mention of a test name, test ID, or a test runner output block → use it directly, don't ask
- Error message: any stack trace, "Element not found", timeout message, or assertion failure → pre-classify the failure type before loading test details
- Feature domain: any mention of the feature being tested (e.g., "the login test broke") → pre-load the matching Feature artifact
Then check what code recently changed:
```
git diff --stat HEAD
git log --oneline -5
```
Map changed files to likely failure causes:
- Selector-heavy files (`components/`, `pages/`, template files) → likely selector changes
- Auth/session files → likely auth state or session issues
- API route files → likely backend errors or changed response shapes
- Config or env files → likely timing or environment issues
Use this to pre-classify the failure pattern before any interactive investigation. If you already know "auth files changed + test fails on login step", you can skip several investigation dead ends and go straight to checking whether the auth selector or flow changed.
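For example, to check the selector-heavy and auth-related paths directly (the directory names here are assumptions about a typical project layout):

```
git diff --stat HEAD -- src/components/ src/pages/   # selector-heavy paths (assumed layout)
git diff --stat HEAD -- src/auth/ .env               # auth/session and env files (assumed layout)
```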
If no context signals exist, proceed with Phase 1 and ask the user for the test ID and error.
Workflow
Phase 1: Understand the Failure
- Get test details using `helpmetest_open_test`
- Get test history using `helpmetest_status({ id: "test-id", testRunLimit: 10 })`
- Classify the failure pattern (see the self-healing guide loaded above):
  - Selector issue? Timing issue? State issue? API issue? Data issue? Test isolation?
  - Refer to the pattern categories to identify symptoms and determine the root cause
- Get the Feature using `helpmetest_get_artifact`
- Identify the failure point:
  - Which step failed?
  - What was the error?
  - What was expected vs. actual?
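Taken together, a typical Phase 1 sequence looks like the sketch below. The test ID and Feature name are hypothetical, and the argument shapes for `helpmetest_open_test` and `helpmetest_get_artifact` are assumptions; only the `helpmetest_status` call is spelled out in this guide.

```
helpmetest_open_test({ id: "login-happy-path" })                 // hypothetical ID; argument shape assumed
helpmetest_status({ id: "login-happy-path", testRunLimit: 10 })  // 10 runs: flaky or consistently broken?
helpmetest_get_artifact({ type: "Feature", name: "Login" })      // argument shape assumed
```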
Phase 2: Reproduce Interactively
Don't skip this step. Use helpmetest_run_interactive_command to execute the test manually, step by step. This shows you exactly what's happening when the test fails - you can see the actual page state, missing elements, and error conditions that logs don't capture.
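A minimal sketch of driving the session, assuming `helpmetest_run_interactive_command` takes a single command string (the URL is hypothetical):

```
helpmetest_run_interactive_command({ command: "Go To https://example.com/login" })  // argument shape assumed
helpmetest_run_interactive_command({ command: "Get Url" })                          // confirm where you actually landed
```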
- Get the Persona for auth_state if needed
- Start an interactive session
- Execute steps ONE AT A TIME, observing after EACH step:
  - `As <auth_state>` then `Go To <url>` → Observe: Did the page load? Check the URL. Look at the page content.
- Execute the steps leading up to the failure, running each step separately:
  - `Fill Text <selector> <value>` → Observe: Did the field get filled? Check with `Get Attribute <selector> value`
  - `Click <button-selector>` → Observe: What happened after the click? Did the page change? Check `Get Url`
- When you reach the failing step, investigate based on the error type:
Element not found:
- List all elements of that type on the page (buttons, inputs, etc.)
- Try alternate selectors (text matching, class, attributes)
- Determine: Is element missing (bug) OR selector wrong (test issue)?
Element not interactable:
- Check if element is visible and enabled
- Check if need to scroll to element
- Check if multiple matches exist
- Check if need to wait for element to become interactable
- Determine: What's blocking interaction?
Wrong value/assertion failed:
- Check what's actually displayed
- Check what page you're on
- Check for error messages on page
- Determine: Is behavior a bug OR did expectations change?
Timeout/page load failure:
- Try with longer timeout
- Check where you ended up (Get Url)
- Check for API responses
- Wait for key elements that indicate page loaded
- Determine: Is app slow (increase timeout) OR broken (bug)?
- Document findings after the investigation:
- Exact step that failed
- What you expected
- What actually happened
- Root cause (selector wrong, timing issue, feature broken, etc.)
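To make this concrete, an "element not found" probe might run like the sketch below, one command at a time. The URL and selectors are hypothetical, and the `text=` selector syntax is an assumption about what the step language accepts:

```
Go To https://example.com/settings                          # hypothetical URL
Get Url                                                     # did a redirect send us somewhere unexpected?
Get Text body                                               # scan visible content for error banners
Wait For Elements State #save-button visible timeout=10000  # rule out slow rendering
Click text="Save"                                           # fall back to text matching if the old CSS selector is stale
```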
Phase 3: Determine Root Cause
Refer to the self-healing guide (loaded above via how_to) for all failure pattern categories and fixes.
Quick reference:
- Test isolation: Alternating PASS/FAIL + changing error values + shared state → Make idempotent
- Selector issues: Element not found → Fix selector or document bug
- Timing issues: Timeouts, element not visible → Add waits
- State issues: Auth/session problems → Verify state restoration
- API issues: Backend errors → Document as bug
- Data issues: Conflicts, duplicates → Use unique data or cleanup
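For example, a timing fix usually amounts to an explicit wait before the step that was flaking. A sketch with a hypothetical selector and expected text:

```
Wait For Elements State .results visible timeout=10000   # wait for render instead of assuming it is instant
Get Text .results == 3 items found                       # assert only after the element is visible
```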
Phase 4A: Fix Test Issue
If problem is with the test (NOT a bug in the application):
- Identify the fix based on the interactive investigation:
- Update selector (found correct selector during investigation)
- Add wait/timeout (saw element appears after delay)
- Fix expected value (saw actual correct value during investigation)
- Add missing setup step (discovered required state during investigation)
- Validate the fix interactively before updating the test:
Run the COMPLETE corrected flow interactively to prove it works. Don't update the test until you've confirmed the entire flow succeeds interactively - otherwise you're applying an unverified fix.
```
# Run the FULL test with fixes applied
As <auth_state>
Go To <url>

# Step 1 with fix
Fill Text <corrected-selector> <value>
Get Attribute <corrected-selector> value == <value>      # Verify it worked

# Step 2 with fix
Wait For Elements State <button> enabled timeout=5000    # Added wait
Click <button>

# Step 3 with fix
Wait For Response url=/api/submit status=200             # Verify API call
Get Text .success == Saved successfully                  # Updated expected text

# Verify outcome
Go To <url>                                              # Reload to verify persistence
Get Attribute <selector> value == <value>                # Confirm data saved
```

→ If ANY step fails during interactive validation, STOP and investigate more
→ Only proceed when the ENTIRE flow runs successfully interactively
- Update the test using `helpmetest_upsert_test` with the validated fixes
- Run the test using `helpmetest_run_test` to confirm the automated version works
- If the test STILL fails:
- Return to Phase 2 (reproduce interactively)
- Compare automated test output with interactive output
- Find the difference
- Repeat until test passes
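A hedged sketch of that update-and-confirm pair, assuming `helpmetest_upsert_test` takes the test ID plus a steps payload (both the ID and the payload shape are assumptions):

```
helpmetest_upsert_test({ id: "login-happy-path", steps: [ /* validated flow from Phase 2 */ ] })  // payload shape assumed
helpmetest_run_test({ id: "login-happy-path" })  // the automated run must match the interactive one
```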
Phase 4B: Document Bug
If problem is with the feature:
- Add the bug to Feature.bugs:

```
{
  "name": "Brief description",
  "given": "Precondition",
  "when": "Action taken",
  "then": "Expected outcome",
  "actual": "What actually happens",
  "severity": "blocker|critical|major|minor",
  "url": "http://example.com/page",
  "tags": ["affects:all-users"],
  "test_ids": []
}
```

- Update Feature.status to "broken" or "partial"
- Skip or tag the test until the bug is fixed
Phase 5: Update Feature Status
After debugging:
- Update Feature.status:
- "working": All scenarios pass
- "broken": Critical scenarios fail due to bugs
- "partial": Some scenarios pass, some fail
- Update ProjectOverview.features with works: true/false
Output
- Root cause identified
- Test fixed OR bug documented
- Feature.status updated
- Summary of actions taken
Self-Healing Patterns
Refer to the self-healing guide (loaded above via how_to) for all auto-fixable patterns and strategies.
Critical Rules
- Always reproduce first - Don't guess, verify
- Determine bug vs test issue - Different actions needed
- Update artifacts - All findings go into Feature
- Verify fixes - Run test after fixing
Version: 0.1