helpmetest-test-generator
Who you are: If .helpmetest/SOUL.md exists in this project, read it before starting — it defines your character and shapes how you work.
No MCP? The CLI has full feature parity — use helpmetest <command> instead of MCP tools. See the CLI reference.
QA Test Generator
Generates tests for ONE feature at a time.
Scope: This skill works on a SINGLE FEATURE. Use /helpmetest for comprehensive site testing.
This skill is for PHASE 3 (Test Generation) - use it only after feature discovery is complete.
Why this matters: Tests written without complete scenario enumeration end up being blind guesses. You need to know what SHOULD happen (from interactive exploration) before you can write a test that verifies it happens.
Prerequisites before using this skill:
- ✅ ALL features have been discovered
- ✅ ALL scenarios have been enumerated (functional + edge_cases + non_functional)
- ✅ Features have been explored interactively to discover test scenarios
- ✅ Phase 2 (Feature Enumeration) is 100% complete
If you're still discovering features or enumerating scenarios, use /helpmetest-discover instead.
Prerequisites
Before generating tests, load the quality standards and debugging guidance. These define what makes a good test vs a bullshit test, and how to fix failures when they happen.
Call these before starting:
how_to({ type: "test_quality_guardrails" })
how_to({ type: "tag_schema" })
how_to({ type: "interactive_debugging" })
Input
- Feature artifact ID (e.g., "feature-checkout") - optional if it can be inferred from context
Workflow
Phase 0: Context Discovery
Check for existing work before asking the user for input. This prevents redundant questions and lets you resume where you left off.
Call how_to({ type: "context_discovery" }) to see what's already been done.
Before asking the user which feature to test, read conversation and code context to propose the right one:
a) Scan the conversation history for:
- Feature or component names mentioned (e.g., "I fixed the registration form") → that's the feature to test
- Bug descriptions or error messages → regression test for whatever was just broken
- A URL or page the user was working on → look for a Feature artifact covering that page
b) Check uncommitted code changes:
git status --short
git diff --stat HEAD
Map changed file paths to a feature domain (e.g., auth/ → auth, checkout/ → checkout, components/ProfileForm.jsx → profile). Then search for an existing Feature artifact with a matching feature:X tag using helpmetest_search_artifacts.
c) If a matching Feature artifact is found → propose it:
Based on our conversation, you were working on [what you saw].
I found Feature artifact [name] — want me to generate tests for it?
d) If no matching Feature artifact exists → propose creating one first:
I don't see a Feature artifact for [domain] yet.
I'd suggest: first run /helpmetest-discover to enumerate the scenarios, then come back here to generate tests.
Or I can create a basic Feature artifact now based on what we discussed — want me to do that?
e) If no context signals at all → fall back to standard selection:
- Prioritize features with status: "untested"
- Find scenarios with empty test_ids arrays
- Sort scenarios by priority - Test priority:critical scenarios first. These are the end-to-end flows that prove the core business value works. If you test 10 partial scenarios but skip the 1 critical end-to-end flow, you've missed the most important verification.
- Validate coverage before claiming done - Before saying "all features tested", check that ALL priority:critical scenarios have test_ids. Otherwise you're claiming coverage you don't have.
- If user says "continue"/"next" → auto-select the first untested feature with critical scenarios
Phase 1: Understand the Feature
- Read the Feature artifact using helpmetest_get_artifact
- Extract:
- goal - What must work?
- persona_ids - Who can use this? (get auth_state from Persona)
- functional - Main scenarios (Given/When/Then)
- edge_cases - Boundary conditions
- bugs - Known broken scenarios (skip these)
Phase 1.5: Explore Before Writing
Before generating any test, run the scenario interactively using helpmetest_run_interactive_command. This isn't optional cleanup for vague scenarios — it's the default workflow for every scenario.
Why: a test written from a scenario description is a hypothesis. A test written after you ran it is a specification. The second kind uses real selectors, reflects actual UI behavior, and fails for the right reasons.
For each scenario (starting with priority:critical):
- Authenticate — As <auth_state> to establish the session
- Navigate — Go To <scenario.url> and observe what loads
- Execute the Given — establish the precondition, confirm it's in place
- Execute the When — perform the action, observe what happens next
- Verify the Then — check the outcome. Try the actual assertion selectors.
- For edge cases — try the invalid input, observe the exact error message and selector
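To make this concrete, here is a minimal sketch of such a session for a hypothetical "User can update profile display name" scenario, with each line sent through helpmetest_run_interactive_command. The auth state, URL, selectors, and the Get Text keyword are illustrative assumptions; use whatever the real page and Persona artifact provide.
# Hypothetical exploration, one command at a time via helpmetest_run_interactive_command
As    logged_in_user                          # auth_state from the Persona artifact (assumed name)
Go To    /account/profile                     # observe what actually loads
Fill Text    input[name=display_name]    QA Explorer
Click    button >> "Save"
Get Text    [data-testid=toast]               # note the real confirmation text; it may say "Saved", not "Success"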
Email/registration fields: Use Create Fake Email or Create Email And Fill — never test@example.com. Hardcoded emails break on second run (account already exists), making exploration results unreliable.
After each interactive run, you have:
- Confirmed selectors (not guessed data-testid attributes that may not exist)
- Actual timing behavior (know where waits are needed)
- The real outcome text (not assumed "Success" when it says "Saved")
If the backend is unavailable: Fall back to extracting selectors from existing passing tests for the same feature (via helpmetest_status with verbose flag). This is the fallback, not the default.
Phase 2: Generate Tests for Scenarios
Test generation order matters:
- Generate priority:critical scenarios FIRST - These are end-to-end workflows that verify core business value
- Then generate priority:high scenarios - Important but not mission-critical
- Finally generate priority:medium/low scenarios - Nice-to-have coverage
Why this order? Critical scenarios prove the feature actually works for users. If you test 10 edge cases but never verify the happy path works end-to-end, you've missed the point. Test the core transaction first, then fill in the edges.
Critical vs. Partial test distinction:
- ❌ PARTIAL test - "User can view form page" (just navigation, verifies form fields exist)
- ✅ CRITICAL test - "User completes the workflow" (full end-to-end transaction from start to confirmation)
Before claiming a feature is "tested":
- ALL priority:critical scenarios need test_ids populated
- If ANY critical scenario is untested, the feature status is "untested"
- Don't generate 10 partial tests and skip the 1 critical end-to-end flow - that's false coverage
For each functional scenario (starting with critical), create a test using helpmetest_upsert_test:
*** Test Cases ***
<scenario.name>
    [Documentation]    Given: <given> | When: <when> | Then: <then>
    [Tags]    priority:high    feature:<feature_id>    project:<project_id>
    # Setup - authenticate if needed
    As    <auth_state>
    Go To    <scenario.url>
    # Given - establish precondition
    <steps to establish 'given' state>
    # When - perform action
    <steps for 'when' action>
    # Then - verify outcome
    <assertions for 'then' expected outcome>
Test naming rules:
- NO project/site names in test names (e.g., NOT "EverShop User Login")
- Use feature descriptions from scenario.name (e.g., "User can login with valid credentials")
- Format: <Actor> can <action> <object> OR <Feature> <behavior>
- Examples:
- ✅ "User can update profile email"
- ✅ "Cart total updates when quantity changes"
- ✅ "Registration validates email format"
- ❌ "EverShop Registration Test"
- ❌ "QA Playground Login"
- ❌ "SiteName Profile Update"
Test requirements:
- 5+ meaningful steps
- Verify actual business outcomes
- Use stable selectors
- Include relevant assertions
- Use Create Fake Email for registration/email tests (never hardcode emails)
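To show what these requirements look like in practice, here is a hedged sketch of a filled-in critical test for a hypothetical registration feature. The URL, selectors, password value, and the Get Text assertion are assumptions; substitute the values confirmed during interactive exploration.
*** Test Cases ***
User Can Register With Valid Email
    [Documentation]    Given: a visitor on the registration page | When: they submit valid details and confirm their email | Then: the account is created and a welcome message is shown
    [Tags]    priority:critical    feature:registration    project:<project_id>
    # Given - a fresh visitor with a unique email
    ${email}=    Create Fake Email
    Go To    /register
    # When - submit the registration form
    Fill Text    input[name=email]    ${email}
    Fill Text    input[name=password]    Str0ng-Passw0rd!
    Click    button >> "Register"
    # Then - complete verification and check the business outcome, not just page structure
    ${code}=    Get Email Verification Code    ${email}
    Fill Text    input[name=code]    ${code}
    Click    button >> "Confirm"
    Get Text    [data-testid=welcome-banner]    contains    Welcome
    # Cleanup
    Delete Email    ${email}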
Phase 3: Generate Edge Case Tests
For critical edge_cases:
*** Test Cases ***
Edge Case: <scenario.name>
    [Documentation]    Given: <given> | When: <when> | Then: <then>
    [Tags]    priority:medium    feature:<feature_id>    project:<project_id>
    # Setup and execute edge case scenario
    ...
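As a sketch, an edge-case test for a hypothetical "Registration rejects invalid email format" scenario might look like this; the selectors and the exact error text are placeholders for what exploration actually showed.
*** Test Cases ***
Edge Case: Registration Rejects Invalid Email Format
    [Documentation]    Given: a visitor on the registration page | When: they submit a malformed email | Then: a validation error is shown and the form is not accepted
    [Tags]    priority:medium    feature:registration    project:<project_id>
    Go To    /register
    # When - submit a malformed email
    Fill Text    input[name=email]    not-an-email
    Fill Text    input[name=password]    Str0ng-Passw0rd!
    Click    button >> "Register"
    # Then - assert the exact error message observed during exploration
    Get Text    [data-testid=email-error]    contains    valid email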
Phase 4: Validate Test Quality
Every test needs validation before running. This catches bullshit tests early - tests that verify page structure instead of business functionality, or tests that would pass even if the feature is broken.
Validate each test using /helpmetest-validator:
- Pass test ID and feature ID to validator
- Validator checks:
- Business value (would test fail if feature broken?)
- No bullshit patterns (element counting, click without verification)
- Proper assertions and state verification
- If REJECTED:
- Read validator feedback
- Fix test based on specific issues
- Validate again
- Repeat until PASSED
- If PASSED:
- Continue to Phase 5 (Link Tests to Scenarios)
Don't skip validation - running unvalidated tests wastes time when they fail for the wrong reasons or pass when the feature is broken.
Phase 5: Link Tests to Scenarios
Update Feature artifact - add test_id to each scenario's test_ids:
{
  "functional": [
    {
      "name": "Create new item",
      "given": "...",
      "when": "...",
      "then": "...",
      "test_ids": ["test-create-new-item"]
    }
  ]
}
Phase 6: Run and Debug Tests
- Run the test using helpmetest_run_test
- If the test PASSES:
  - ✅ Move to next scenario
  - Update scenario status
- If the test FAILS - use /helpmetest-debugger: Don't guess or make blind fixes. The debugger skill reproduces failures interactively, identifies whether the problem is a test issue (fixable) or an application bug (document in Feature.bugs), and validates fixes before applying them. Pass it: the failing test ID, the error message, and the Feature artifact ID.
- Update Feature.status based on results:
- All tests pass → "working"
- All tests fail due to bugs → "broken"
- Mixed → "partial"
Output
- Test files created via helpmetest_upsert_test
- All tests debugged and fixed OR bugs documented
- Feature artifact updated with test_ids on scenarios
- Feature.bugs[] populated for application bugs
- Feature.status updated
- Summary of tests generated, pass rates, bugs found
Test Quality Requirements
Before creating tests, review what makes a test valuable vs worthless. This prevents writing tests that pass when the feature is broken, or tests that only verify page structure instead of functionality.
Load the quality standards:
how_to({ type: "test_quality_guardrails" })
This includes:
- ❌ Bullshit test anti-patterns (element counting, click without verification, etc.)
- ✅ Real test patterns (complete workflows with outcome verification)
- Quality checklist (5+ steps, business outcomes, state verification)
- Red flags to avoid
Quality Checklist
Before creating ANY test, answer:
- "What business capability does this test verify?"
- "If this test passes but the feature is broken, is that possible?"
- If YES → Test is bullshit, rewrite it
- If NO → Test is valid
A good test includes:
- 5+ meaningful steps (not just navigation + counting)
- Verifies business outcome (data saved, filter applied, order created)
- Includes state change verification (before/after comparison OR API response check)
- Uses stable selectors (data-testid preferred)
- Has proper assertions that would FAIL if feature broken
- Was validated interactively first
A bullshit test has:
- Only navigation + element counting
- Click without verifying result of click
- Form display without testing form submission
- Success message verification without checking data persisted
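To make the contrast concrete, here is a hedged sketch of the same capability tested both ways. It assumes Browser-library-style assertion keywords (Get Element Count, Get Text) alongside the As / Fill Text / Click keywords used elsewhere in this document; the auth state and selectors are illustrative only.
*** Test Cases ***
# ❌ Bullshit: only proves the form renders; it would still pass if saving were broken
User Can See Profile Form
    Go To    /account/profile
    Get Element Count    input[name=email]    >    0

# ✅ Real: completes the workflow and verifies the state change persisted
User Can Update Profile Email
    As    logged_in_user
    Go To    /account/profile
    ${email}=    Create Fake Email
    Fill Text    input[name=email]    ${email}
    Click    button >> "Save"
    # Before/after verification: reload and confirm the new value was actually saved
    Go To    /account/profile
    Get Text    [data-testid=profile-email]    ==    ${email}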
FakeMail Keywords
For registration and email verification tests, use FakeMail:
# Generate unique test email
${email}=    Create Fake Email
# Returns: adventure.acme@fakemail.helpmetest.com
# Use in registration
Fill Text    input[name=email]    ${email}
Click    button >> "Register"
# Get verification code from email
${code}=    Get Email Verification Code    ${email}
Fill Text    input[name=code]    ${code}
# Cleanup after test
Delete Email    ${email}
Always use Create Fake Email instead of hardcoding emails. Hardcoded emails cause conflicts when tests run in parallel or multiple times - the second run fails because the email already exists.
Critical Rules
- No bullshit tests - "Click button, verify page" is NOT a test
- Test interactively first - Never create blind tests
- Given/When/Then - Map scenario to test steps
- Link to scenario - Add test_id to scenario.test_ids
- Use FakeMail - Never hardcode test emails
- Tag properly - All tests need a priority: tag at minimum
Version: 0.1
More from help-me-test/skills
helpmetest
Full site QA — discover, enumerate features, write and run tests, report bugs. Use when user says 'test this site', 'qa this', 'check site', 'find bugs', or provides a URL and wants comprehensive coverage. This is the orchestrator — it covers everything from first visit through final report.
tdd
Everything to do with tests on HelpMeTest. Use when: writing tests for a new feature, generating tests for an existing feature, fixing a broken test, debugging a failing test, tests broke after a UI change, tests are out of date after a refactor. Triggers on: 'write tests', 'generate tests', 'test is failing', 'fix tests', 'tests broke', 'implement X', 'add feature', 'fix bug', 'why does this test fail', 'tests are out of date'. If it involves HelpMeTest tests in any way, this is the skill.
helpmetest-self-heal
Autonomous test maintenance agent. Monitors test failures and fixes them automatically. Always use this when tests start failing after a UI or code change — it's far more systematic than trying to fix tests manually one by one. Use when user mentions 'fix failing tests', 'heal tests', 'auto-fix', 'monitor test health', 'tests broke after deploy', or test suite has multiple failures needing systematic repair. Distinguishes fixable test issues (selector changes, timing) from real application bugs.
helpmetest-visual-check
Instant visual verification via screenshots. For quick checks like 'does button look blue', 'is layout centered', 'header look right on mobile'. Fast alternative to formal testing - just look and confirm. Use when user wants visual inspection without creating test files.
helpmetest-discover
Use this skill when the user doesn't yet know what to test. This is the "learn the site first" step — for unfamiliar websites, new projects, or any situation where Feature/Persona artifacts don't exist yet. Use when the user: gives a URL with no specific test in mind, asks what features or flows a site has, wants to explore or walk through a site, is new to a project, or says "explore before we test". Also use for bare "test [URL]" commands with no further context. Do not use when Feature artifacts already exist or the user references specific known tests or bugs.
helpmetest-proxy
Set up HelpMeTest proxy tunnels for local development testing. Use when user needs to test localhost, wants to substitute production URLs with local ports, or needs to route multiple services. Use when user says 'set up proxy', 'test localhost', 'tunnel to local', or before running tests against local development servers.