<tasklist_context> Use TaskList tool to check for existing tasks related to this work.
If a related task exists, note its ID and mark it in_progress with TaskUpdate when starting.
</tasklist_context>
<rules_context> Check for project testing rules:
Use Glob tool: .ruler/testing.md
If file exists: Read it for MUST/SHOULD/NEVER constraints on testing patterns, frameworks, and conventions.
If .ruler/ doesn't exist: Continue without rules — they're optional.
</rules_context>
<progress_context>
Use Read tool: docs/progress.md (first 50 lines)
Check for recently implemented features that need testing. </progress_context>
# Test Workflow

Create or review a test strategy, and optionally run the test suite. Primarily supports vitest and playwright.

## Process

### Step 1: Detect Test Setup
Use Glob tool to find test framework config:
| Glob Pattern | Framework |
|---|---|
| `vitest.config.*` | vitest |
| `playwright.config.*` | playwright |
| `jest.config.*` | jest |
| `cypress.config.*` | cypress |
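For illustration, a standalone sketch of this detection in Node (the `fast-glob` dependency is an assumption; in the workflow itself the Glob tool does this directly):

```ts
import fg from 'fast-glob';

// Mirror of the table above: a matching config file indicates the framework.
const patterns: Record<string, string> = {
  vitest: 'vitest.config.*',
  playwright: 'playwright.config.*',
  jest: 'jest.config.*',
  cypress: 'cypress.config.*',
};

const detected: string[] = [];
for (const [framework, pattern] of Object.entries(patterns)) {
  if ((await fg(pattern)).length > 0) detected.push(framework);
}
console.log('Detected frameworks:', detected);
```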
### Step 1b: Verify Fail-Fast Configuration

Tests must fail fast. Don't waste time waiting.

Check that playwright.config.ts has sensible timeouts, for example:

- `timeout: 30_000` (30s max per test)
- `actionTimeout: 10_000` (10s per action)
- `expect.timeout: 5_000` (5s for assertions)
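A minimal playwright.config.ts sketch with these values (a starting point to tune per project, not a definitive config):

```ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  timeout: 30_000,            // 30s max per test
  expect: { timeout: 5_000 }, // 5s for assertions
  use: {
    actionTimeout: 10_000,    // 10s per action (click, fill, ...)
  },
});
```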
If tests hit LLM APIs: Use tighter timeouts (5-10s). See ${CLAUDE_PLUGIN_ROOT}/references/llm-api-testing.md
Never:
- Set global timeout to minutes "just in case"
- Retry 5+ times to mask flaky tests
- Use arbitrary sleeps
Reference: ${CLAUDE_PLUGIN_ROOT}/agents/workflow/e2e-test-runner.md for full Playwright config
### Step 2: Determine Intent
"What would you like to do?"
- Review strategy — Analyze current test coverage and approach
- Create strategy — Design test plan for a feature
- Run tests — Execute test suite
- Fix failing tests — Debug and fix
For "Review Strategy"
Analyze test coverage:

- Use Glob tool: `**/*.test.*`, `**/*.spec.*` — count test files (see the sketch after the report list)
- Use Grep tool: pattern `coverage` in `vitest.config.*`, `playwright.config.*` — check coverage configuration
Report:
- Number of test files
- Unit vs E2E balance
- Coverage gaps (if measurable)
- Missing test patterns
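For the file-count step, a compact sketch (again assuming Node with `fast-glob`; the `src/` and `e2e/` layouts are assumptions — adjust the globs to the repository's actual structure):

```ts
import fg from 'fast-glob';

// Rough unit-vs-E2E balance from file counts alone.
const unit = await fg('src/**/*.{test,spec}.*');
const e2e = await fg('{e2e,tests}/**/*.spec.*');
console.log(`unit/integration: ${unit.length}, e2e: ${e2e.length}`);
```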
For "Create Strategy"
For the given feature:
-
Unit tests (vitest)
- Pure functions
- Component rendering
- Hooks behavior
-
Integration tests (vitest)
- Component interactions
- API mocking
-
E2E tests (playwright)
- Critical user flows
- Happy path + key error states
Output test plan:

```markdown
## Test Strategy: [Feature]

### Unit Tests
- [ ] [Test case]: [What it verifies]

### Integration Tests
- [ ] [Test case]: [What it verifies]

### E2E Tests
- [ ] [Test case]: [What it verifies]
```
For "Run Tests"
Determine test type from context or ask:
- Unit/Integration tests → Run inline
- E2E tests → Run as background agent (prevents terminal crashes)
Unit/Integration (vitest/jest) — run inline:

```bash
# Vitest
pnpm vitest run

# With coverage
pnpm vitest run --coverage
```
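If coverage isn't configured yet, a minimal vitest.config.ts sketch (the v8 provider and reporters here are assumptions — pick what the project needs):

```ts
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    coverage: {
      provider: 'v8',             // requires @vitest/coverage-v8 installed
      reporter: ['text', 'html'], // terminal summary + browsable report
    },
  },
});
```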
E2E tests (playwright/cypress) — run as a background agent (Task tool with Bash, `run_in_background: true`), with a prompt like:

```
Run playwright e2e tests and report results.

Commands:
pnpm playwright test

If tests fail, capture:
- Which tests failed
- Error messages
- Screenshot paths (if any)

Report summary when complete.
```
Why background agent for E2E:
- Playwright spawns browsers, which can consume significant resources
- If tests hang or crash, your main session continues
- Verbose trace output doesn't fill your context
- You can continue working while tests run
Report results. If failures, offer to debug.
For "Fix Failing Tests"
Follow ${CLAUDE_PLUGIN_ROOT}/disciplines/systematic-debugging.md:
- Read error message carefully
- Understand what test expects
- Determine if test or code is wrong
- Fix at source
CRITICAL: Never blame "network issues" vaguely.
It is almost NEVER a network problem. Common actual causes:
| Error Pattern | Likely Cause | NOT the cause |
|---|---|---|
| "ECONNREFUSED" | Server not running, wrong URL | "Network issues" |
| "Timeout" | Slow operation OR payload too large | "Network issues" |
| "400 Bad Request" | Invalid payload format | "Network issues" |
| "500 Server Error" | Bug in your code | "Network issues" |
For LLM API failures: See ${CLAUDE_PLUGIN_ROOT}/references/llm-api-testing.md
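As an illustration of evidence-first diagnosis, a small TypeScript sketch (the URL is a placeholder) that logs the concrete failure instead of concluding "network issues":

```ts
const url = 'http://localhost:3000/api/health'; // hypothetical endpoint under test

try {
  const res = await fetch(url, { signal: AbortSignal.timeout(10_000) });
  if (!res.ok) {
    // 4xx: the request or payload is wrong; 5xx: the server code has a bug.
    console.error(`HTTP ${res.status}:`, await res.text());
  }
} catch (err) {
  // A TimeoutError means the 10s budget was blown (slow operation or oversized
  // payload); a fetch failure whose cause names ECONNREFUSED means the server
  // isn't running or the URL is wrong. Log the evidence — don't guess.
  console.error(err);
}
```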
## Test Patterns
From ${CLAUDE_PLUGIN_ROOT}/references/testing-patterns.md:
Good tests:
- Test behavior, not implementation
- One assertion per concept
- Clear names describing what's tested
- Real code over mocks when possible
Bad tests:
- Testing mock behavior
- Vague names ("test1", "it works")
- Implementation details in assertions
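A short vitest sketch showing these properties (the `formatPrice` module is hypothetical):

```ts
import { describe, it, expect } from 'vitest';
import { formatPrice } from './format'; // hypothetical module under test

describe('formatPrice', () => {
  // Behavior, not implementation: assert on the visible output,
  // not on internal calls or intermediate state.
  it('formats whole cents as a dollar string', () => {
    expect(formatPrice(1999)).toBe('$19.99');
  });

  // One assertion per concept, with a name that says what is verified.
  it('renders zero as $0.00', () => {
    expect(formatPrice(0)).toBe('$0.00');
  });
});
```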
## LLM API Testing
When tests call LLM APIs (OpenRouter, OpenAI, Anthropic, etc.), follow the guidance in:
${CLAUDE_PLUGIN_ROOT}/references/llm-api-testing.md
Key points:
- Validate payloads with schema before sending
- Use fast models (-flash, -mini) for tests
- Set aggressive timeouts (5-10s)
- Never conclude "network issues" without evidence
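A sketch combining those points in a vitest test (OpenRouter endpoint shown; the model id and request schema are assumptions — adapt to the provider in use):

```ts
import { test, expect } from 'vitest';
import { z } from 'zod';

// Validate the request payload before sending — malformed payloads otherwise
// come back as opaque 400s that get misblamed on "network issues".
const ChatRequest = z.object({
  model: z.string(),
  messages: z.array(z.object({
    role: z.enum(['system', 'user', 'assistant']),
    content: z.string(),
  })).min(1),
});

test('chat completion responds with 200', async () => {
  const payload = ChatRequest.parse({
    model: 'openai/gpt-4o-mini', // assumption: any fast/cheap model works here
    messages: [{ role: 'user', content: 'Reply with the word ok.' }],
  });

  const res = await fetch('https://openrouter.ai/api/v1/chat/completions', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(payload),
    signal: AbortSignal.timeout(10_000), // aggressive 10s budget, not minutes
  });

  expect(res.status).toBe(200); // 400 = bad payload, 500 = server bug — not "network issues"
});
```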
<success_criteria> Test workflow is complete when:
- Test framework detected and configured
- Strategy documented (if creating strategy)
- Tests written and passing (if creating tests)
- Failing tests diagnosed and fixed (if fixing)
- All tests pass (`pnpm vitest run` or equivalent)
- Optional: test quality review offered (if tests written)
- Progress journal updated </success_criteria>
<test_quality_check> After tests pass, optionally offer a quality check:
"Tests are passing. Would you like me to review the test quality (assertion meaningfulness, isolation, flaky patterns)?"
If accepted, spawn the test-quality-engineer agent:
${CLAUDE_PLUGIN_ROOT}/agents/review/test-quality-engineer.md
Present findings and offer to fix any issues found. </test_quality_check>
<arc_log>
After completing this skill, append to the activity log.
See: ${CLAUDE_PLUGIN_ROOT}/references/arc-log.md
Entry: /arc:test — [X passing, Y failing]
</arc_log>