E2E Testing
Principles (Always Active)
These apply whenever working with e2e tests, test failures, or test flakiness:
Failure Taxonomy
Every e2e failure is exactly one of the following (a suspected bug with no spec to check against is held as an "unverified failure"):
A. Flaky (test infrastructure issue)
- Race conditions, timing-dependent assertions
- Stale selectors after UI changes
- Missing waits, incorrect wait targets
- Network timing, mock setup ordering
- Symptom: passes on retry, fails intermittently
B. Outdated (test no longer matches implementation)
- Test asserts old behavior that was intentionally changed
- Selectors reference removed/renamed elements
- API contract changed, test wasn't updated
- Symptom: consistent failure, app works correctly
C. Bug (implementation doesn't match spec)
- Test correctly asserts spec'd behavior, code is wrong
- Only classify as bug when a spec exists to validate against
- If no spec exists, classify as "unverified failure" and report to the user
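The taxonomy above can be read as a small decision procedure. The following is a minimal sketch, not part of the skill itself: the `FailureEvidence` shape, its field names, and the `classify` function are hypothetical, and it assumes the three symptoms have already been established by hand (re-running the test, checking the app, looking for a spec).

```typescript
// Hypothetical evidence gathered while triaging one failing e2e test.
interface FailureEvidence {
  passesOnRetry: boolean;       // fails intermittently, passes when re-run
  appBehavesCorrectly: boolean; // the app does what is currently intended
  specExists: boolean;          // a spec covers the asserted behavior
}

type Category = "flaky" | "outdated" | "bug" | "unverified";

// Mirrors the taxonomy: intermittent -> flaky; consistent failure with a
// correct app -> outdated; otherwise a bug, but only if a spec backs it.
function classify(e: FailureEvidence): Category {
  if (e.passesOnRetry) return "flaky";
  if (e.appBehavesCorrectly) return "outdated";
  return e.specExists ? "bug" : "unverified";
}

console.log(
  classify({ passesOnRetry: false, appBehavesCorrectly: false, specExists: false }),
); // prints "unverified"
```

The ordering of the checks is a simplification: a real triage may need to revisit an early "flaky" verdict if the retry only passed by chance.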
Fix Rules by Category
Flaky fixes:
- Replace `waitForTimeout` with auto-waiting locators
- Replace brittle CSS selectors with `getByRole`/`getByLabel`/`getByTestId`
- Fix race conditions with `expect()` web-first assertions
- Fix mock/route setup ordering (before navigation)
- Never add arbitrary delays - fix the underlying wait
- Never weaken assertions to make flaky tests pass
- Never add retry loops around assertions - use the framework's built-in retry
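As an illustration of the flaky-fix rules, here is a hedged Playwright sketch. The route pattern `**/api/balance`, the `/account` path, the heading name, and the `balance-amount` test id are all hypothetical, and the relative `goto` assumes a `baseURL` is set in the Playwright config.

```typescript
import { test, expect } from "@playwright/test";

test("shows the account balance", async ({ page }) => {
  // Register the mock BEFORE navigation so the very first request is intercepted.
  // URL pattern and payload are made up for this example.
  await page.route("**/api/balance", (route) =>
    route.fulfill({ json: { balance: 100 } }),
  );
  await page.goto("/account");

  // Instead of: await page.waitForTimeout(2000);
  // A role-based locator with a web-first assertion retries until it
  // passes or the assertion timeout expires, so no fixed delay is needed.
  await expect(page.getByRole("heading", { name: "Balance" })).toBeVisible();
  await expect(page.getByTestId("balance-amount")).toHaveText("100");
});
```

The assertions are not weakened here: the test still checks the exact text, it only changes how it waits for it.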
Outdated fixes:
- Update test assertions to match current (correct) behavior
- Update selectors to match current DOM/API
- Never change source code - the implementation is correct, the test is stale
Bug fixes:
- Quote the spec section that defines expected behavior
- Fix the source code to match the spec
- Unit tests MUST exist before the fix is complete
- If unit tests exist, run them to confirm
- If unit tests don't exist, write them first (TDD)
- Never change e2e assertions to match buggy code
- Never change API contracts or interfaces without spec backing
- If no spec exists, ask the user: bug or outdated test?
Source Code Boundary
E2e test fixes must not change:
- Application logic or business rules
- API contracts, request/response shapes
- Database schemas or migrations
- Configuration defaults
The only exception: bug fixes where a spec explicitly defines the correct behavior and unit tests cover the fix.
Workflow (When Explicitly Running E2E)
Step 1: Discover Test Infrastructure
- Find e2e config: `playwright.config.ts`, `vitest.config.ts`, or project-specific setup
- Read `package.json` for the canonical e2e command
- Check if dev server or Tilt environment is required and running
- Find spec files: `*.spec.md`, `docs/*.spec.md` - source of truth for bug decisions
Step 2: Run Tests
Run with minimal reporter to avoid context overflow:
# Playwright
yarn playwright test --reporter=line
# Or project-specific
yarn test:e2e
If a filter is specified, apply it:
yarn playwright test --reporter=line -g "transfer"
yarn test:e2e -- --grep "transfer"
Parse failures into:
| Test | File | Error | Category |
|---|---|---|---|
| login flow | auth.spec.ts:42 | timeout waiting for selector | TBD |
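One way to keep the failure table consistent is to build it from structured records rather than by hand. This is a sketch only: the `Failure` interface and `toFailureTable` helper are hypothetical, and how the records are extracted from the reporter output is left out on purpose.

```typescript
type Category = "flaky" | "outdated" | "bug" | "unverified" | "TBD";

// One parsed failure; `file` carries "path:line" as in the table above.
interface Failure {
  test: string;
  file: string;
  error: string;
  category: Category;
}

// Renders failures as a table with the columns Test, File, Error, Category.
function toFailureTable(failures: Failure[]): string {
  const header = ["| Test | File | Error | Category |", "|---|---|---|---|"];
  const rows = failures.map(
    // Escape pipes in error text so they do not break the table layout.
    (f) => `| ${f.test} | ${f.file} | ${f.error.replace(/\|/g, "\\|")} | ${f.category} |`,
  );
  return [...header, ...rows].join("\n");
}

console.log(
  toFailureTable([
    {
      test: "login flow",
      file: "auth.spec.ts:42",
      error: "timeout waiting for selector",
      category: "TBD",
    },
  ]),
);
```

Every row starts as `TBD` and is filled in during categorization.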
Step 3: Categorize
For each failure:
- Read the test file
- Read the source code it exercises
- Check for a corresponding spec file
- Assign category: flaky, outdated, bug, or unverified
Step 4: Fix by Category
Apply fixes following the Principles above, in order:
- Flaky - fix test infrastructure issues first (unblocks other tests)
- Outdated - update stale assertions
- Bug - fix with spec + unit test gate
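The fix order can be applied mechanically once failures are categorized. A minimal sketch, assuming a hypothetical `Failure` record and `inFixOrder` helper; placing "unverified" failures last is an assumption, since they wait on a user decision rather than a fix.

```typescript
type Category = "flaky" | "outdated" | "bug" | "unverified";

// Lower number = fix earlier. "unverified" last is an assumption of this sketch.
const FIX_ORDER: Record<Category, number> = {
  flaky: 0,
  outdated: 1,
  bug: 2,
  unverified: 3,
};

interface Failure {
  file: string;
  category: Category;
}

// Returns a sorted copy; the input array is left untouched.
function inFixOrder(failures: Failure[]): Failure[] {
  return [...failures].sort(
    (a, b) => FIX_ORDER[a.category] - FIX_ORDER[b.category],
  );
}

const ordered = inFixOrder([
  { file: "transfer.spec.ts:120", category: "bug" },
  { file: "auth.spec.ts:42", category: "flaky" },
  { file: "profile.spec.ts:88", category: "outdated" },
]);
console.log(ordered.map((f) => f.category).join(", ")); // prints "flaky, outdated, bug"
```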
Step 5: Re-run and Report
After all fixes, re-run the suite and report the results in this format:
## E2E Results
**Run**: `yarn test:e2e` on <date>
**Result**: X/Y passed
### Fixed
- FLAKY: `auth.spec.ts:42` - replaced waitForTimeout with getByRole wait
- OUTDATED: `profile.spec.ts:88` - updated selector after header redesign
- BUG: `transfer.spec.ts:120` - fixed amount validation per SPEC.md#transfers
### Remaining Failures
- UNVERIFIED: `settings.spec.ts:55` - no spec, needs user decision
### Unit Tests Added
- `src/transfer.test.ts` - amount validation edge cases (covers BUG fix)
Repository
0xbigboss/claude-code
First Seen
Feb 13, 2026