E2E Testing
Failure Taxonomy
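Classify every e2e failure as exactly one of the following. A suspected bug with no spec to validate against is recorded as "unverified" instead (see C).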
A. Flaky (test infrastructure issue)
- Race conditions, timing-dependent assertions, stale selectors, missing waits
- Symptom: passes on retry, fails intermittently
B. Outdated (test no longer matches implementation)
- Test asserts old behavior that was intentionally changed; selectors reference removed elements
- Symptom: consistent failure, app works correctly
C. Bug (implementation doesn't match spec)
- Test correctly asserts spec'd behavior, code is wrong
- Only classify as bug when a spec exists to validate against
- If no spec exists, classify as "unverified failure" and report to the user
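The taxonomy above can be read as a small decision procedure. The sketch below is one possible encoding, not a definitive implementation: the `FailureObservation` fields and the `classifyFailure` name are hypothetical, and it assumes someone has already gathered the observations (retry result, whether the app works, whether a spec exists).

```typescript
// Hypothetical observation record; field names are illustrative assumptions.
type FailureObservation = {
  passesOnRetry: boolean;     // did the unchanged test pass when re-run?
  appWorksCorrectly: boolean; // does the app behave as currently intended?
  specExists: boolean;        // is there a spec covering this behavior?
  testMatchesSpec: boolean;   // does the test assert what the spec says?
};

type FailureCategory = "flaky" | "outdated" | "bug" | "unverified";

function classifyFailure(o: FailureObservation): FailureCategory {
  // A: intermittent failures that pass on retry point at test infrastructure.
  if (o.passesOnRetry) return "flaky";
  // B: a consistent failure while the app works correctly means a stale test.
  if (o.appWorksCorrectly) return "outdated";
  // C: only call it a bug when a spec backs the test's expectation.
  if (o.specExists && o.testMatchesSpec) return "bug";
  // Anything else cannot be validated and is reported to the user.
  return "unverified";
}
```

Treating every case that is not clearly A, B, or a spec-backed C as "unverified" is a deliberately conservative choice in this sketch: it hands ambiguous failures to the user.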
Fix Rules by Category
Flaky fixes:
- Replace `waitForTimeout` with auto-waiting locators
- Replace brittle CSS selectors with `getByRole`/`getByLabel`/`getByTestId`
- Fix race conditions with `expect()` web-first assertions
- Fix mock/route setup ordering (before navigation)
- Never add arbitrary delays - fix the underlying wait
- Never add retry loops around assertions - use the framework's built-in retry
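To illustrate the mock/route ordering rule, here is a minimal sketch of a helper that registers every mock before it navigates. `PageLike`, `RouteLike`, `JsonMock`, and `gotoWithMocks` are hypothetical names; the two interfaces are small structural stand-ins modeled on Playwright's `page.route`, `route.fulfill`, and `page.goto`, so the example runs without a browser. Whether a real Playwright `page` type-checks against them unchanged is an assumption to verify in your project.

```typescript
// Structural stand-ins for the parts of a Playwright-style page used here.
// These are assumptions for the sketch, not Playwright's own type definitions.
interface RouteLike {
  fulfill(response: { status: number; contentType: string; body: string }): Promise<void>;
}
interface PageLike {
  route(urlPattern: string, handler: (route: RouteLike) => Promise<void>): Promise<void>;
  goto(url: string): Promise<unknown>;
}

// Hypothetical mock description: a URL pattern plus the JSON payload to return.
type JsonMock = { urlPattern: string; payload: unknown };

// Registers all mocks first, then navigates, so a request fired during page
// load cannot reach the real backend before its handler exists.
async function gotoWithMocks(page: PageLike, url: string, mocks: JsonMock[]): Promise<void> {
  for (const mock of mocks) {
    await page.route(mock.urlPattern, (route) =>
      route.fulfill({
        status: 200,
        contentType: "application/json",
        body: JSON.stringify(mock.payload),
      }),
    );
  }
  await page.goto(url);
}
```

Putting the ordering inside one helper means an individual test cannot accidentally navigate first and race its own mocks.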
Outdated fixes:
- Update test assertions to match current (correct) behavior
- Update selectors to match current DOM/API
- Never change source code - the implementation is correct, the test is stale
Bug fixes:
- Quote the spec section that defines expected behavior
- Fix the source code to match the spec
- Unit tests MUST exist before the fix is complete — write them first if missing (TDD)
- Never change e2e assertions to match buggy code
- Never change API contracts or interfaces without spec backing
- If no spec exists, ask the user: bug or outdated test?
Source Code Boundary
E2e test fixes must not change application logic, API contracts, database schemas, or configuration defaults. The only exception: bug fixes where a spec explicitly defines the correct behavior and unit tests cover the fix.
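The boundary can be checked mechanically before a fix is accepted. The following is a rough sketch under stated assumptions: the `ProposedFix` shape, the `boundaryViolations` name, and the convention that e2e tests live under `e2e/` or end in `.spec.ts` are all hypothetical and should be adapted to the project layout.

```typescript
type FixCategory = "flaky" | "outdated" | "bug";

// Hypothetical description of a proposed fix.
type ProposedFix = {
  category: FixCategory;
  changedFiles: string[];
  specReference?: string;  // e.g. "SPEC.md#transfers"
  unitTestFiles: string[]; // unit tests that cover the fix
};

// Assumed convention: e2e tests live under e2e/ or end in .spec.ts.
function isE2eTestFile(path: string): boolean {
  return path.startsWith("e2e/") || path.endsWith(".spec.ts");
}

// Returns a list of human-readable violations; an empty list means the fix
// stays inside the source code boundary.
function boundaryViolations(fix: ProposedFix): string[] {
  const sourceChanges = fix.changedFiles.filter(
    (file) => !isE2eTestFile(file) && !fix.unitTestFiles.includes(file),
  );
  if (sourceChanges.length === 0) return [];
  if (fix.category !== "bug") {
    return sourceChanges.map((file) => `${fix.category} fix must not change ${file}`);
  }
  const problems: string[] = [];
  if (!fix.specReference) problems.push("bug fix changes source but cites no spec section");
  if (fix.unitTestFiles.length === 0) problems.push("bug fix changes source but has no unit tests");
  return problems;
}
```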
Workflow
Step 1: Discover Test Infrastructure
- Find e2e config: `playwright.config.ts`, `vitest.config.ts`, or project-specific setup
- Read `package.json` for the canonical e2e command
- Check if dev server or Tilt environment is required and running
- Find spec files: `*.spec.md`, `docs/*.spec.md` - source of truth for bug decisions
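The discovery step can be sketched as a pure function over a file listing and the `scripts` section of `package.json`. This is illustrative only: `discoverE2eSetup` and the `Discovery` shape are hypothetical, and the fallback of picking the first script whose command contains `playwright test` is an assumption, not a convention every project follows. Checking the dev server or Tilt environment is out of scope here.

```typescript
type Discovery = {
  configFile: string | null; // first known e2e config found
  e2eScript: string | null;  // name of the package.json script to run
  specFiles: string[];       // *.spec.md and docs/*.spec.md
};

function discoverE2eSetup(files: string[], scripts: Record<string, string>): Discovery {
  // Config files named in the checklist, in priority order.
  const configCandidates = ["playwright.config.ts", "vitest.config.ts"];
  const configFile = configCandidates.find((name) => files.includes(name)) ?? null;

  // Prefer a script literally named test:e2e; otherwise fall back to the
  // first script that invokes Playwright (an assumed heuristic).
  let e2eScript: string | null = null;
  if (scripts["test:e2e"] !== undefined) {
    e2eScript = "test:e2e";
  } else {
    const fallback = Object.entries(scripts).find(([, command]) =>
      command.includes("playwright test"),
    );
    e2eScript = fallback ? fallback[0] : null;
  }

  // Match *.spec.md at the root or directly under docs/, nothing deeper.
  const specPattern = /^(docs\/)?[^/]+\.spec\.md$/;
  const specFiles = files.filter((file) => specPattern.test(file));

  return { configFile, e2eScript, specFiles };
}
```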
Step 2: Run Tests
# Playwright
yarn playwright test --reporter=line
# Or project-specific
yarn test:e2e
Parse failures into:
| Test | File | Error | Category |
|---|---|---|---|
| login flow | auth.spec.ts:42 | timeout waiting for selector | TBD |
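Once failures are parsed, rendering them in the table layout above is straightforward. The sketch below assumes the failures have already been extracted from the reporter output into a hypothetical `ParsedFailure` shape; it does not parse Playwright output itself.

```typescript
// Hypothetical parsed-failure record; "TBD" marks a failure not yet categorized.
type ParsedFailure = {
  test: string;
  file: string;  // file and line, e.g. "auth.spec.ts:42"
  error: string;
  category: "flaky" | "outdated" | "bug" | "unverified" | "TBD";
};

// Escape pipes and collapse whitespace so multi-line errors stay in one cell.
function escapeTableCell(text: string): string {
  return text.replace(/\|/g, "\\|").replace(/\s+/g, " ").trim();
}

function failuresToTable(failures: ParsedFailure[]): string {
  const header = ["| Test | File | Error | Category |", "|---|---|---|---|"];
  const rows = failures.map(
    (f) =>
      `| ${escapeTableCell(f.test)} | ${escapeTableCell(f.file)} | ${escapeTableCell(f.error)} | ${f.category} |`,
  );
  return [...header, ...rows].join("\n");
}
```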
Step 3: Categorize
For each failure: read the test file, read the source code it exercises, check for a corresponding spec file, assign category (flaky / outdated / bug / unverified).
Step 4: Fix by Category
Apply fixes in order: flaky first (unblocks other tests), then outdated, then bug.
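The fix order can be made explicit in code. A minimal sketch, assuming a hypothetical `CategorizedFailure` shape; unverified failures are left out because they wait on a user decision and are not fixed in this pass.

```typescript
// Fix order from the step above: flaky first, then outdated, then bug.
const FIX_ORDER = ["flaky", "outdated", "bug"] as const;
type FixableCategory = (typeof FIX_ORDER)[number];

// Hypothetical record for a failure that has already been categorized.
type CategorizedFailure = { file: string; category: FixableCategory };

function inFixOrder(failures: CategorizedFailure[]): CategorizedFailure[] {
  // Copy before sorting so the caller's array is untouched. Array sort is
  // stable, so failures in the same category keep their original order.
  return [...failures].sort(
    (a, b) => FIX_ORDER.indexOf(a.category) - FIX_ORDER.indexOf(b.category),
  );
}
```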
Step 5: Re-run and Report
## E2E Results
**Run**: `yarn test:e2e` on <date>
**Result**: X/Y passed
### Fixed
- FLAKY: `auth.spec.ts:42` - replaced waitForTimeout with getByRole wait
- OUTDATED: `profile.spec.ts:88` - updated selector after header redesign
- BUG: `transfer.spec.ts:120` - fixed amount validation per SPEC.md#transfers
### Remaining Failures
- UNVERIFIED: `settings.spec.ts:55` - no spec, needs user decision
### Unit Tests Added
- `src/transfer.test.ts` - amount validation edge cases (covers BUG fix)
See testing-best-practices for async handling, flake classification, and preflight check patterns.