agent-e2e
E2E Test Discovery & Execution
You're here to discover E2E test scenarios, record them as YAML, or execute existing test cases.
Core Principle: Semantic Selectors
The problem: Traditional selectors (testId, CSS classes) break when UI changes.
The solution: Use semantic selectors based on accessibility properties. These match what users see.
# ❌ Fragile
{{dataTestId: "submit-btn"}}
# ✅ Stable
{{role: button, name: "Submit"}}
Phase 1: Discovering Test Scenarios
Don't mechanically scan code/docs/UI. Use risk-driven discovery.
High-Risk Indicators
When exploring a project, prioritize areas with these characteristics:
| Indicator | Why It's Risky | Example |
|---|---|---|
| Multi-field forms | Validation, edge cases, state | Registration, checkout |
| Multi-step flows | State carries across steps | Wizards, onboarding |
| Authentication boundaries | Permissions, sessions | Login, role-based access |
| External integrations | Dependencies can fail | Payment, third-party APIs |
| Data mutation | Side effects matter | Create, update, delete |
Discovery Strategy
- Start from risk, not from code structure
- Use multiple sources (code, docs, browser) to verify each risk
- Stop when confident, not when sources are exhausted
Ask yourself: "What would break that users would notice?" Start there.
Phase 2: Recording Test Cases
The Snapshot-Refs Pattern
# 1. Open page
agent-browser open http://localhost:3000 --headed
# 2. Snapshot to discover elements
agent-browser snapshot --json
# Returns: {"refs": {"e1": {"name": "Email", "role": "textbox"}, ...}}
# 3. Interact using refs
agent-browser fill @e1 "test@example.com"
agent-browser click @e2
# 4. Re-snapshot after DOM changes (refs get reassigned!)
agent-browser snapshot --json
Critical: Always re-snapshot after any action that changes DOM. Refs are temporary.
Recording as YAML
Store semantic selectors, not refs:
steps:
- desc: Submit login form
commands:
- agent-browser fill {{role: textbox, name: "Email"}} "{{env.EMAIL}}"
- agent-browser click {{role: button, name: "Sign In"}}
expect:
- url_contains: /dashboard
Phase 3: Organizing Test Cases
Use complexity gradient to enable fast root-cause analysis:
foundations/ → flows/ → compositions/
(atomic ops) (journeys) (multi-session)
login.yaml checkout.yaml two-user-chat.yaml
logout.yaml onboarding.yaml order-and-fulfill.yaml
navigate.yaml password-reset.yaml
Why This Structure
When a composition fails:
- Check which flow failed
- Check which foundation in that flow failed
- Fix the foundation → everything above it recovers
Foundations = building blocks (high reuse, must be stable) Flows = complete user journeys (compose foundations) Compositions = multi-session/multi-user scenarios (compose flows)
Deciding the Category
Ask: "Can this be broken into smaller independent tests?"
- Yes → It's a flow or composition
- No → It's a foundation
Phase 4: Executing Test Cases
Intent-Driven Execution
YAML describes intent + constraints. You (the agent) fill in execution details.
# YAML says WHAT and WHEN
- desc: Close popup if it appears
commands:
- agent-browser click {{role: button, name: "Close"}}
optional: true # Constraint: this can fail
# You decide HOW
# 1. Snapshot to check if button exists
# 2. If exists → click
# 3. If not → skip (because optional: true)
# 4. Record your decision for debugging
Execution Rules
- For each step: snapshot → match selector → execute → verify expect
- For optional steps: skip if element not found, but record the decision
- For failures: report which selector didn't match and what was found instead
- Always: close browser when done
Handling Complex Scenarios
Conditional steps (optional: true):
- desc: Dismiss cookie banner
commands:
- agent-browser click {{role: button, name: /Accept|Close|×/}}
optional: true
You try the selector. If not found, skip and continue.
Data extraction:
- desc: Capture order ID from page
extract:
from: {{role: heading, name: /Order #\d+/}}
pattern: "Order #(\\d+)"
save_as: order_id
You find the element, extract the value, save it for later steps.
Multi-session (compositions):
sessions:
- name: alice
lifecycle: persistent
- name: bob
lifecycle: persistent
steps:
- desc: Alice sends message
session: alice
commands: [...]
- desc: Verify Bob received it
session: bob
expect:
- element_visible: { { role: paragraph, name: /Hello/ } }
You manage separate browser sessions and switch between them.
YAML Format Reference
Minimal structure:
id: case-name
title: Human-readable title
category: foundation # foundation | flow | composition
description: |
What this tests and why it matters.
parameters:
base_url: http://localhost:3000
steps:
- desc: Step description
commands:
- agent-browser <action> {{selector}} [args]
expect:
- condition: value
optional: false # default
exploration:
status: validated # draft | explored | validated
See FORMATS.md for complete specification.
Common Issues
Empty snapshot: Page hasn't loaded or lacks accessibility. Wait longer or check framework.
Refs changed: Expected. Always re-snapshot before interacting.
Multiple matches: Use index: {{role: button, name: "Delete", index: 0}}
Static routing: /docs may need /docs.html. Note in your test case.
Rate limiting (429): Public sites limit unauthenticated requests. Consider:
- Adding delays between steps (
agent-browser wait 2000) - Using authenticated sessions when available
- Recording on local dev instances when possible
Checklist
When discovering:
- Identified high-risk areas (forms, multi-step, auth, integrations)
- Verified risks from multiple sources
- Prioritized by user impact
When recording:
- Used semantic selectors, not refs
- Re-snapshot after DOM changes
- Added expect conditions for verification
When organizing:
- Categorized by complexity (foundation/flow/composition)
- Foundations are atomic and reusable
- Compositions reference flows, flows reference foundations
When executing:
- Followed intent, applied judgment for details
- Recorded decisions for debugging
- Closed all browser sessions