e2e-test
End-to-End Application Testing
Pre-flight Check
1. Platform Check
agent-browser requires Linux, WSL, or macOS. Check the platform:
uname -s
- Linux or Darwin → proceed
- Anything else (e.g., MINGW, CYGWIN, or native Windows) → stop with:
"agent-browser only supports Linux, WSL, and macOS. It cannot run on native Windows. Please run this command from WSL or a Linux/macOS environment."
Stop execution if the platform is unsupported.
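The platform gate above can be sketched as a small shell function. check_platform is a hypothetical helper name, not part of agent-browser; it accepts a kernel name as an optional argument so the logic can be exercised without changing machines.

```shell
# check_platform [KERNEL]: hypothetical helper.
# With no argument it inspects the current machine via `uname -s`.
check_platform() {
  kernel="${1:-$(uname -s)}"
  case "$kernel" in
    Linux|Darwin)
      # WSL also reports "Linux", so it lands here.
      return 0 ;;
    *)
      echo "agent-browser only supports Linux, WSL, and macOS. It cannot run on native Windows. Please run this command from WSL or a Linux/macOS environment." >&2
      return 1 ;;
  esac
}

# Usage:
# check_platform || exit 1
```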
2. Frontend Check
Verify the application has a browser-accessible frontend. Check for:
- A package.json with a dev/start script serving a UI
- Frontend framework files (pages/, app/, src/components/, index.html, etc.)
- Web server configuration
If no frontend is detected:
"This application doesn't appear to have a browser-accessible frontend. E2E browser testing requires a UI to visit. For backend-only or API testing, a different approach is needed."
Stop execution if no frontend is found.
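A rough heuristic for the frontend check might look like the following. has_frontend is a hypothetical helper, and the sketch only covers the package.json and framework-file signals listed above. A matching "dev" or "start" script does not prove that the script serves a UI, so a positive result should still be confirmed by reading the project.

```shell
# has_frontend [DIR]: hypothetical heuristic; returns 0 if DIR looks like it has a UI.
has_frontend() {
  dir="${1:-.}"

  # Signal 1: package.json defines a "dev" or "start" script.
  if [ -f "$dir/package.json" ] &&
     grep -Eq '"(dev|start)"[[:space:]]*:' "$dir/package.json"; then
    return 0
  fi

  # Signal 2: common frontend framework files or directories exist.
  for path in pages app src/components index.html; do
    if [ -e "$dir/$path" ]; then
      return 0
    fi
  done

  return 1
}

# Usage:
# has_frontend . || { echo "No browser-accessible frontend detected." >&2; exit 1; }
```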
3. agent-browser Installation
Check if agent-browser is installed:
agent-browser --version
If the command is not found, install it automatically:
npm install -g agent-browser
After installation (or if it was already installed), ensure the browser engine is set up:
agent-browser install --with-deps
The --with-deps flag installs system-level Chromium dependencies on Linux/WSL. On macOS it is harmless.
Verify installation succeeded:
agent-browser --version
If installation fails, stop with:
"Failed to install agent-browser. Please install it manually with
npm install -g agent-browser && agent-browser install --with-deps, then re-run this command."
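The installation steps above can be chained into one function. This is a sketch that uses only the commands already listed in this section; ensure_agent_browser is a hypothetical name.

```shell
# ensure_agent_browser: hypothetical helper that installs agent-browser if it
# is missing, sets up the browser engine, and prints the installed version.
ensure_agent_browser() {
  if ! command -v agent-browser >/dev/null 2>&1; then
    npm install -g agent-browser || return 1
  fi
  # Runs whether or not the CLI was already installed.
  agent-browser install --with-deps || return 1
  # Final verification: fails if the CLI is still not runnable.
  agent-browser --version
}

# Usage:
# ensure_agent_browser || { echo "Failed to install agent-browser." >&2; exit 1; }
```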
Phase 1: Parallel Research
Launch three sub-agents simultaneously using the Task tool. All three run in parallel.
Sub-agent 1: Application Structure & User Journeys
Research this codebase thoroughly. Return a structured summary covering:
- How to start the application — exact commands to install dependencies and run the dev server, including the URL and port it serves on
- Authentication/login — if the app has protected routes, how to create a test account or log in (credentials from .env.example, seed data, or sign-up flow)
- Every user-facing route/page — each URL path and what it renders
- Every user journey — complete flows a user can take (e.g., "sign up → create profile → view public page"). For each journey, list the specific steps, interactions (clicks, form fills, navigation), and expected outcomes
- Key UI components — forms, modals, dropdowns, pickers, toggles, and other interactive elements that need testing
Be exhaustive. Testing will only cover what you identify here.
Sub-agent 2: Database Schema & Data Flows
Research this codebase's database layer. Read .env.example to understand environment variables for database connections. DO NOT read .env directly. Return a structured summary covering:
- Database type and connection — what database is used (Postgres, MySQL, SQLite, etc.) and the environment variable name for the connection string (from .env.example)
- Full schema — every table, its columns, types, and relationships
- Data flows per user action — for each user-facing action (form submit, button click, etc.), document exactly what records are created, updated, or deleted and in which tables
- Validation queries — for each data flow, provide the exact query to verify records are correct after the action
Sub-agent 3: Bug Hunting
Analyze this codebase for potential bugs, issues, and code quality problems. Focus on:
- Logic errors — incorrect conditionals, off-by-one errors, missing null checks, race conditions
- UI/UX issues — missing error handling in forms, no loading states, broken responsive layouts, accessibility problems
- Data integrity risks — missing validation, potential orphaned records, incorrect cascade behavior
- Security concerns — SQL injection, XSS, missing auth checks, exposed secrets
Return a prioritized list with file paths and line numbers.
Wait for all three sub-agents to complete before proceeding.
Phase 2: Start the Application
Using Sub-agent 1's startup instructions:
- Install dependencies if needed
- Start the dev server in the background (e.g., npm run dev &)
- Wait for the server to be ready
- Open the app with agent-browser open <url> and confirm it loads
- Take an initial screenshot: agent-browser screenshot e2e-screenshots/00-initial-load.png
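One way to handle "wait for the server to be ready" is to poll the URL until it answers. The sketch below assumes curl is available. wait_for_url and DEV_SERVER_PID are hypothetical names, and http://localhost:3000 is a placeholder for whatever URL and port Sub-agent 1 reports.

```shell
# wait_for_url URL [TIMEOUT_SECONDS]: hypothetical helper.
# Polls URL about once per second until it responds successfully or the
# timeout elapses. Returns 0 when ready, 1 on timeout.
wait_for_url() {
  url="$1"
  timeout="${2:-60}"
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # -s: silent, -f: treat HTTP errors as failure, -o /dev/null: discard body
    if curl -sf -o /dev/null "$url"; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}

# Usage (placeholder URL):
# npm run dev &
# DEV_SERVER_PID=$!
# wait_for_url "http://localhost:3000" 60 || exit 1
# agent-browser open "http://localhost:3000"
# mkdir -p e2e-screenshots
# agent-browser screenshot e2e-screenshots/00-initial-load.png
```

Because of the -f flag, an app that returns an error status at the polled URL never counts as ready, so point the helper at a route that is expected to load.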
Phase 3: Create Task List
Using the user journeys from Sub-agent 1 and findings from Sub-agent 3, create a task (using TaskCreate) for each user journey. Each task should include:
- subject: The journey name (e.g., "Test profile creation flow")
- description: Steps to execute, expected outcomes, database records to verify, and any related bug findings from Sub-agent 3
- activeForm: Present continuous (e.g., "Testing profile creation flow")
Also create a final task: "Responsive testing across viewports."
Phase 4: User Journey Testing
For each task, mark it in_progress with TaskUpdate and execute the following.
4a. Browser Testing
Use the Vercel Agent Browser CLI for all browser interaction:
agent-browser open <url> # Navigate to a page
agent-browser snapshot -i # Get interactive elements with refs (@e1, @e2...)
agent-browser click @eN # Click element by ref
agent-browser fill @eN "text" # Clear field and type
agent-browser select @eN "option" # Select dropdown option
agent-browser press Enter # Press a key
agent-browser screenshot <path> # Save screenshot
agent-browser screenshot --annotate # Screenshot with numbered element labels
agent-browser set viewport W H # Set viewport (e.g., 375 812 for mobile)
agent-browser wait --load networkidle # Wait for page to settle
agent-browser console # Check for JS errors
agent-browser errors # Check for uncaught exceptions
agent-browser get text @eN # Get element text
agent-browser get url # Get current URL
agent-browser close # End session
Refs become invalid after navigation or DOM changes. Always re-snapshot after page navigation, form submissions, or dynamic content updates (modals, tabs, theme changes).
For each step in a user journey:
- Snapshot to get current refs
- Perform the interaction
- Wait for the page to settle
- Take a screenshot: save to a descriptive path under e2e-screenshots/ organized by journey (e.g., e2e-screenshots/profile-creation/03-form-submitted.png)
- Analyze the screenshot: use the Read tool to view the screenshot image. Check for visual correctness, UX issues, broken layouts, missing content, error states
- Check agent-browser console and agent-browser errors periodically for JavaScript issues
Be thorough. Go through EVERY interaction, EVERY form field, EVERY button. The goal is that by the time this finishes, every part of the UI has been exercised and screenshotted.
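The per-step loop above could be wrapped roughly as follows. shot_path and run_step are hypothetical helpers built only from the agent-browser commands listed in 4a. The interaction uses a ref taken from the most recent snapshot, and the function ends by re-snapshotting so the next step starts from fresh refs.

```shell
# shot_path JOURNEY INDEX LABEL: build a screenshot path in the layout above.
# INDEX should be a plain integer (3, not 03); it is zero-padded here.
shot_path() {
  printf 'e2e-screenshots/%s/%02d-%s.png\n' "$1" "$2" "$3"
}

# run_step JOURNEY INDEX LABEL COMMAND...: one journey step.
# COMMAND is an agent-browser subcommand with its arguments, e.g. click @e3.
run_step() {
  journey="$1"; index="$2"; label="$3"
  shift 3
  path="$(shot_path "$journey" "$index" "$label")"
  mkdir -p "$(dirname "$path")"

  agent-browser "$@" || return 1                     # perform the interaction
  agent-browser wait --load networkidle || return 1  # let the page settle
  agent-browser screenshot "$path" || return 1       # capture evidence
  agent-browser snapshot -i                          # old refs are now stale; print new ones
}

# Usage:
# agent-browser snapshot -i
# run_step profile-creation 3 form-submitted click @e3
```

Analyzing the saved screenshot and checking agent-browser console and agent-browser errors still happen outside this helper.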
4b. Database Validation
After any interaction that should modify data (form submits, deletions, updates):
- Query the database to verify records. Use the environment variable from Sub-agent 2's research for the connection string and the schema docs to know what to check.
- Postgres: use psql directly, e.g., psql "$DATABASE_URL" -c "SELECT theme FROM profiles WHERE username = 'testuser'"
- SQLite: use sqlite3 directly, e.g., sqlite3 db.sqlite "SELECT theme FROM profiles WHERE username = 'testuser'"
- Other databases: write a small ad hoc script in the application's language, run it, then delete it
- Verify:
- Records created/updated/deleted as expected
- Values match what was entered in the UI
- Relationships between records are correct
- No orphaned or duplicate records
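A validation check can be reduced to comparing a query result against the value entered in the UI. In the sketch below, expect_value is a hypothetical helper, and the expected value "dark" is an illustrative assumption rather than data from any real application. The table and column names reuse the example queries above.

```shell
# expect_value LABEL EXPECTED ACTUAL: hypothetical assertion helper.
# Prints PASS on a match; prints FAIL to stderr and returns 1 otherwise.
expect_value() {
  if [ "$2" = "$3" ]; then
    echo "PASS $1"
  else
    echo "FAIL $1: expected '$2', got '$3'" >&2
    return 1
  fi
}

# Usage with Postgres (-A: unaligned output, -t: rows only, no headers):
# actual="$(psql "$DATABASE_URL" -At -c "SELECT theme FROM profiles WHERE username = 'testuser'")"
# expect_value "profile theme" "dark" "$actual"
#
# Duplicate check: exactly one row should exist for the test user.
# count="$(psql "$DATABASE_URL" -At -c "SELECT count(*) FROM profiles WHERE username = 'testuser'")"
# expect_value "single profile row" "1" "$count"
```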
4c. Issue Handling
When an issue is found (UI bug, database mismatch, JS error):
- Document it: what was expected vs what happened, screenshot path, relevant DB query results
- Fix the code — make the correction directly
- Re-run the failing step to verify the fix worked
- Take a new screenshot confirming the fix
4d. Responsive Testing
For the responsive testing task, revisit key pages at these viewports:
- Mobile: agent-browser set viewport 375 812
- Tablet: agent-browser set viewport 768 1024
- Desktop: agent-browser set viewport 1440 900
At each viewport, screenshot every major page. Analyze for layout issues, overflow, broken alignment, and touch target sizes on mobile.
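The viewport sweep for a single page can be sketched as a loop over the three sizes above. responsive_sweep is a hypothetical helper, and the e2e-screenshots/responsive/ directory is an assumed layout, not one this document prescribes.

```shell
# responsive_sweep URL PAGE_NAME: hypothetical helper.
# Screenshots one page at the mobile, tablet, and desktop viewports.
responsive_sweep() {
  url="$1"; page="$2"
  mkdir -p e2e-screenshots/responsive
  for spec in "mobile 375 812" "tablet 768 1024" "desktop 1440 900"; do
    # Split "name width height" into $1 $2 $3 (url and page are already saved).
    set -- $spec
    agent-browser set viewport "$2" "$3" || return 1
    agent-browser open "$url" || return 1
    agent-browser wait --load networkidle || return 1
    agent-browser screenshot "e2e-screenshots/responsive/$page-$1.png" || return 1
  done
}

# Usage (placeholder URL):
# responsive_sweep "http://localhost:3000/" home
```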
After completing each journey, mark its task as completed with TaskUpdate.
Phase 5: Cleanup
After all testing is complete:
- Stop the dev server background process
- Close the browser session:
agent-browser close
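If the dev server was started with & from the same shell, its process ID can be captured from $! at launch and used during cleanup. stop_dev_server and DEV_SERVER_PID are hypothetical names. Some dev servers spawn child processes that outlive the parent, and this sketch does not handle that case.

```shell
# stop_dev_server PID: hypothetical helper.
# Terminates the background process if it is still running, then reaps it.
stop_dev_server() {
  pid="$1"
  # kill -0 sends no signal; it only checks that the process exists.
  if kill -0 "$pid" 2>/dev/null; then
    kill "$pid" 2>/dev/null || true
    # wait only works for children of the current shell.
    wait "$pid" 2>/dev/null || true
  fi
  return 0
}

# Usage:
# npm run dev &
# DEV_SERVER_PID=$!
# ... run the tests ...
# stop_dev_server "$DEV_SERVER_PID"
# agent-browser close
```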
Phase 6: Report
Text Summary (always output)
Present a concise summary:
## E2E Testing Complete
**Journeys Tested:** [count]
**Screenshots Captured:** [count]
**Issues Found:** [count] ([count] fixed, [count] remaining)
### Issues Fixed During Testing
- [Description] — [file:line]
### Remaining Issues
- [Description] — [severity: high/medium/low] — [file:line]
### Bug Hunt Findings (from code analysis)
- [Description] — [severity] — [file:line]
### Screenshots
All saved to: `e2e-screenshots/`
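The "Screenshots Captured" figure can be derived from the filesystem instead of being tallied by hand. A minimal sketch, assuming every screenshot was saved as a .png under e2e-screenshots/; count_screenshots is a hypothetical helper.

```shell
# count_screenshots [DIR]: hypothetical helper.
# Prints the number of .png files under DIR (default: e2e-screenshots).
count_screenshots() {
  dir="${1:-e2e-screenshots}"
  if [ ! -d "$dir" ]; then
    echo 0
    return 0
  fi
  # tr strips the padding that some wc implementations add.
  find "$dir" -type f -name '*.png' | wc -l | tr -d ' '
}

# Usage:
# echo "**Screenshots Captured:** $(count_screenshots)"
```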
Markdown Export (ask first)
After the text summary, ask the user:
"Would you like me to export the full testing report to a markdown file? It includes per-journey breakdowns, all screenshot references, database validation results, and detailed findings — useful as context for follow-up fixes or GitHub issues."
If yes, write a detailed report to e2e-test-report.md in the project root containing:
- Full summary with stats
- Per-journey breakdown: steps taken, screenshots, database checks, issues found
- All issues with full details, fix status, and file references
- Bug hunt findings from the code analysis sub-agent
- Recommendations for any unresolved issues