agent-browser
Browser Automation with agent-browser
Use agent-browser when the browser itself is the work surface: local dev QA, auth flows, CRUD verification, scraping, screenshots, or reproducing UI issues.
On this machine:
- install or update with
vp install -g agent-browser@latest - verify with
agent-browser --version - run
agent-browser installafter install or upgrade to pin Chrome for Testing locally - prefer the PowerShell-safe patterns in references/windows-powershell.md
Fast Rules
- Give every task or scenario its own
AGENT_BROWSER_SESSION. Name it<run-id>-<scenario>. - Refs are session-local and page-local. Never reuse
@e...across sessions. Re-snapshot after navigation or DOM changes. - In PowerShell, quote refs:
agent-browser click '@e1'. Do not use Bash-style&&. - Use this locator ladder: fresh refs, re-snapshot, semantic
find, CSS selector, keyboard fallback. - For dialogs, popovers, comboboxes, sheets, and menus: open first, then snapshot the live subtree before the next action.
- For mutations, clear the network log, perform the action, wait, inspect requests, then capture URL, title, errors, and a screenshot.
- Prefer URL or element waits. Use
wait --load networkidleonly on pages whose activity actually settles. - Run interactive QA serially on one machine. Parallel sessions are fine for light scraping, but they add noise to mutation-heavy testing.
- Use
--session-nameorAGENT_BROWSER_SESSION_NAMEonly when you intentionally want cookies and localStorage restored across runs.
PowerShell Quick Start
$runId = "probe-$(Get-Date -Format yyyyMMdd-HHmmss)"
$env:AGENT_BROWSER_SESSION = "$runId-login"
agent-browser open https://example.com/login
agent-browser snapshot -i --json
agent-browser fill '@e1' 'user@example.com'
agent-browser fill '@e2' 'password'
agent-browser click '@e3'
agent-browser wait 1500
agent-browser get url
agent-browser screenshot
agent-browser errors
Standard Workflow
- Open the route or page you actually want to validate.
- Take
snapshot -i --jsonif the next action depends on refs. - Interact using fresh refs.
- Re-snapshot after anything that changes the DOM or active route.
- For mutations, verify with
network requests,errors,console, and a screenshot. - Restore baseline state before leaving the scenario unless the data is intentionally disposable.
Local Dev Apps
For local web-app testing, use agent-browser as the primary browser executor and keep the app bootstrap outside the browser.
- start the app first, then test it
- use one session per scenario such as
cresa-ab-<run-id>-events - capture at least one screenshot per scenario
- record final URL, title, and page errors
- validate route correctness before treating a blank page as an agent-browser failure
Use references/local-dev-e2e.md for the full pattern.
Locator Strategy
Start with refs from a fresh snapshot. If the click or fill no-ops:
- re-snapshot
- retry with the new ref
- use
find role,find text, orfind label - fall back to a CSS selector when the semantic target is stable
- if a dialog submit still no-ops, try
focuspluspress Enter
Use references/snapshot-refs.md and references/evidence-triage.md for the deeper workflow.
Auth Choices
Use the simplest auth pattern that fits:
- one-off reuse of an already logged-in browser: state export or auto-connect
- recurring automation:
--profileor--session-name - secure stored credentials: auth vault
Details live in references/authentication.md.
Deep-Dive References
| Reference | Use When |
|---|---|
| references/windows-powershell.md | PowerShell quoting, session env vars, and Windows shell behavior |
| references/local-dev-e2e.md | Local app smoke tests, CRUD passes, mobile passes, and artifact capture |
| references/evidence-triage.md | Verifying mutations, diagnosing no-op clicks, and collecting evidence |
| references/snapshot-refs.md | Ref lifecycle, dynamic widgets, and stale-ref recovery |
| references/session-management.md | Session isolation, persistence, and cleanup |
| references/authentication.md | Login flows, state reuse, OAuth, and 2FA handling |
| references/commands.md | Command reference and less-common CLI options |
| references/video-recording.md | Recording workflows for demos and bug reports |
| references/profiling.md | Performance and trace capture |
| references/proxy-support.md | Proxy and geo-testing configuration |
Templates
| Template | Description |
|---|---|
| templates/powershell-session.ps1 | Create an isolated PowerShell session and run a small command block |
| templates/local-dev-smoke.ps1 | Open a local app route, snapshot it, and capture evidence |
| templates/crud-scenario.md | Reusable CRUD scenario checklist with restore steps |
| templates/form-automation.sh | Basic form filling flow |
| templates/authenticated-session.sh | Login once and reuse saved state |
| templates/capture-workflow.sh | Content extraction with screenshots |
More from mylesmcook/mcook-skills
adversarial-review
Use this skill when you need a serious code review, diff review, or implementation-plan review from independent reviewers. In Codex hosts, prefer a fresh Codex subagent for the Codex reviewer; otherwise use the Codex, Claude Code, and Gemini reviewer paths when available. Return a PASS, CONTESTED, or REJECT verdict.
13subagent-driven-development
Use after an implementation plan is approved to execute mostly independent tasks through fresh subagents with scoped context, harness-aware model routing, proportional review gates, and mandatory controller verification.
10brainerd
>
10simple-code
Reduce incidental complexity in code and design. Use when shaping APIs, module boundaries, refactors, tests, naming, and architecture tradeoffs with a bias toward concrete, local, reversible solutions.
7git-it-out
Use this skill when the user explicitly wants final end-of-session closeout and no more branch or PR limbo: proper verification, proper commits, main/mainline landing, push, repo-native merge/release/deploy/publish steps, tracker updates, Entire/checkpoint handling when configured, and a concise handoff. Reach for it on prompts like 'git it out', 'get it out', 'ship this', 'I'm done', 'I'm going to bed', 'take this off my plate', 'finish the session', or 'get this into production'. Do not use it for greenfield implementation, open-ended debugging, broad refactors, or inventing a release process from scratch.
7laws-of-taste
>-
6