agent-browser
SKILL.md
When to use this skill
Use this skill whenever the user wants to:
- Automate browser interactions via CLI commands
- Use browser automation for AI agents
- Navigate websites and interact with pages using command-line tools
- Use refs-based element selection for deterministic automation
- Integrate browser automation into AI agent workflows
- Capture snapshots of web pages with accessibility trees
- Fill forms, click elements, and extract content via CLI
- Use semantic locators for more reliable element selection
- Work with browser automation in agent mode with JSON output
- Manage multiple browser sessions
- Debug browser automation with headed mode
- Use authenticated sessions with custom headers
- Connect to existing browsers via CDP
- Stream browser viewport for live preview
How to use this skill
This skill is organized to match the agent-browser official documentation structure (https://github.com/vercel-labs/agent-browser/blob/main/README.md). When working with agent-browser:
- Install agent-browser:
  - Run `npx agent-browser setup` or use `uvx` for one-off commands.
- Quick Start:
  - `npx agent-browser open https://example.com`
  - `npx agent-browser snapshot` (to see the page and element refs)
  - `npx agent-browser click @e1` (using a ref from the snapshot)
- Core Workflow:
  - Snapshot First: Always run `snapshot` to get the accessibility tree and element references (`@e1`, `@e2`, etc.).
  - Interact with Refs: Use `@e[number]` to interact with elements deterministically.
- Agent Mode:
  - Use the `--json` flag to get responses in structured JSON format, ideal for internal processing.
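The "interact with refs" step can be guarded with a small check that an argument really looks like a snapshot ref before it is passed to `click`. The following is a minimal sketch, not part of agent-browser itself: the `ab`, `is_ref`, and `click_ref` helpers and the `AGENT_BROWSER` override variable are hypothetical names introduced here for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: set AGENT_BROWSER=echo for a dry run that only prints the arguments.
ab() { ${AGENT_BROWSER:-npx agent-browser} "$@"; }

# Succeeds only for strings of the form @e<digits>, such as @e1 or @e42.
is_ref() {
  case "$1" in
    @e | @e*[!0-9]*) return 1 ;;  # bare "@e" or a non-digit after "@e"
    @e*) return 0 ;;              # "@e" followed by digits only
    *) return 1 ;;                # anything else (CSS selectors, plain text)
  esac
}

# Click an element, refusing anything that is not a ref taken from a snapshot.
click_ref() {
  if ! is_ref "$1"; then
    echo "not a snapshot ref: $1" >&2
    return 2
  fi
  ab click "$1"
}
```

After running `snapshot`, a call such as `click_ref @e1` forwards to `click @e1`, while `click_ref button` is rejected. Whether to allow plain selectors as a fallback is a design choice; this sketch assumes refs only.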
Command Reference Summary
| Command | Description |
|---|---|
| `open <url>` | Navigate to a URL |
| `snapshot` | Get the page content and accessibility tree with refs |
| `click <selector>` | Click an element (e.g., `@e1`, `button`, `.link`) |
| `fill <selector> <text>` | Fill an input field |
| `eval` | Execute JavaScript in the page context |
| `get <info>` | Retrieve info (title, url, content, etc.) |
| `wait <condition>` | Wait for specific states or elements |
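As an illustration of combining two commands from the table, the sketch below opens a page and then asks for its title with `get title`. The `ab` wrapper, the `page_title` helper, and the `AGENT_BROWSER` override are hypothetical names introduced here; the exact output of `get title` is not specified in this document, so the sketch simply passes it through.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: set AGENT_BROWSER=echo for a dry run that only prints the arguments.
ab() { ${AGENT_BROWSER:-npx agent-browser} "$@"; }

# Open a URL, discard the output of `open`, then print whatever `get title` returns.
page_title() {
  local url="$1"
  ab open "$url" >/dev/null || return 1
  ab get title
}
```

A call would look like `page_title https://example.com`. Stopping when `open` fails avoids reading the title of whatever page was loaded previously.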
Multi-Step Examples
Filling a Search Form
# 1. Open the page
npx agent-browser open https://google.com
# 2. Get the snapshot to find the search box ref
npx agent-browser snapshot
# 3. Fill the search box (assume it was @e5)
npx agent-browser fill @e5 "agent-browser vercel"
# 4. Press Enter
npx agent-browser press Enter
Extracting Data with Agent Mode
npx agent-browser --json open https://news.ycombinator.com
npx agent-browser --json snapshot
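For agent workflows it can help to store the JSON snapshot in a file so a later step can process it. The sketch below assumes only what the two commands above show, namely that `--json` is placed before the subcommand. The `ab` and `save_snapshot_json` helpers and the `AGENT_BROWSER` override are hypothetical, and the structure of the JSON itself is not assumed.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: set AGENT_BROWSER=echo for a dry run that only prints the arguments.
ab() { ${AGENT_BROWSER:-npx agent-browser} "$@"; }

# Open a URL in JSON mode, then write the JSON snapshot to the given file.
save_snapshot_json() {
  local url="$1" out="$2"
  ab --json open "$url" >/dev/null || return 1
  ab --json snapshot >"$out"
}
```

For example, `save_snapshot_json https://news.ycombinator.com snapshot.json` leaves the snapshot response in `snapshot.json` for whatever JSON tooling the agent uses.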
Best Practices
- Use Refs: Prefer refs (@e1, @e2) over traditional selectors for deterministic automation.
- Snapshot First: Always snapshot before interacting with elements to get refs.
- Agent Mode: Use the `--json` flag for machine-readable output in agent mode.
- Session Management: Use `--session` to maintain state across commands.
- Interactive Snapshot: Use the `-i` flag with `snapshot` to limit the output to interactive elements.
- Semantic Locators: Use semantic locators (role/name) when refs are not available.
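The session practice can be sketched as a helper that runs every command under a named session. The document only says that `--session` maintains state across commands, so the form `--session <name>` placed before the subcommand is an assumption here. The `ab` and `in_session` helpers and the `AGENT_BROWSER` override are hypothetical names.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: set AGENT_BROWSER=echo for a dry run that only prints the arguments.
ab() { ${AGENT_BROWSER:-npx agent-browser} "$@"; }

# Run any agent-browser command inside a named session.
# Assumption: the flag takes a name and goes before the subcommand.
in_session() {
  local name="$1"
  shift
  ab --session "$name" "$@"
}
```

With this helper, `in_session shop open https://example.com` followed by `in_session shop snapshot` would target the same session, while a different name keeps a second browser state separate.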
Resources
- GitHub Repository: https://github.com/vercel-labs/agent-browser
- Official README: https://github.com/vercel-labs/agent-browser/blob/main/README.md
Repository: mileycy516-stack/skills