agent-browser

Installation

SKILL.md

Agent Browser Skill

Fast browser automation CLI for AI agents. Use this to verify web UI changes, test interactions, and capture screenshots.

When to Use

Verify visual changes after modifying frontend code
Test form interactions and user flows
Capture screenshots for documentation or debugging
Inspect rendered HTML/accessibility tree
Debug why something isn't working in the browser

Core Workflow

1. Open a URL

agent-browser open http://localhost:4321/

2. Take a Snapshot

The snapshot command returns an accessibility tree with refs (@e1, @e2, etc.) that you can use for interactions:

agent-browser snapshot -i    # -i = interactive elements only (recommended)
agent-browser snapshot -c    # -c = compact (removes empty structural elements)
agent-browser snapshot -i -c # Both flags work together

3. Interact Using Refs

Use the @ref values from the snapshot to interact with elements:

agent-browser click @e5           # Click element
agent-browser fill @e3 "hello"    # Clear and type
agent-browser type @e3 "world"    # Type without clearing
agent-browser select @e7 "option" # Select dropdown
agent-browser check @e9           # Check checkbox

4. Screenshot

agent-browser screenshot              # Viewport only
agent-browser screenshot --full       # Full page
agent-browser screenshot output.png   # Save to file

ALWAYS check the image size before attempting to load. If it is larger than 2MB, process it using a tool such as sips or ImageMagick to reduce the size. Large files cannot be loaded by your tools.

Common Patterns

Form Verification

agent-browser open http://localhost:4321/_emdash/api/setup/dev-bypass?redirect=/_emdash/admin/content/posts/new
agent-browser snapshot -i   # Check what fields are visible
agent-browser fill @e3 "Test Title"
agent-browser fill @e5 "Test content here"
agent-browser click @e7     # Save button

Get Element Info

agent-browser get text @e1      # Get text content
agent-browser get html @e1      # Get inner HTML
agent-browser get value @e2     # Get input value
agent-browser get url           # Current URL
agent-browser get title         # Page title
agent-browser get count "button" # Count matching elements

Check Element State

agent-browser is visible @e1
agent-browser is enabled @e2
agent-browser is checked @e3

Find by Role/Label

agent-browser find role button click --name "Submit"
agent-browser find label "Email" fill "test@example.com"
agent-browser find placeholder "Search..." type "query"

Sessions

Sessions keep browser state (cookies, storage) between commands:

# Use named session (persists until closed)
agent-browser --session mytest open http://localhost:4321
agent-browser --session mytest snapshot -i

# Or set via environment
export AGENT_BROWSER_SESSION=mytest
agent-browser open http://localhost:3000

Useful Options

Option	Description
`--session <name>`	Isolated browser session
`--headed`	Show browser window (not headless)
`--json`	JSON output for programmatic use
`--full`	Full page screenshot
`-i`	Snapshot: interactive elements only
`-c`	Snapshot: compact output
`-d <n>`	Snapshot: limit tree depth

Debugging

agent-browser --headed open http://localhost:3000  # See what's happening
agent-browser console                               # View console logs
agent-browser errors                                # View page errors
agent-browser highlight @e5                         # Highlight element
agent-browser eval "document.title"                 # Run JS

Tips

Always snapshot first - Get refs before interacting
Use -i flag - Interactive-only snapshots are much cleaner
Wait when needed - Use wait <ms> or wait <selector> after actions that trigger loading
Sessions for auth - Use named sessions to persist login state
Headed for debugging - Use --headed when things aren't working as expected

Related skills

More from emdash-cms/emdash

Installs

Repository

emdash-cms/emdash

GitHub Stars

10.3K

First Seen

Apr 2, 2026

Security Audits

Gen Agent Trust HubPass

SocketPass

SnykWarn