Browser Automation with agent-browser

Core Workflow

Every browser automation follows this pattern:

  1. Navigate: agent-browser open <url>
  2. Snapshot: agent-browser snapshot -i (get element refs like @e1, @e2)
  3. Interact: Use refs to click, fill, select
  4. Re-snapshot: After navigation or DOM changes, get fresh refs

agent-browser open https://example.com/form
agent-browser snapshot -i
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"

agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i  # Check result
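When the refs are already known from a snapshot, the fill, click, wait, and re-snapshot steps above can be wrapped in a small shell function. This is a minimal sketch, not part of agent-browser: the function name `fill_and_submit`, the `AB` variable, and the `ref=value` argument format are hypothetical. It uses only the commands shown above.

```shell
# Hypothetical override point: set AB=echo for a dry run that only prints the commands.
AB="${AB:-agent-browser}"

# Usage: fill_and_submit <submit-ref> <ref=value>...
# Fills each field, clicks the submit ref, waits for the network to go idle,
# then takes a fresh snapshot so follow-up commands get valid refs.
fill_and_submit() {
  local submit_ref="$1"
  shift
  local pair
  for pair in "$@"; do
    # Text before the first "=" is the ref; everything after it is the value.
    "$AB" fill "${pair%%=*}" "${pair#*=}"
  done
  "$AB" click "$submit_ref"
  "$AB" wait --load networkidle
  "$AB" snapshot -i
}
```

For the form above, the call would be `fill_and_submit @e3 '@e1=user@example.com' '@e2=password123'`, with the refs taken from the most recent snapshot.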

Essential Commands

# Navigation
agent-browser open <url>              # Navigate
agent-browser close                   # Close browser

# Snapshot
agent-browser snapshot -i             # Interactive elements with refs (recommended)
agent-browser snapshot -i -C          # Include cursor-interactive elements

# Interaction (use @refs from snapshot)
agent-browser click @e1               # Click
agent-browser fill @e2 "text"         # Clear and type
agent-browser select @e1 "option"     # Select dropdown
agent-browser check @e1               # Checkbox
agent-browser press Enter             # Key press

# Wait
agent-browser wait @e1                # Wait for element
agent-browser wait --load networkidle # Wait for network idle

# Capture
agent-browser screenshot              # Screenshot
agent-browser screenshot --full       # Full page

Ref Lifecycle (Important)

Refs (@e1, @e2, etc.) are invalidated when the page changes. Always re-snapshot after clicking a link or button that navigates, submitting a form, or triggering dynamic content to load.
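One way to make the re-snapshot habit hard to forget is to pair every page-changing click with a wait and a fresh snapshot. The helper below is a sketch under assumptions: `click_and_resnapshot` and the `AB` variable are hypothetical names, and it relies only on the `click`, `wait --load networkidle`, and `snapshot -i` commands listed earlier.

```shell
# Hypothetical override point: set AB=echo to print the commands instead of running them.
AB="${AB:-agent-browser}"

# Click a ref that may change the page, wait for the network to go idle,
# then print a fresh snapshot. Refs from before the click should be treated as stale.
click_and_resnapshot() {
  local ref="$1"
  "$AB" click "$ref"
  "$AB" wait --load networkidle
  "$AB" snapshot -i
}
```

After `click_and_resnapshot @e3`, pick refs only from the snapshot it prints, not from the one taken before the click.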

Deep-Dive Documentation

Reference                            When to Use
references/commands.md               Full command reference with all options
references/snapshot-refs.md          Ref lifecycle, invalidation rules, troubleshooting
references/session-management.md     Parallel sessions, state persistence, concurrent scraping
references/authentication.md         Login flows, OAuth, 2FA handling, state reuse
references/video-recording.md        Recording workflows for debugging and documentation
references/proxy-support.md          Proxy configuration, geo-testing, rotating proxies
references/common-patterns.md        Form submission, auth, data extraction, parallel sessions, iOS simulator
references/semantic-locators.md      Alternatives to refs
references/javascript-evaluation.md  eval rules, --stdin/-b explanation

Ready-to-Use Templates

Template                            Description
templates/form-automation.sh        Form filling with validation
templates/authenticated-session.sh  Login once, reuse state
templates/capture-workflow.sh       Content extraction with screenshots