agent-browser

SKILL.md

agent-browser Skill

CLI-based browser automation using vercel-labs/agent-browser. Provides browser control via Bash commands as an alternative to playwright-mcp.

When to Use agent-browser vs playwright-mcp

Feature playwright-mcp agent-browser
Invocation MCP tool calls (mcp__playwright__*) Bash commands (npx agent-browser)
Session persistence No (fresh context each time) Yes (--session, --session-name)
State save/load No Yes (state save/load)
Page diffing No Yes (diff snapshot, diff screenshot)
Annotated screenshots No Yes (screenshot --annotate)
Network interception No Yes (network route)
Cookie management Via browser_run_code JS Native (cookies set)
Multi-tab browser_tabs tool tab new, tab <n>
Element refs @ref from snapshot @ref from snapshot (same concept)
Best for Quick interactions, MCP-native workflows Advanced workflows, sessions, diffing, state persistence

Default: Use playwright-mcp for standard browser interactions (it's already loaded as an MCP server).

Use agent-browser when you need:

  • Session persistence across commands
  • Page diffing (snapshot or visual)
  • Annotated screenshots with element labels
  • Network interception / mocking
  • Saved authentication state
  • Cookie/storage manipulation
  • CDP connection to existing browser

Nix Environment Setup

This workspace uses Nix for reproducible tooling. The Nix-managed Playwright has Chromium at a different revision than agent-browser's bundled Playwright expects. The project-level agent-browser.json config file at the workspace root handles this automatically by setting executablePath to the Nix store Chromium.

No extra flags needed — just run npx agent-browser commands from the workspace root.

If the Chromium path changes (e.g., after niv update), update agent-browser.json:

{
  "executablePath": "/nix/store/<hash>-playwright-chromium/chrome-linux64/chrome"
}

Find the current path: ls -la /nix/store/*-playwright-browsers/chromium-*

Core Workflow

The basic agent-browser interaction loop:

# 1. Open a page
npx agent-browser open https://example.com

# 2. Get interactive elements (snapshot with refs)
npx agent-browser snapshot -i

# 3. Interact using @refs from snapshot
npx agent-browser click @e3
npx agent-browser fill @e5 "search term"

# 4. Re-snapshot to see updated state
npx agent-browser snapshot -i

# 5. Take screenshot for visual verification
npx agent-browser screenshot

# 6. Close when done
npx agent-browser close

Snapshot Options

npx agent-browser snapshot              # Full accessibility tree
npx agent-browser snapshot -i           # Interactive elements only (recommended)
npx agent-browser snapshot -c           # Compact (remove empty elements)
npx agent-browser snapshot -d 3         # Limit depth to 3 levels
npx agent-browser snapshot -s "#main"   # Scope to CSS selector

Screenshots

npx agent-browser screenshot                        # Viewport screenshot
npx agent-browser screenshot --full                  # Full page
npx agent-browser screenshot --annotate              # With numbered element labels
npx agent-browser screenshot path/to/save.png        # Save to specific path

MachinaMed Authentication

MachinaMed uses Google SSO. You must set JWT cookies before navigating to authenticated pages.

Method 1: State File (Recommended for agent-browser)

# 1. Generate auth cookies
(cd repos/dem2 && ./scripts/user_manager.py user token dbeal@numberone.ai --export-cookie)
# Output: export AUTH_HEADER="Cookie: access_token=eyJhbG...; refresh_token=eyJhbG..."

# 2. Open browser and set cookies before navigating
npx agent-browser open about:blank
npx agent-browser cookies set access_token "eyJhbG..." --domain localhost --path /
npx agent-browser cookies set refresh_token "eyJhbG..." --domain localhost --path /

# 3. Now navigate to authenticated page
npx agent-browser open http://localhost:3000/markers

Method 2: Session Persistence

# First time: set up auth in a named session
npx agent-browser --session-name machina open about:blank
npx agent-browser --session-name machina cookies set access_token "eyJhbG..." --domain localhost --path /
npx agent-browser --session-name machina cookies set refresh_token "eyJhbG..." --domain localhost --path /
npx agent-browser --session-name machina open http://localhost:3000/markers

# Later: reuse the session (cookies persist)
npx agent-browser --session-name machina open http://localhost:3000/markers

Method 3: State Save/Load

# Save authenticated state for reuse
npx agent-browser state save machina-auth.json

# Load it in a new session
npx agent-browser --state machina-auth.json open http://localhost:3000/markers

For Preview/Staging Environments

Replace localhost domain with the target:

npx agent-browser cookies set access_token "eyJhbG..." --domain preview-99.n1-machina.dev --path /
npx agent-browser cookies set refresh_token "eyJhbG..." --domain preview-99.n1-machina.dev --path /
npx agent-browser open https://preview-99.n1-machina.dev/markers

Session Management

# Isolated sessions (separate browser contexts)
npx agent-browser --session my-test open https://example.com
npx agent-browser --session my-test snapshot -i

# List active sessions
npx agent-browser session list

# Auto-persisting sessions (state saved/restored automatically)
npx agent-browser --session-name my-persistent open https://example.com

Page Diffing

Compare page states before and after changes:

Snapshot Diff (DOM/Accessibility Tree)

# Take baseline snapshot, make changes, then diff
npx agent-browser snapshot > baseline.txt
# ... make changes ...
npx agent-browser diff snapshot --baseline baseline.txt

# Scoped diff (only specific section)
npx agent-browser diff snapshot --selector "#main-content" --compact

Visual Diff (Pixel Comparison)

# Save baseline screenshot
npx agent-browser screenshot baseline.png

# After changes, compare
npx agent-browser diff screenshot --baseline baseline.png -o diff-result.png

# With threshold (0-1, default 0.2)
npx agent-browser diff screenshot --baseline baseline.png -t 0.1

URL Comparison

# Compare two URLs side by side
npx agent-browser diff url http://localhost:3000/markers https://preview-99.n1-machina.dev/markers

# Visual comparison
npx agent-browser diff url http://localhost:3000 https://preview-99.n1-machina.dev --screenshot

# Scoped comparison
npx agent-browser diff url <url1> <url2> --selector "#biomarker-list"

Network Interception

# Block requests (e.g., analytics)
npx agent-browser network route "**/analytics/**" --abort

# Mock API responses
npx agent-browser network route "**/api/v1/observations/**" --body '{"items": []}'

# View tracked requests
npx agent-browser network requests
npx agent-browser network requests --filter api

# Remove routes
npx agent-browser network unroute

Common Commands Reference

Navigation

npx agent-browser open <url>           # Navigate to URL
npx agent-browser back                 # Go back
npx agent-browser forward              # Go forward
npx agent-browser reload               # Reload page
npx agent-browser close                # Close browser

Interaction

npx agent-browser click @e1            # Click element by ref
npx agent-browser fill @e2 "text"      # Clear and fill input
npx agent-browser type @e2 "text"      # Type into element
npx agent-browser press Enter          # Press key
npx agent-browser select @e3 "option"  # Select dropdown option
npx agent-browser check @e4            # Check checkbox
npx agent-browser scroll down 500      # Scroll down 500px

Element Inspection

npx agent-browser get text @e1         # Get text content
npx agent-browser get html @e1         # Get innerHTML
npx agent-browser get value @e1        # Get input value
npx agent-browser get title            # Get page title
npx agent-browser get url              # Get current URL
npx agent-browser is visible @e1       # Check visibility

Semantic Locators

npx agent-browser find role button click           # Click first button
npx agent-browser find text "Submit" click         # Click by text
npx agent-browser find label "Email" fill "a@b.c"  # Fill by label
npx agent-browser find testid "login-btn" click    # Click by data-testid

Waiting

npx agent-browser wait "#element"                  # Wait for element
npx agent-browser wait 2000                        # Wait 2 seconds
npx agent-browser wait --text "Loading complete"   # Wait for text
npx agent-browser wait --load networkidle          # Wait for network idle

JavaScript Execution

npx agent-browser eval "document.title"
npx agent-browser eval "window.scrollTo(0, document.body.scrollHeight)"

Tabs

npx agent-browser tab                  # List tabs
npx agent-browser tab new <url>        # Open new tab
npx agent-browser tab 2                # Switch to tab 2
npx agent-browser tab close            # Close current tab

Cookies & Storage

npx agent-browser cookies                          # Get all cookies
npx agent-browser cookies set <name> <value>       # Set cookie
npx agent-browser cookies clear                    # Clear cookies
npx agent-browser storage local                    # Get localStorage
npx agent-browser storage local set <key> <value>  # Set localStorage

State Management

npx agent-browser state save <path>    # Save auth/session state
npx agent-browser state load <path>    # Load saved state
npx agent-browser state list           # List saved states
npx agent-browser state clear --all    # Clear all states

Debugging

npx agent-browser console              # View console messages
npx agent-browser errors               # View page errors
npx agent-browser highlight @e1        # Highlight element
npx agent-browser trace start          # Start recording trace
npx agent-browser trace stop           # Stop and save trace

Global Flags

--session <name>          # Isolated browser session
--session-name <name>     # Auto-persisting session
--state <path>            # Load storage state JSON
--headed                  # Show browser window (visible)
--json                    # JSON output format
--full, -f                # Full page screenshot
--annotate                # Annotated screenshot
--debug                   # Debug output
--config <path>           # Config file path

Environment Variables

Variable Purpose
AGENT_BROWSER_SESSION Session name
AGENT_BROWSER_SESSION_NAME Auto-save/restore session
AGENT_BROWSER_STATE Storage state JSON file
AGENT_BROWSER_PROFILE Persistent profile path
AGENT_BROWSER_DEFAULT_TIMEOUT Operation timeout (ms)

Related Skills

  • machina-ui: Required authentication setup for MachinaMed pages. Load machina-ui first to get JWT cookies, then use agent-browser for browser automation.
  • playwright-mcp: Default MCP-based browser automation. Use for standard interactions.
  • machina-docker: Development stack management (start/stop services before browser testing).
Weekly Installs
9
First Seen
13 days ago
Installed on
gemini-cli9
github-copilot9
codex9
kimi-cli9
amp9
cline9