agent-browser
SKILL.md
Agent Browser Testing Skill
Browser automation and end-to-end testing using Vercel's agent-browser CLI. Uses ref-based element targeting for reliable, AI-friendly browser interaction.
Quick Decision Tree
What do you need?
│
├─ Take a screenshot of a page?
│ └─ agent-browser open [url] && agent-browser screenshot
│
├─ Fill out a form?
│ └─ open → snapshot -i → fill @ref → click @submit → snapshot
│
├─ Test a login flow?
│ └─ See references/authentication.md
│
├─ Run an E2E test?
│ └─ See references/testing-patterns.md
│
├─ Scrape page content?
│ └─ agent-browser open [url] && agent-browser snapshot -i
│
└─ Debug element targeting?
└─ agent-browser snapshot -i --format json
Installation
# Install agent-browser globally
npm install -g agent-browser
# Install browser dependencies (Chromium)
agent-browser install
# Verify installation
agent-browser --version
Core Concept: Ref-Based Targeting
Agent-browser uses refs (like @e1, @e2, @e3) to identify interactive elements on the page. These refs are assigned when you take a snapshot.
# Take a snapshot with interactive elements labeled
agent-browser snapshot -i
# Output shows refs:
# @e1: [button] "Sign In"
# @e2: [input] Email field
# @e3: [input] Password field
# @e4: [button] "Submit"
# Use refs to interact
agent-browser click @e1
agent-browser fill @e2 "user@example.com"
Important: Refs are session-specific and invalidate when the page changes. Always re-snapshot after navigation or DOM updates.
Essential Workflow
# 1. Open the target URL
agent-browser open https://example.com
# 2. Take a snapshot to see the page and get refs
agent-browser snapshot -i
# 3. Interact with elements using refs
agent-browser click @e1
agent-browser fill @e2 "test value"
# 4. Take another snapshot to verify changes
agent-browser snapshot -i
Common Commands Quick Reference
Navigation
agent-browser open <url> # Navigate to URL
agent-browser back # Go back
agent-browser forward # Go forward
agent-browser refresh # Reload page
Snapshots
agent-browser snapshot # Text snapshot
agent-browser snapshot -i # With interactive refs
agent-browser snapshot --format json # JSON output
agent-browser screenshot [path] # Save screenshot
Interaction
agent-browser click @ref # Click element
agent-browser fill @ref "value" # Fill input field
agent-browser select @ref "option" # Select dropdown option
agent-browser hover @ref # Hover over element
agent-browser press Enter # Press keyboard key
Semantic Locators
agent-browser find role button "Submit" # Find by ARIA role
agent-browser find text "Welcome" # Find by visible text
agent-browser find label "Email" # Find by label
Waiting
agent-browser wait visible @ref # Wait for element visible
agent-browser wait hidden @ref # Wait for element hidden
agent-browser wait network # Wait for network idle
agent-browser wait time 2000 # Wait milliseconds
Session Management
agent-browser session save mystate # Save browser state
agent-browser session load mystate # Load saved state
agent-browser session list # List saved sessions
agent-browser close # Close browser
Security Notes
Never commit these files:
*.state- Browser session state files contain cookiesagent-browser-profile/- Profile directories with credentials- Screenshots that may contain sensitive data
Add to .gitignore:
*.state
agent-browser-profile/
.agent-browser/
screenshots/
Integration with Other Skills
With Parallel Research
# Research a topic, then verify claims on websites
parallel_research.py chat "Find pricing for Acme Corp"
# Then use agent-browser to verify on their actual pricing page
agent-browser open https://acme.com/pricing
agent-browser snapshot -i
With Screenshot Comparison
# Take baseline screenshots for visual regression
agent-browser open https://myapp.com
agent-browser screenshot baseline.png
# After changes, compare
agent-browser screenshot current.png
# Use image comparison tool
With Form Data from Sheets
# Load test data from Google Sheets, run form tests
import subprocess
test_data = get_sheet_data("Form Test Cases")
for row in test_data:
subprocess.run(["agent-browser", "fill", "@email", row["email"]])
subprocess.run(["agent-browser", "fill", "@password", row["password"]])
subprocess.run(["agent-browser", "click", "@submit"])
Files in This Skill
references/commands.md- Full command referencereferences/authentication.md- Login flow patternsreferences/testing-patterns.md- E2E test workflowsreferences/snapshot-workflow.md- Ref system deep divescripts/browser_test.py- Python automation wrapper
Example: Complete Form Test
# Open the registration page
agent-browser open https://example.com/register
# Get element refs
agent-browser snapshot -i
# Fill the form (refs from snapshot output)
agent-browser fill @e1 "John Doe"
agent-browser fill @e2 "john@example.com"
agent-browser fill @e3 "SecurePass123!"
agent-browser select @e4 "United States"
agent-browser click @e5 # Terms checkbox
agent-browser click @e6 # Submit button
# Wait for navigation and verify
agent-browser wait network
agent-browser snapshot -i
# Take confirmation screenshot
agent-browser screenshot registration-success.png
Troubleshooting
Element not found:
- Re-run
snapshot -ito get fresh refs - Use semantic locators:
agent-browser find text "Submit" - Check if element is in an iframe
Page not loading:
- Increase timeout:
agent-browser open <url> --timeout 30000 - Wait for network:
agent-browser wait network
Session expired:
- Save state before tests:
agent-browser session save backup - Load state to restore:
agent-browser session load backup
Weekly Installs
119
Repository
casper-studios/…ketplaceGitHub Stars
9
First Seen
Feb 24, 2026
Security Audits
Installed on
opencode119
github-copilot119
codex119
kimi-cli119
gemini-cli119
cursor119