Browser Automation Skill
Web browser automation using agent-browser with AI-optimized snapshots. Snapshots expose element refs (@e1, @e2) instead of the full DOM, which reduces context by a reported 93%.
Core Workflow
# 1. Navigate to page
agent-browser open <url>
# 2. Get accessibility tree with element refs
agent-browser snapshot -i # -i = interactive elements only
# 3. Interact using refs from snapshot
agent-browser click @e2
agent-browser fill @e3 "text"
# 4. Re-snapshot after page changes
agent-browser snapshot -i
Quick Reference
Navigation
| Command | Description |
|---|---|
| `open <url>` | Navigate to URL |
| `back` | Go back |
| `forward` | Go forward |
| `reload` | Reload page |
| `close` | Close browser |
Snapshots (AI-Optimized)
| Command | Description |
|---|---|
| `snapshot` | Full accessibility tree |
| `snapshot -i` | Interactive elements only (buttons, links, inputs) |
| `snapshot -c` | Compact (remove empty elements) |
| `snapshot -d 3` | Limit depth to 3 levels |
| `screenshot [path]` | Capture screenshot (base64 if no path) |
Interaction
| Command | Description |
|---|---|
| `click <sel>` | Click element |
| `fill <sel> <text>` | Clear and fill input |
| `type <sel> <text>` | Type with key events |
| `press <key>` | Press key (Enter, Tab, etc.) |
| `hover <sel>` | Hover element |
| `select <sel> <val>` | Select dropdown option |
| `check/uncheck <sel>` | Toggle checkbox |
| `scroll <dir> [px]` | Scroll page |
Get Info
| Command | Description |
|---|---|
| `get text <sel>` | Get text content |
| `get html <sel>` | Get innerHTML |
| `get value <sel>` | Get input value |
| `get attr <sel> <attr>` | Get attribute |
| `get title` | Get page title |
| `get url` | Get current URL |
Wait
| Command | Description |
|---|---|
| `wait <selector>` | Wait for element |
| `wait <ms>` | Wait milliseconds |
| `wait --text "text"` | Wait for text |
| `wait --url "pattern"` | Wait for URL |
| `wait --load networkidle` | Wait for load state |
Sessions
| Command | Description |
|---|---|
| `--session <name>` | Use isolated session |
| `session list` | List active sessions |
Selectors
Element Refs (Recommended)
# Get refs from snapshot
agent-browser snapshot -i
# Output: button "Submit" [ref=e2]
# Use ref to interact
agent-browser click @e2
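When scripting around snapshots, it can help to pull the refs out of the snapshot text mechanically. The following is a minimal sketch, assuming the snapshot is printed to stdout with lines shaped like the `button "Submit" [ref=e2]` example above. `extract_refs` and `ref_for` are hypothetical helper names, not agent-browser commands.

```shell
# Hypothetical helpers (not part of agent-browser).
# Assumes snapshot lines look like: button "Submit" [ref=e2]

extract_refs() {
  # Read snapshot text on stdin; print each ref as @eN, one per line.
  grep -oE '\[ref=e[0-9]+\]' | sed -E 's/^\[ref=(e[0-9]+)\]$/@\1/'
}

ref_for() {
  # $1 = literal text to match on a snapshot line, e.g. 'button "Submit"'.
  # Prints the ref of the first matching line, or nothing if none match.
  grep -F -- "$1" | extract_refs | head -n 1
}
```

Under the same assumption, `agent-browser snapshot -i | ref_for 'button "Submit"'` would print the ref to pass to `click`.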
CSS Selectors
agent-browser click "#submit"
agent-browser fill ".email-input" "test@test.com"
Semantic Locators
agent-browser find role button click --name "Submit"
agent-browser find label "Email" fill "test@test.com"
agent-browser find testid "login-btn" click
Examples
Login Flow
agent-browser open https://example.com/login
agent-browser snapshot -i
agent-browser fill @e2 "user@example.com"
agent-browser fill @e3 "password123"
agent-browser click @e4
agent-browser wait --url "**/dashboard"
Form Submission
agent-browser open https://example.com/contact
agent-browser snapshot -i
agent-browser fill @e1 "John Doe"
agent-browser fill @e2 "john@example.com"
agent-browser fill @e3 "Hello, this is my message"
agent-browser click @e4
agent-browser wait --text "Thank you"
Data Extraction
agent-browser open https://example.com/products
agent-browser snapshot -i
# Iterate through product refs
agent-browser get text @e1 # Product name
agent-browser get text @e2 # Price
agent-browser get attr @e3 href # Link
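The "iterate through product refs" step can be sketched as a small loop. This is an illustration only: `get_texts` is a hypothetical helper, and `AGENT_BROWSER` is an assumed override variable (not an agent-browser feature) that lets the binary be swapped, for example for a dry run.

```shell
# Hypothetical helper (not part of agent-browser).
# AGENT_BROWSER is an assumed override; it defaults to the real CLI.

get_texts() {
  # Run `get text` once per ref passed as an argument, in order.
  for ref in "$@"; do
    "${AGENT_BROWSER:-agent-browser}" get text "$ref"
  done
}
```

A call such as `get_texts @e1 @e2` issues the same `get text` commands as the example above, using refs taken from a fresh snapshot.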
Multi-Session (Swarm)
# Session 1: Navigator
agent-browser --session nav open https://example.com
agent-browser --session nav state save auth.json
# Session 2: Scraper (uses same auth)
agent-browser --session scrape state load auth.json
agent-browser --session scrape open https://example.com/data
agent-browser --session scrape snapshot -i
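Repeating `--session <name>` on every command is easy to get wrong in longer scripts. A minimal sketch of a wrapper follows. `in_session` is a hypothetical helper name, and `AGENT_BROWSER` is an assumed override variable for substituting the binary. Neither comes from agent-browser itself.

```shell
# Hypothetical helper (not part of agent-browser).
# AGENT_BROWSER is an assumed override; it defaults to the real CLI.

in_session() {
  # $1 = session name; remaining arguments = the command to run in it.
  session=$1
  shift
  "${AGENT_BROWSER:-agent-browser}" --session "$session" "$@"
}
```

With this in place, `in_session scrape snapshot -i` expands to the same command as the last line of the example above.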
Integration with Claude Flow
MCP Tools
All browser operations are available as MCP tools with browser/ prefix:
- browser/open
- browser/snapshot
- browser/click
- browser/fill
- browser/screenshot
- etc.
Memory Integration
# Store successful patterns
npx @claude-flow/cli memory store --namespace browser-patterns --key "login-flow" --value "snapshot->fill->click->wait"
# Retrieve before similar task
npx @claude-flow/cli memory search --query "login automation"
Hooks
# Pre-browse hook (get context)
npx @claude-flow/cli hooks pre-edit --file "browser-task.ts"
# Post-browse hook (record success)
npx @claude-flow/cli hooks post-task --task-id "browse-1" --success true
Tips
- Always use snapshots - They're optimized for AI with refs
- Prefer `-i` flag - Gets only interactive elements, smaller output
- Use refs, not selectors - More reliable, deterministic
- Re-snapshot after navigation - Page state changes
- Use sessions for parallel work - Each session is isolated
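The re-snapshot tip can be folded into a helper so that refs are always refreshed after an action. A minimal sketch, assuming the commands behave as listed in the Quick Reference. `click_then_snapshot` is a hypothetical name, and `AGENT_BROWSER` is an assumed override variable for substituting the binary.

```shell
# Hypothetical helper (not part of agent-browser).
# AGENT_BROWSER is an assumed override; it defaults to the real CLI.

click_then_snapshot() {
  # $1 = ref or selector to click, e.g. @e2.
  # The snapshot runs only if the click command succeeds.
  "${AGENT_BROWSER:-agent-browser}" click "$1" &&
    "${AGENT_BROWSER:-agent-browser}" snapshot -i
}
```

Refs from a snapshot taken before the click should be treated as stale once the page changes, which is why the helper prints a new interactive snapshot right after the click.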