browser-automation

Overview

Browser automation skill with two approaches:

agent-browser - Snapshot-based interaction model optimized for AI agents

  • Compact element refs (@e1, @e2) reduce token usage dramatically
  • Workflow: open → snapshot -i → interact with refs → re-snapshot
  • Best for: dynamic exploration, form filling, scraping with unknown structure

playwright - Direct Playwright CLI and Node.js scripts

  • Full Playwright API access via scripts
  • Codegen for recording interactions
  • Best for: scripted automation, testing, batch operations, complex workflows

Sub-skills

CRITICAL: You MUST load the appropriate sub-skill from the sub-skills/ directory based on user intent.

When to use each

| Sub-skill | When to use | Triggers |
| --- | --- | --- |
| agent-browser.md | Interactive exploration, AI-driven navigation, unknown page structure | "navigate to", "fill this form", "click the button", "scrape this page", "explore the site" |
| playwright.md | Scripted automation, testing, batch screenshots, codegen | "write a script", "generate test", "batch screenshot", "record my actions", "create automation script" |

Default behavior

  • If user intent is unclear, prefer agent-browser for interactive tasks
  • If user asks for "a script" or "automation code", use playwright
  • If user mentions "codegen" or "record", use playwright
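
The routing rules above can be sketched as a small shell function. This is only an illustration: the function name `choose_subskill` and the plain substring matching are assumptions, not part of the skill, and the keywords are taken from the triggers and defaults listed in this section.

```shell
#!/bin/sh
# Hypothetical helper: map a free-text request to a sub-skill file name.
# Keyword matching is a rough approximation of the rules in this section.
choose_subskill() {
  # Lowercase the request so matching is case-insensitive.
  request=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$request" in
    *script*|*"automation code"*|*codegen*|*record*|*"generate test"*|*"batch screenshot"*)
      # Explicit script, codegen, or recording intent goes to playwright.
      echo "playwright.md" ;;
    *)
      # Unclear or interactive intent falls back to agent-browser.
      echo "agent-browser.md" ;;
  esac
}
```

For example, `choose_subskill "record my actions"` prints `playwright.md`, while `choose_subskill "fill this form"` prints `agent-browser.md`. Substring matching is crude (a request about "records" would also match), so treat the result as a first guess rather than a decision.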

Process

  1. Determine user intent from their request
  2. Load the appropriate sub-skill from sub-skills/
  3. Execute the sub-skill process
  4. Verify expected outcome was achieved

Resources

  • sub-skills/: Approach-specific instructions
    • agent-browser.md: Snapshot/refs workflow with npx agent-browser
    • playwright.md: Playwright CLI and Node.js scripts
  • references/agent-browser/: Deep-dive documentation for agent-browser
  • templates/agent-browser/: Ready-to-use shell scripts for agent-browser

Quick reference

agent-browser (default for interactive tasks)

# Session isolation: generate a random slug (e.g. bright-falcon) and reuse it for every command
npx agent-browser --session <slug> open https://example.com
npx agent-browser --session <slug> snapshot -i
npx agent-browser --session <slug> click @e1
npx agent-browser --session <slug> fill @e2 "text"
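
One way to produce the session slug and avoid retyping it is sketched below. The word lists, the helper names `pick_word`, `make_slug`, and `ab`, and the `SESSION` variable are all hypothetical; only the `npx agent-browser --session <slug> ...` invocation comes from the commands above.

```shell
#!/bin/sh
# Pick one of the arguments at random, using two bytes from /dev/urandom.
pick_word() {
  n=$(od -An -N2 -tu2 /dev/urandom | tr -dc '0-9')
  shift $(( n % $# ))
  printf '%s' "$1"
}

# Build a slug such as "bright-falcon" from one adjective and one noun.
# The word lists are placeholders; any short, readable words would do.
make_slug() {
  printf '%s-%s\n' \
    "$(pick_word bright calm swift quiet bold)" \
    "$(pick_word falcon otter maple comet harbor)"
}

# Generate the slug once, then reuse it so all commands share one session.
SESSION=${SESSION:-$(make_slug)}

# Thin wrapper that pins every call to the same session.
ab() {
  npx agent-browser --session "$SESSION" "$@"
}

# Usage (same commands as the quick reference):
#   ab open https://example.com
#   ab snapshot -i
#   ab click @e1
```

Keeping the slug in a variable means a second task can run in another shell with its own `SESSION` value without sharing a session with the first.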

playwright (for scripts and codegen)

# Quick screenshot
npx playwright screenshot https://example.com output.png

# Record interactions as code
npx playwright codegen https://example.com

# PDF generation
npx playwright pdf https://example.com output.pdf
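
For the batch screenshot case, a minimal sketch is to loop the screenshot command over a list of URLs. The helper names `url_to_filename` and `batch_screenshot` and the file-naming scheme are assumptions made for illustration; the `npx playwright screenshot <url> <file>` call is the one shown above.

```shell
#!/bin/sh
# Turn a URL into a safe PNG file name: drop the scheme, collapse every
# run of non-alphanumeric characters to "-", trim a trailing "-".
url_to_filename() {
  printf '%s\n' "$1" |
    sed -E 's#^[a-z]+://##; s#[^A-Za-z0-9]+#-#g; s#-+$##; s#$#.png#'
}

# Take one screenshot per URL passed as an argument.
batch_screenshot() {
  for url in "$@"; do
    npx playwright screenshot "$url" "$(url_to_filename "$url")"
  done
}

# Usage:
#   batch_screenshot https://example.com https://example.com/docs/
```

With this naming scheme, `https://example.com/docs/` is saved as `example-com-docs.png`. Two URLs that differ only in punctuation would map to the same file, so a real script may need a more careful naming rule.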