skills/ofershap/real-browser-mcp/browser-automation

browser-automation

Installation
SKILL.md

Browser Automation with Real Browser MCP

Use this skill when you need to interact with the user's actual browser - clicking, typing, reading pages, taking screenshots, or navigating.

Before You Start

  1. Verify the extension is connected: try browser_tabs with action "list" first
  2. If disconnected, ask the user to check the extension icon (should show green "ON")
  3. Never close tabs you didn't create

Reading Pages

Start with browser_snapshot to get the accessibility tree. This gives you refs like "e12" that you use for interaction.

For large pages, scope with a CSS selector: browser_snapshot with selector: "main" or selector: ".content".

Use browser_text to extract raw text when you need the full content.

Interacting

Always snapshot first, then use refs:

  • browser_click with ref: "e12" to click
  • browser_type with ref: "e5" and text: "hello" to type
  • browser_press_key with key: "Enter" to submit
  • browser_scroll with direction: "down" to scroll

Dynamic Content (SPAs, social media)

  1. browser_scroll down to load more content
  2. browser_wait with a selector for lazy-loaded elements
  3. Snapshot again after scrolling - refs are regenerated
  4. For virtual scroll containers (Twitter feeds, Reddit), pass the container's CSS selector to browser_scroll

Debugging

  • browser_console reads console.log/warn/error output
  • browser_network shows XHR/fetch requests with status codes
  • browser_screenshot captures what the user sees

Common Mistakes

  • Using stale refs after navigation or scroll (always re-snapshot)
  • Trying to click elements in iframes (scope snapshot to the iframe)
  • Not waiting for page load after navigation
Weekly Installs
1
GitHub Stars
4
First Seen
Mar 5, 2026
Installed on
mcpjam1
claude-code1
replit1
junie1
windsurf1
zencoder1