browser-cdp
Browser CDP
Control a real browser through a Chrome DevTools Protocol proxy.
Overview
This skill provides browser automation via a lightweight HTTP proxy that wraps CDP. The proxy exposes REST endpoints for navigation, screenshots, JS evaluation, clicking, and more — no Playwright/Puppeteer dependency needed.
Prerequisites
Install the required Python dependency:
pip install psutil
A CDP proxy must be running on http://localhost:3456. Start it from the repository root with:
python3 skills/browser-cdp/scripts/cdp_proxy.py
This launches Chrome/Edge with remote debugging enabled and proxies CDP commands over HTTP.
When to Use
✅ USE this skill when:
- "Open this URL and tell me what's on the page"
- "Take a screenshot of the current page"
- "Run this JavaScript on the page"
- "Click the button that says..."
- "Search for and install a Chrome extension"
- "Log into this site and do something"
- Any task requiring a real browser context
❌ DON'T use this skill when:
- Simple HTTP API calls → use
curldirectly - Downloading files → use
curl -O - Parsing HTML from a saved file → use
python3with BeautifulSoup - No CDP proxy running → ask the user to start it first
API Reference
All endpoints are relative to http://localhost:3456.
GET /targets
List all open browser tabs.
curl -s http://localhost:3456/targets | python3 -m json.tool
Response:
[
{ "id": "ABC123", "title": "Google", "url": "https://google.com" }
]
GET /navigate?url=
Navigate a tab to a URL. Uses the most recently created tab, or specify ?target=<targetId>.
curl -s "http://localhost:3456/navigate?url=https://example.com"
GET /screenshot
Take a PNG screenshot of the current page.
# Save to file
curl -s -o screenshot.png http://localhost:3456/screenshot
POST /eval
Execute JavaScript in the page. The request body is plain text (not JSON), sent as Content-Type: text/plain.
curl -s -X POST http://localhost:3456/eval \
-H "Content-Type: text/plain" \
-d "document.title"
For multi-line scripts, pipe from stdin or use a heredoc:
curl -s -X POST http://localhost:3456/eval \
-H "Content-Type: text/plain" \
-d "JSON.stringify(Array.from(document.querySelectorAll('a')).map(a => ({text: a.innerText, href: a.href})))"
GET /click?selector=
Click an element matching a CSS selector.
curl -s "http://localhost:3456/click?selector=%23submit-btn"
GET /new
Open a new browser tab and return its target ID.
curl -s http://localhost:3456/new
Response:
{ "id": "NEW_TAB_ID", "title": "about:blank", "url": "about:blank" }
Common Workflows
Navigate and extract page content
# Open a page
curl -s "http://localhost:3456/navigate?url=https://example.com"
# Extract all text content
curl -s -X POST http://localhost:3456/eval \
-H "Content-Type: text/plain" \
-d "document.body.innerText"
# Extract all links
curl -s -X POST http://localhost:3456/eval \
-H "Content-Type: text/plain" \
-d "JSON.stringify([...document.querySelectorAll('a')].map(a => ({text: a.textContent.trim(), href: a.href})))"
Take a screenshot
curl -s "http://localhost:3456/navigate?url=https://example.com"
curl -s -o page.png http://localhost:3456/screenshot
Search and install a Chrome extension
# Search the Chrome Web Store (no login required for search)
curl -s "http://localhost:3456/navigate?url=https://chromewebstore.google.com/search/example%20extension"
# Extract extension IDs from search results
curl -s -X POST http://localhost:3456/eval \
-H "Content-Type: text/plain" \
-d "JSON.stringify([...document.querySelectorAll('a[data-id]')].map(a => ({id: a.dataset.id, title: a.textContent.trim()})))"
# Install an extension (requires the extension ID)
curl -s "http://localhost:3456/navigate?url=https://chromewebstore.google.com/detail/<extension-id>"
# Then click the "Add to Chrome" button
curl -s "http://localhost:3456/click?selector=%5Bdata-id%3Dinstall-button%5D"
Fill a form and submit
# Navigate to the form
curl -s "http://localhost:3456/navigate?url=https://example.com/login"
# Fill in fields
curl -s -X POST http://localhost:3456/eval \
-H "Content-Type: text/plain" \
-d "document.querySelector('#username').value = 'myuser'"
curl -s -X POST http://localhost:3456/eval \
-H "Content-Type: text/plain" \
-d "document.querySelector('#password').value = 'mypass'"
# Submit
curl -s "http://localhost:3456/click?selector=%23login-form+%3E+button"
Notes
- The CDP proxy must be running before using any commands
- If the proxy is not running, ask the user to start it:
python3 skills/browser-cdp/scripts/cdp_proxy.py - Use URL encoding for query parameters with special characters
- The
/evalendpoint returns the result of the last expression (like a REPL) - Screenshots are returned as PNG binary data
- For complex multi-step interactions, chain
/evaland/clickcalls - The proxy supports a
?target=<targetId>parameter on most endpoints to target a specific tab
More from linuxhsj/openclaw-zero-token
imsg
iMessage/SMS CLI for listing chats, history, and sending messages via Messages.app.
23obsidian
Work with Obsidian vaults (plain Markdown notes) and automate via obsidian-cli.
23coding-agent
Delegate coding tasks to Codex, Claude Code, or Pi agents via background process. Use when: (1) building/creating new features or apps, (2) reviewing PRs (spawn in temp dir), (3) refactoring large codebases, (4) iterative coding that needs file exploration. NOT for: simple one-liner fixes (just edit), reading code (use read tool), thread-bound ACP harness requests in chat (for example spawn/run Codex or Claude Code in a Discord thread; use sessions_spawn with runtime:"acp"), or any work in ~/clawd workspace (never spawn agents here). Claude Code: use --print --permission-mode bypassPermissions (no PTY). Codex/Pi/OpenCode: pty:true required.
22sherpa-onnx-tts
Local text-to-speech via sherpa-onnx (offline, no cloud)
22sonoscli
Control Sonos speakers (discover/status/play/volume/group).
20skill-creator
Create, edit, improve, or audit AgentSkills. Use when creating a new skill from scratch or when asked to improve, review, audit, tidy up, or clean up an existing skill or SKILL.md file. Also use when editing or restructuring a skill directory (moving files to references/ or scripts/, removing stale content, validating against the AgentSkills spec). Triggers on phrases like "create a skill", "author a skill", "tidy up a skill", "improve this skill", "review the skill", "clean up the skill", "audit the skill".
20