browse
browse: Browser & Native App Automation for AI Agents
Target Decision — ALWAYS check this first
Before running any browse command, decide the correct target:
| User wants to... | Target | Command pattern |
|---|---|---|
| Open a URL, test a website, scrape web content | Browser (default) | browse goto <url> |
| Test a local dev server (localhost) | Browser | browse goto http://localhost:3000 |
| Browse a site that blocks bots (Cloudflare, Turnstile) | Camoufox | browse --runtime camoufox --headed goto <url> |
| Browse with a specific camoufox fingerprint profile | Camoufox | browse --runtime camoufox --camoufox-profile <name> --headed goto <url> |
| Search Google, YouTube, Amazon, etc. | Browser | browse goto @google "query" |
| Interact with an iOS app (Settings, Safari, custom app) | iOS Simulator | browse --platform ios --app <bundleId> <cmd> |
| Interact with an Android app (Settings, Chrome, custom app) | Android Emulator | browse --platform android --app <package> <cmd> |
| Interact with a macOS desktop app (System Settings, TextEdit) | macOS App | browse --app <name> <cmd> |
| Install and test an iOS .app or .ipa file | iOS Simulator | browse sim start --platform ios --app ./MyApp.app --visible |
| Install and test an Android .apk file | Android Emulator | browse sim start --platform android --app ./app.apk --visible |
Key rules:
- No `--platform` or `--app` flag → browser target (Chromium). Use `goto` to navigate.
- `--runtime camoufox --headed` → anti-detection Firefox. Use when a site blocks normal browsing. See the `/browse-stealth` skill for Turnstile/CAPTCHA bypass patterns.
- `@macro` in a goto URL → search macro expansion. `browse goto @google "query"` expands to a Google search URL. 14 macros: @google, @youtube, @amazon, @reddit, @wikipedia, @twitter, @yelp, @spotify, @netflix, @linkedin, @instagram, @tiktok, @twitch, @reddit_subreddit.
- `--app` without `--platform` → macOS app automation. App must be running.
- `--platform ios --app` → iOS Simulator. Use `browse sim start` first if not running.
- `--platform android --app` → Android Emulator. Use `browse sim start` first if not running.
- Native app targets do NOT support: `goto`, `js`, `eval`, `tabs`, `cookies`, `route`, `har`. These are browser-only.
- All targets support: `snapshot`, `text`, `tap`, `fill`, `type`, `press`, `swipe`, `screenshot`.
- If a site blocks you, switch to `--runtime camoufox --headed`. If still blocked, use `/browse-stealth` for the full Turnstile bypass pattern.
- If unsure which target to use, ASK the user. Don't guess: wrong target = wasted work.
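The target rules can be sketched as a small dispatcher. `browse_on` and its target names (`browser`, `ios`, `android`, `macos`) are hypothetical helpers invented for illustration and are not part of the browse CLI; only the flags and the browser-only command list come from the rules in this section.

```shell
# Sketch only: browse_on is a hypothetical wrapper, not a browse feature.
# Usage: browse_on browser <cmd> [args...]
#        browse_on <ios|android|macos> <app> <cmd> [args...]
browse_on() {
  target=$1; shift
  case "$target" in
    browser) ;;
    ios|android|macos) app=$1; shift ;;
    *) echo "unknown target: $target" >&2; return 2 ;;
  esac

  # Native app targets do not support browser-only commands.
  if [ "$target" != browser ]; then
    case "$1" in
      goto|js|eval|tabs|cookies|route|har)
        echo "'$1' is browser-only; not supported on $target" >&2
        return 2 ;;
    esac
  fi

  # Always call browse as a bare command on PATH.
  case "$target" in
    browser) browse "$@" ;;
    ios)     browse --platform ios --app "$app" "$@" ;;
    android) browse --platform android --app "$app" "$@" ;;
    macos)   browse --app "$app" "$@" ;;
  esac
}
```

With this sketch, `browse_on ios com.apple.Preferences snapshot -i` runs the iOS form of the command, while `browse_on android com.android.settings goto <url>` is rejected before anything is sent to the emulator.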
Goal
Use the persistent browse CLI to:
- navigate real pages
- inspect rendered content and state
- interact with UI elements
- capture screenshots, console logs, and network activity
- automate native apps (iOS, Android, macOS) via accessibility APIs
- verify browser or app behavior end-to-end without re-launching every step
Step 0: Verify availability and choose the browsing mode
Start by checking:
browse --version
If browse is not installed:
- stop
- tell the user it is required
- point them to the install path in `references/commands.md`
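A minimal sketch of the availability check, assuming a POSIX shell. `require_browse` is a hypothetical helper name; the only browse command it runs is `browse --version`.

```shell
# Sketch only: stop early when the browse CLI is missing.
require_browse() {
  if ! command -v browse >/dev/null 2>&1; then
    echo "browse is required but not installed; see references/commands.md for the install path" >&2
    return 1
  fi
  browse --version
}
```

Calling `require_browse || exit 1` at the top of a script stops the run and tells the user what is missing, rather than attempting an install.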
Then decide what kind of session you need:
- default session for normal single-agent work
- `--session <id>` for parallel agent isolation
- `--profile <name>` for persistent browser identity
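As an illustration of session isolation, the sketch below pins every command from one agent to its own session. `agent_browse` and the agent IDs are hypothetical, and placing `--session` before the subcommand is an assumption based on how `--runtime` and `--platform` are placed elsewhere in this document.

```shell
# Sketch only: agent_browse is a hypothetical helper.
# Usage: agent_browse <agent-id> <cmd> [args...]
agent_browse() {
  agent=$1; shift
  browse --session "$agent" "$@"
}
```

Two agents calling `agent_browse agent-a goto <url>` and `agent_browse agent-b goto <url>` would then work in separate sessions instead of sharing the default one.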
For native app targets, start the simulator/emulator first:
browse sim start --platform ios --app com.apple.Preferences --visible
browse sim start --platform android --app com.android.settings --visible
browse enable android # first-time only: auto-installs adb, JDK, SDK, emulator
browse enable ios # first-time only: builds iOS runner (needs Xcode)
browse enable macos # first-time only: builds browse-ax bridge
Success criteria: browse is available, the target (browser or native app) is decided, and the session/profile choice fits the task.
Step 1: Navigate safely and stabilize the page
Use browse goto <url> to navigate.
After navigation, always stabilize before reading or interacting:
- `browse wait --network-idle` for typical pages and SPAs
- or a more specific `browse wait` condition when the page has a known signal
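The navigate-then-stabilize pattern can be sketched as one helper. `goto_stable` is a hypothetical name; it uses only `browse goto` and `browse wait --network-idle` from this step, and assumes `browse goto` exits non-zero when navigation fails.

```shell
# Sketch only: navigate, then wait before reading or interacting.
# Usage: goto_stable <url>
goto_stable() {
  browse goto "$1" || return 1     # assumed: non-zero exit on navigation failure
  browse wait --network-idle       # stabilize typical pages and SPAs
}
```

For a page with a known readiness signal, swap the second line for the more specific `browse wait` condition.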
Important rules:
- call `browse` as a bare command on PATH
- do not use shell variables for browse command prefixes
- avoid `#id` CSS selectors; prefer `[id=foo]`
- if the page is untrusted, consider `--content-boundaries` and `--allowed-domains`
Success criteria: The page is loaded enough that content and interactive state are reliable.
Step 2: Choose the cheapest effective inspection method
Use the lightest command that answers the question:
- `text` for cleaned page content
- `links` for navigation structure
- `js` for precise targeted extraction
- `console`, `errors`, and `network` for runtime debugging
- `snapshot -i` for interactive elements and stable refs
Prefer snapshot -i before guessing selectors for interaction-heavy tasks.
Load:
- `references/commands.md` for exact command syntax
- `references/guides.md` for command selection guidance and speed rules
Success criteria: You have the information needed without spending unnecessary tokens or using brittle selectors.
Step 3: Interact using refs first, selectors second
For clicks, fills, checks, selects, and similar actions:
- prefer `browse snapshot -i`
- interact using `@eN` refs
- fall back to CSS selectors only when refs are unavailable or impractical
After navigation or DOM refresh:
- assume refs may be invalid
- take a fresh snapshot before continuing
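A sketch of the ref-first flow, under stated assumptions: the refs come from a `browse snapshot -i` taken just before the call, and the argument order shown for `fill` (ref, then value) and `tap` (ref) is assumed rather than confirmed here; check `references/commands.md` for the exact syntax. `fill_and_tap` and refs such as `@e3` are hypothetical.

```shell
# Sketch only: interact via refs, then refresh them.
# Usage: fill_and_tap <field-ref> <value> <button-ref>
fill_and_tap() {
  browse fill "$1" "$2" || return 1   # assumed argument order: ref, then value
  browse tap "$3" || return 1
  browse snapshot -i                  # refs may be invalid after the DOM changes
}
```

The trailing snapshot is the important part: any refs used after this call should come from its output, not from the earlier snapshot.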
Rules:
- use descriptive screenshots saved under `.browse/sessions/<id>/`
- keep stateful flows in the same session unless isolation is intentional
- use `frame` before interacting with iframe content
Success criteria: Interactions are stable and tied to the current rendered page state.
Step 4: Debug blockers and special cases
When things go wrong:
- use `console` and `errors` for page/runtime issues
- use `network` for request visibility
- use `route` or `offline` only when the task requires mock or failure-mode testing
- use headed/browser handoff only for real blockers like CAPTCHA, MFA, or OAuth walls
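When a page misbehaves, the three read-only debugging commands can be gathered in one pass. `collect_debug` and the section labels are hypothetical; the sketch assumes `console`, `errors`, and `network` run without extra arguments.

```shell
# Sketch only: dump runtime diagnostics for the current page.
collect_debug() {
  echo "== console =="; browse console
  echo "== errors ==";  browse errors
  echo "== network =="; browse network
}
```

This keeps `route` and `offline` out of the default debugging path, so they are used only when a task actually needs mocking or failure-mode testing.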
If you hit a blocker after a couple of failed attempts:
- load `references/guides.md`
- follow the handoff protocol exactly
- use `AskUserQuestion` before any human takeover flow
Success criteria: Blockers are either resolved or escalated with the correct handoff protocol.
Step 5: Capture evidence and report clearly
When the task involves verification, capture the minimum evidence needed:
- relevant page text or structured extraction
- screenshot path when visuals matter
- console/network findings when debugging
- the exact step or selector/ref that failed when reporting issues
Report:
- what you navigated to
- what actions you performed
- what the page actually did
- any artifacts created such as screenshots, HAR, or video
Success criteria: Another engineer can understand the observed browser behavior without rerunning the whole flow blindly.
Important Rules
- The browser persists between commands; cookies, tabs, and session state carry over.
- After `goto`, wait before reading content or acting.
- `snapshot -i` is the default interaction surface.
- Save screenshots under `.browse/sessions/<session-id>/` or `.browse/sessions/default/`.
- Use `--context delta` for an ARIA diff with refs, `--context full` for a complete snapshot with refs after write commands.
- Do not install anything automatically.
- Do not modify Claude settings automatically; if the user wants pre-allowed browse permissions, point them to `references/permissions.md`.
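The screenshot location rule can be sketched as a path builder. `shot_path` is a hypothetical helper and the `.png` extension is an assumption; the directory layout is the one stated in the rules above.

```shell
# Sketch only: build a descriptive screenshot path inside the session directory.
# Usage: shot_path <descriptive-name> [session-id]
shot_path() {
  printf '.browse/sessions/%s/%s.png\n' "${2:-default}" "$1"
}
```

The result is meant to be handed to `browse screenshot`; how that command takes an output path is documented in `references/commands.md` and is not assumed here.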
When To Load References
- `references/commands.md`: use for exact command syntax, flags, and extended examples.
- `references/guides.md`: use for speed rules, command-choice guidance, architecture notes, and the mandatory CAPTCHA/MFA handoff protocol.
- `references/permissions.md`: use when the user wants to pre-allow browse commands in Claude settings.
Guardrails
- Do not add `disable-model-invocation`; this is a general-purpose browser verification skill.
- Do not add `context: fork`; browser results are usually needed in the current flow.
- Do not add `paths:`; this is a generic workflow skill.
- Do not keep the full CLI manual inline in `SKILL.md`.
- Do not run `browse handoff` without explicit user confirmation.
- Do not save screenshots outside the browse session directories.
Runtime Selection
By default, browse uses Chromium via Playwright. Alternative runtimes:
| Runtime | Engine | Use case | Install |
|---|---|---|---|
| `playwright` (default) | Chromium | General browsing, testing | Included |
| `camoufox` | Firefox (anti-detection) | Sites with bot detection | `npm install camoufox-js && npx camoufox-js fetch` |
| `rebrowser` | Chromium (stealth) | Alternative stealth approach | `npm install rebrowser-playwright` |
| `lightpanda` | Lightpanda | Fast headless rendering | See lightpanda.io |
| `chrome` | System Chrome | Use real Chrome with extensions | Chrome must be installed |
browse --runtime camoufox --headed goto https://protected-site.com
BROWSE_RUNTIME=camoufox browse goto https://example.com
New Features
Search Macros
browse goto @google "best coffee beans" # Google search
browse goto @youtube "tutorial" # YouTube search
browse goto @amazon "laptop" # Amazon search
browse goto @reddit "programming" # Reddit search
All macros: @google, @youtube, @amazon, @reddit, @reddit_subreddit, @wikipedia, @twitter, @yelp, @spotify, @netflix, @linkedin, @instagram, @tiktok, @twitch
Safety Flags (opt-in features)
| Flag | Default | What it does |
|---|---|---|
| `BROWSE_CONSENT_DISMISS=1` | OFF | Auto-dismiss cookie banners after navigation |
| `BROWSE_CLICK_FORCE=1` or `--force` | OFF | Force-click through overlay interception |
| `BROWSE_READINESS=1` or `--ready` | OFF | Wait for hydration after goto |
| `BROWSE_SERP_FASTPATH=1` or `--serp` | OFF | Google SERP DOM extraction (fast, no refs) |
| `BROWSE_COMMAND_LOCK=0` | ON | Disable per-session command serialization |
| `BROWSE_CAMOUFOX_PROFILE=<name>` | OFF | Use a named camoufox profile (`.browse/camoufox-profiles/<name>.json`) |
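Because these flags are opt-in, one way to enable them is per command, in the same style as the `BROWSE_RUNTIME=camoufox browse goto ...` example in this document. The sketch below is illustrative: `goto_ready` is a hypothetical helper, and combining these two flags is a choice made for the example, not a recommendation from the source.

```shell
# Sketch only: opt in to consent dismissal and hydration wait for one navigation.
# Usage: goto_ready <url>
goto_ready() {
  BROWSE_CONSENT_DISMISS=1 BROWSE_READINESS=1 browse goto "$1"
}
```

Setting the variables inline scopes them to a single invocation, so later commands in the same session keep the default (OFF) behavior.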
New Commands
| Command | Description |
|---|---|
| `images [sel] [--limit N] [--inline]` | List page images with src/alt/dimensions |
| `youtube-transcript <url> [--lang en]` | Extract YouTube captions via yt-dlp or browser |
| `schema` | Extract JSON-LD, Microdata, RDFa structured data (parsed JSON) |
| `meta` | Extract page meta tags (title, description, canonical, OG, Twitter, hreflang, robots, viewport) |
| `headings` | Extract H1-H6 heading hierarchy with counts and indented tree |
| `profiles` | List available camoufox profiles from `.browse/camoufox-profiles/` |
Snapshot Windowing
Large snapshots (>80K chars) are automatically paginated:
browse snapshot -i # first page
browse snapshot -i --offset 500 # next page (line offset from previous output)
Output Contract
Report:
- the page or flow tested
- the session/profile mode used if relevant
- the key commands or interactions performed
- the observed result
- any artifacts or blockers such as screenshots, console errors, network failures, or handoff state