Grasp — Browser Automation

Grasp gives the AI a persistent Chrome profile (chrome-grasp). Log in once; sessions survive every run.

Prerequisites

Before any action, verify Chrome is reachable:

get_status  →  check "connected: true"

If not connected, ask the user to run:

npx grasp
# or: grasp connect
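
The check above can be sketched in Python. This is a minimal sketch under assumptions: `call_tool` is a hypothetical callable standing in for whatever client invokes Grasp tools, and the `get_status` result is assumed to be a dict with a boolean `connected` key (the text above only says to check "connected: true").

```python
from typing import Any, Callable, Dict

# Hypothetical transport: any callable that takes a tool name plus keyword
# arguments and returns the tool's result as a dict.
ToolCaller = Callable[..., Dict[str, Any]]


def ensure_connected(call_tool: ToolCaller) -> None:
    """Raise if get_status does not report a live Chrome connection."""
    status = call_tool("get_status")
    # Assumed result shape: {"connected": bool, ...}
    if status.get("connected") is not True:
        raise RuntimeError(
            "Chrome is not reachable. Ask the user to run `npx grasp` "
            "(or `grasp connect`), then retry."
        )


def fake_call_tool(name: str, **kwargs: Any) -> Dict[str, Any]:
    """Stand-in for a real client, used only to make the sketch runnable."""
    return {"connected": True} if name == "get_status" else {}


ensure_connected(fake_call_tool)  # passes silently when connected
```

Raising early keeps later tool calls from failing with less obvious errors when Chrome is not attached.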

Core Pattern (3 steps)

1. navigate(url)           → land on page
2. get_hint_map()          → see what's interactable
3. click(hintId) / type()  → act

Repeat steps 2–3 until the task is done. Use get_page_summary or screenshot to verify results.

Re-scan rule: Call get_hint_map again after every navigation, any click that loads a new page, and any DOM change. Hint IDs from an earlier scan are invalid once the page updates.
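
The loop and the re-scan rule can be sketched as follows. This is illustrative only: `call_tool` is a hypothetical callable, the hint map is assumed to be a list of dicts with `id` and `label` keys (the real shape may differ; see references/tools.md), and `FakeBrowser` is an in-memory stand-in that invalidates hint IDs after every click.

```python
from typing import Any, Callable, Dict, List

# Hypothetical transport: a callable taking a tool name plus keyword arguments.
ToolCaller = Callable[..., Any]


def click_in_order(call_tool: ToolCaller, url: str, labels: List[str]) -> List[str]:
    """Navigate, then click each label in turn, re-scanning before every click."""
    call_tool("navigate", url=url)
    clicked: List[str] = []
    for label in labels:
        # Re-scan rule: never reuse hint IDs from before the previous action.
        hints: List[Dict[str, str]] = call_tool("get_hint_map")
        match = next((h for h in hints if h["label"] == label), None)
        if match is None:
            break  # target not on the page; stop and let the caller inspect
        call_tool("click", hintId=match["id"])
        clicked.append(match["id"])
    return clicked


class FakeBrowser:
    """In-memory stand-in whose hint IDs change after every page update."""

    def __init__(self) -> None:
        self.generation = 0

    def __call__(self, name: str, **kwargs: Any) -> Any:
        if name == "navigate":
            self.generation += 1
            return {}
        if name == "get_hint_map":
            g = self.generation
            return [
                {"id": f"g{g}-a", "label": "Next"},
                {"id": f"g{g}-b", "label": "Submit"},
            ]
        if name == "click":
            if not kwargs["hintId"].startswith(f"g{self.generation}-"):
                raise ValueError("stale hint ID")
            self.generation += 1  # the click updated the page
            return {}
        raise KeyError(name)


browser = FakeBrowser()
clicked_ids = click_in_order(browser, "https://example.com", ["Next", "Submit"])
print(clicked_ids)  # ['g1-a', 'g2-b']
```

Scanning inside the loop, rather than once up front, is what keeps the second click from using an ID issued before the first click changed the page.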

Hint Map vs Screenshot

| Use `get_hint_map` | Use `screenshot` |
|---|---|
| Finding what to click/type | Verifying visual result |
| Navigation and interaction | CAPTCHA / visual-only content |
| Token-efficient perception | Confirming layout after action |

A hint map uses 90%+ fewer tokens than raw HTML or screenshot OCR.

Execution Modes

Standard mode (most pages): Hint Map + real input events via CDP.

WebMCP mode (pages exposing window.__webmcp__): navigate detects this automatically. Use call_webmcp_tool for native API calls; get_status shows the current mode.

Safety Mode

High-risk clicks (destructive buttons, payment confirms) are intercepted automatically when GRASP_SAFE_MODE=true (default). Use confirm_click(hintId) to proceed after reviewing.
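
A review-then-confirm flow might look like the sketch below. The way an intercepted click is reported is not specified here, so the `intercepted` result field is an assumption made purely for illustration, as are the `call_tool` callable and the `approve` callback (which stands in for asking the user).

```python
from typing import Any, Callable, Dict

# Hypothetical transport: a callable taking a tool name plus keyword arguments.
ToolCaller = Callable[..., Dict[str, Any]]


def guarded_click(
    call_tool: ToolCaller, hint_id: str, approve: Callable[[str], bool]
) -> str:
    """Click; if safe mode intercepts, proceed only when approve() agrees."""
    result = call_tool("click", hintId=hint_id)
    # ASSUMPTION: an intercepted click is signalled by an "intercepted" flag.
    if not result.get("intercepted"):
        return "clicked"
    if not approve(hint_id):
        return "declined"
    call_tool("confirm_click", hintId=hint_id)
    return "confirmed"


def fake_call_tool(name: str, **kwargs: Any) -> Dict[str, Any]:
    """Stand-in client that treats hint "h7" as a high-risk button."""
    if name == "click" and kwargs["hintId"] == "h7":
        return {"intercepted": True}
    return {}


print(guarded_click(fake_call_tool, "h1", lambda _id: False))  # clicked
print(guarded_click(fake_call_tool, "h7", lambda _id: True))   # confirmed
```

Keeping the approval step in a callback means confirm_click is never reached without an explicit decision.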

When Things Go Wrong

| Symptom | Fix |
|---|---|
| `get_hint_map` returns empty | Page still loading: call `get_page_summary` first, then retry |
| Element not found after click | Page navigated: call `get_hint_map` again to re-scan |
| Element exists but not clickable | It may be off-screen: `scroll("down")`, then re-scan |
| `watch_element` times out | Action didn't trigger a DOM change: check with `screenshot` |
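
The first fix in the table (empty hint map, so call `get_page_summary` and retry) can be sketched as a bounded retry. The `call_tool` callable, the `attempts` parameter, and the `SlowPage` stand-in are hypothetical, and the hint map is assumed to be a list that is empty while the page is still loading.

```python
from typing import Any, Callable, Dict, List

# Hypothetical transport: a callable taking a tool name plus keyword arguments.
ToolCaller = Callable[..., Any]


def scan_with_retry(call_tool: ToolCaller, attempts: int = 3) -> List[Dict[str, str]]:
    """Return the hint map, calling get_page_summary between empty scans."""
    hints = call_tool("get_hint_map")
    for _ in range(attempts - 1):
        if hints:
            break
        call_tool("get_page_summary")  # per the table: summary first, then retry
        hints = call_tool("get_hint_map")
    return hints


class SlowPage:
    """Stand-in client whose first few scans come back empty."""

    def __init__(self, empty_scans: int) -> None:
        self.empty_scans = empty_scans
        self.log: List[str] = []

    def __call__(self, name: str, **kwargs: Any) -> Any:
        self.log.append(name)
        if name == "get_hint_map":
            if self.empty_scans > 0:
                self.empty_scans -= 1
                return []
            return [{"id": "h1", "label": "Search"}]
        return {}


page = SlowPage(empty_scans=2)
hints = scan_with_retry(page)
print(len(hints), page.log.count("get_hint_map"))  # 1 3
```

Capping the number of attempts avoids looping forever on a page that genuinely has no interactable elements; an empty result after the cap is a cue to fall back to `screenshot`.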

Full Tool Reference

See references/tools.md for all tools, parameters, and usage notes.