# Browse: Token-Efficient Web Browsing
Guide AI agents through web browsing tasks using the cheapest tool that gets the job done. Every browsing action has a token cost - this skill minimizes it through progressive disclosure, smart format selection, and backend-aware strategies.
Target versions (May 2026):
- Lightpanda: 0.2.9
- @playwright/mcp: 0.0.72
- agent-browser: 0.26.0
## When to use
- Reading a web page, article, or documentation site
- Extracting structured data from a page (prices, tables, metadata)
- Filling forms, clicking buttons, or navigating multi-step flows
- Scraping content from JavaScript-rendered pages (SPAs)
- Browsing behind authentication (login flows)
- Any task where the agent needs to see or interact with a live web page
## When NOT to use
- E2E test automation, test writing, or test debugging - use testing
- Building or debugging MCP servers (including browser MCP servers) - use mcp
- Network configuration, DNS, reverse proxies - use networking
- Fetching API endpoints or REST calls - use curl/fetch directly
- Static file downloads - use curl or wget
- Web scraping specifically for RAG pipelines or training data - use ai-ml for the pipeline
## Tool Selection
Detect what's available and pick the cheapest tool that handles the task.
### Detection

- MCP browsing tools: look for `goto`, `navigate`, `markdown`, `semantic_tree`, `browser_navigate`, `browser_snapshot` in the available tool list
- CLI tools: check for `lightpanda`, `agent-browser` in PATH
- Built-in fetch: WebFetch tool (Claude Code) or platform equivalent
- Fallback: curl via shell
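A quick PATH probe for the CLI tier (a minimal sketch; MCP tools and WebFetch show up in the tool list itself, so only the shell tools need checking):

```bash
# Check which CLI backends are installed before planning the workflow
for tool in lightpanda agent-browser curl; do
  command -v "$tool" >/dev/null 2>&1 && echo "available: $tool" || echo "missing: $tool"
done
```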
### Decision matrix
| Task | No JS needed | JS needed, read-only | JS needed, interactive |
|---|---|---|---|
| Best | WebFetch / curl | Lightpanda fetch | Lightpanda MCP tools |
| Good | Lightpanda fetch | MCP markdown tool | agent-browser CLI |
| Fallback | curl | Playwright MCP | Playwright MCP |
If the page works without JavaScript, don't use a browser. If you only need to read content, don't use interactive tools. Escalate only when the cheaper option fails.
### Task modes
| Mode | Use when | First tool |
|---|---|---|
| Static fetch | Content is in HTML | WebFetch, curl, or Lightpanda fetch |
| JS read-only | Content requires rendering | Lightpanda or Playwright markdown |
| Interactive | Click, fill, or navigate statefully | MCP browser tools |
| Authenticated | User-approved account context is required | Existing authenticated browser session |
| Screenshot | Visual layout matters | Browser screenshot after DOM snapshot |
| Structured extraction | Tables, JSON-LD, prices, metadata | API, JSON-LD, DOM selectors, then browser |
Tool availability check: before starting, verify what's available. If the best tool for the task isn't present, skip straight to the next tier rather than failing mid-workflow.
## Performance
- Prefer official APIs, sitemaps, or static fetches before launching a browser.
- Extract only required page regions; avoid dumping full DOMs, screenshots, or network logs into context.
- Reuse browser sessions for multi-step flows, but clear cookies/storage between unrelated accounts or tenants.
- Use screenshots only when visual layout or rendered state matters.
## Best Practices
- Use stable selectors and semantic roles before brittle CSS paths.
- Record URL and access date for facts likely to change.
- Clear cookies/storage between unrelated accounts or tenants.
- Do not automate destructive account actions unless the user names the exact action and target.
## Workflow

### Step 1: Assess the task
Before touching any tool, answer these in order - each answer narrows the tool choice:
- Read or interact? Read-only -> skip to Step 2. Interactive -> go to Step 3/4.
- Static or dynamic? View page source or check URL patterns - if the content is in the HTML, it's static. SPA frameworks (React, Vue, Angular) need JS rendering.
- Single page or multi-step? Multi-step flows need session persistence (MCP or serve mode).
- What output format? Markdown for human reading, structured data / JSON-LD for extraction, semantic tree for element discovery, links for crawl planning.
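For the static-or-dynamic question, a cheap probe is to check what curl sees before rendering (a sketch; the patterns are common SPA markers, not an exhaustive list):

```bash
# An empty root div or a framework marker suggests the content needs JS rendering
html=$(curl -sL "$url")
if printf '%s' "$html" | grep -qiE '<div id="(root|app)">[[:space:]]*</div>|data-reactroot|ng-version'; then
  echo "likely SPA - plan for JS rendering"
else
  echo "content may be static - try WebFetch/curl first"
fi
```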
### Step 2: Try the cheapest path first
Static content (docs, articles, blogs):
```bash
# Option A: built-in fetch (lowest overhead, no setup)
# Use WebFetch tool with the URL directly

# Option B: Lightpanda CLI (better stripping, selector waits)
lightpanda fetch --dump markdown --strip-mode full <url>

# Option C: curl (always available, raw HTML only)
curl -sL <url>
```
JS-rendered content (SPAs, dashboards):
```bash
# Lightpanda CLI with wait
lightpanda fetch --dump markdown --strip-mode full --wait-until networkidle <url>

# With selector wait for specific content
lightpanda fetch --dump markdown --wait-selector ".main-content" <url>
```
Interactive tasks (forms, clicks, multi-step): use MCP tools or agent-browser CLI (Step 3).
### Step 3: Navigate and extract
With MCP browsing tools (Lightpanda MCP or Playwright MCP):
Navigate first, then extract using the cheapest format:
| Format | Tool | Tokens (typical) | Use when |
|---|---|---|---|
| Semantic tree | `semantic_tree` / `browser_snapshot` | ~200-500 | Finding elements to interact with |
| Markdown | `markdown` | ~500-2000 | Reading text content |
| Links only | `links` | ~100-300 | Finding URLs to follow |
| Structured data | `structuredData` / `structured_data` | ~100-500 | Getting metadata (OpenGraph, JSON-LD) |
| Interactive elements | `interactiveElements` / `interactive_elements` | ~200-400 | Finding clickable/fillable elements |
| Full HTML | `page-html` resource | ~5000-50000 | Last resort only |
Following links to find data: if the initial extraction doesn't contain the target content
(e.g., the page uses images or links to a separate document), extract the page's links first
using the `links` tool or markdown output, identify the relevant link, and fetch that instead.
Don't re-fetch the whole page - follow the specific link to the actual data.
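From the CLI, one way to do this is to grep the markdown dump for links instead of re-fetching the page (a sketch; assumes the backend emits standard markdown link syntax):

```bash
# List markdown links on the page, then fetch only the one you need
lightpanda fetch --dump markdown --strip-mode full "$url" | grep -oE '\[[^]]*\]\([^)]*\)'
```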
With agent-browser CLI:
```bash
agent-browser open <url>
agent-browser snapshot -i        # interactive element refs (@e1, @e2...)
agent-browser click @e3          # click by ref
agent-browser fill @e5 "query"   # fill input by ref
```
With Lightpanda CLI (non-interactive):
```bash
lightpanda fetch --dump semantic_tree <url>                      # structure
lightpanda fetch --dump markdown --strip-mode full <url>         # content
lightpanda fetch --dump markdown --wait-until networkidle <url>  # dynamic content
```
### Step 4: Interact (when needed)
For multi-step flows (login, form submission, navigation):
- Get interactive elements first - use `interactive_elements` or `semantic_tree` to find targets without loading the full DOM
- Act on specific elements - click, fill, select using element identifiers
- Re-extract after each action - page state changes; get a fresh view
- Wait for navigation - after clicks that trigger page loads, wait before extracting
MCP interaction pattern:
1. `goto(url)`
2. `interactive_elements()` - find what to click/fill
3. `click(id)` or `fill(id, value)`
4. `semantic_tree()` - verify state changed
5. Repeat 2-4 as needed
### Step 5: Process results
Single-page reads: if you extracted more than needed, pull out the relevant section before returning it. Don't dump an entire page of markdown when the user asked about one paragraph.
For documentation sites, target the content container. Most docs use predictable selectors:
```bash
# Try common content selectors in order of specificity
lightpanda fetch --dump markdown --wait-selector "article" <url>
lightpanda fetch --dump markdown --wait-selector "main" <url>
lightpanda fetch --dump markdown --wait-selector ".content" <url>
```
If the full page was already fetched, extract the relevant section from the markdown output rather than re-fetching - search for headings or known section titles.
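A minimal sketch for pulling one section out of cached markdown (assumes `##`-level headings; "Installation" is a placeholder for the heading you actually want):

```bash
# Print one section: from its heading up to the next same-level heading
awk '/^## Installation/{p=1; print; next} /^## /{p=0} p' extracted.md
```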
Multi-step workflows:
- Cache extraction results rather than re-fetching the same page
- Summarize intermediate pages (e.g., search results) instead of returning raw content
- Discard navigation/boilerplate content before putting results in context
### Step 6: Handle failures
When a tool fails, escalate to the next tier - don't retry the same tool blindly.
| Failure | Likely cause | Action |
|---|---|---|
| Empty/broken content | JS didn't render | Escalate: WebFetch -> Lightpanda -> Playwright MCP |
| 403 / blocked | Bot detection | Try `--user-agent-suffix` (Lightpanda) or Playwright (real browser UA) |
| Timeout | Heavy page / slow network | Increase wait timeout, try `--wait-selector` on a specific element |
| Connection refused | Wrong port / service down | Verify URL, check if site requires VPN or local network |
| SSL error | Cert issue or MITM | Check cert validity, do not bypass without user confirmation |
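The escalation rule for empty content as a shell sketch ("expected text" is a placeholder for a string you know should appear on the rendered page):

```bash
# Tier 1: static fetch; escalate to a JS render only if the expected content is missing
out=$(curl -sL "$url")
if ! printf '%s' "$out" | grep -q "expected text"; then
  out=$(lightpanda fetch --dump markdown --strip-mode full --wait-until networkidle "$url")
fi
# Still empty or broken? Escalate to Playwright MCP rather than retrying the same tool
```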
After login failures: re-check the form field selectors - SPAs frequently change element IDs
between deploys. Use `interactive_elements` to get fresh selectors rather than hardcoding.
Saving fetched content: for file downloads or large extractions, write results to a local file rather than keeping everything in context:
```bash
lightpanda fetch --dump markdown --strip-mode full <url> > extracted.md
```
For binary downloads (PDFs, images) after auth, extract the session cookie from the browser
context and hand it to curl. With Playwright MCP or Lightpanda MCP, use `evaluate` to read
`document.cookie`, then:
```bash
curl -L -o report.pdf -b "session=<value>; csrf=<value>" \
  -H "Referer: https://internal.example.com/reports" <pdf-url>
```
Alternative: trigger the browser's native download via `evaluate`
(`document.querySelector('a.download').click()`) and let the headless session write to its
download directory - this avoids moving the cookie out of the browser entirely. It is also the
only working path for `blob:` URLs and `data:` URIs - they are in-memory browser references
with no fetchable origin, so curl cannot resolve them; let the page itself resolve the blob via
a click, or read it with `evaluate` and `FileReader.readAsDataURL` to extract the bytes.
## Token Efficiency

### Progressive disclosure
Start with the cheapest representation. Escalate only when insufficient.
- Level 0: URL only (0 tokens) - sometimes the URL itself answers the question
- Level 1: Structured data (~100-300) - metadata, navigation links
- Level 2: Semantic tree (~200-500) - page structure, interactive elements
- Level 3: Markdown (~500-2000) - readable content
- Level 4: Full HTML (~5000-50000) - complex parsing, last resort
### Strip unnecessary content

With Lightpanda CLI, always use `--strip-mode`:

- `js` - remove script tags
- `css` - remove stylesheets
- `ui` - remove images, video, SVG
- `full` - all of the above (default for content extraction)
### Scope extraction
Don't dump the whole page when you need one section:
lightpanda fetch --dump markdown --wait-selector "#pricing-table" <url>
With MCP semantic tree, limit depth:
```
Tool: semantic_tree
Args: { maxDepth: 3 }  - top 3 levels only
```
### Extract structured data from pages
When you need specific data (prices, tables, metadata) rather than full page content:
- Try `structured_data` / `structuredData` first - many sites embed JSON-LD or OpenGraph
- If no structured data exists, use `evaluate` / `eval` to run JavaScript extraction:
```
Tool: evaluate
Args: { expression: "JSON.stringify([...document.querySelectorAll('.product')].map(p => ({name: p.querySelector('h2')?.textContent, price: p.querySelector('.price')?.textContent})))" }
```
- Parse the JSON result rather than scraping markdown with regex
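When the page is server-rendered, JSON-LD can often be pulled without a browser at all (a rough probe; sed prints from the opening script tag through the closing one and may over-capture on minified single-line HTML):

```bash
# Grab embedded JSON-LD blocks straight from the static HTML
curl -sL "$url" | sed -n '/application\/ld+json/,/<\/script>/p'
```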
### Batch multi-page work
```bash
for url in "$url1" "$url2" "$url3"; do
  lightpanda fetch --dump markdown --strip-mode full "$url"
  printf '\n---\n'
  sleep 1  # rate-limit: don't hammer the same domain
done > output.md
```
## SPA and Dynamic Content
Data extraction priority: try structured data (JSON-LD, `structuredData`) before markdown parsing, and markdown before full HTML.
- Always wait: use `--wait-until networkidle` or `--wait-selector` with Lightpanda, `waitForSelector` / `browser_wait_for` with MCP tools
- Client-side routing: if a link changes the URL without a full page load, re-extract content after each route change
- Lazy loading / infinite scroll: scroll to trigger content loading before extracting. For infinite scroll, use a loop: scroll, wait for new content, extract, repeat until you have enough data or no new content appears. Cap iterations to avoid endless scrolling (see the loop sketch after this list)
- Cookie consent / popups: dismiss overlays before extracting content - use `interactive_elements` to find the dismiss button, then `click`. If the overlay blocks extraction, clicking through it costs fewer tokens than retrying with different formats
- Pagination: for paginated results, extract each page sequentially using the "Next" link or pagination controls. Don't try to load all pages at once - extract, process, advance
- Verify content loaded: after waiting, check that the extracted content is non-empty and contains expected elements before processing. An empty markdown or a semantic tree with only `<html><body>` means the page didn't render - escalate to a heavier backend
- Lightpanda gaps: partial Web API coverage means some complex SPAs won't render correctly. Fall back to Playwright MCP if extraction returns empty or broken content
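Capped scroll-loop sketch (same tool-call notation as the MCP interaction pattern above; exact tool names vary by backend, so treat these as placeholders):

1. `evaluate("window.scrollTo(0, document.body.scrollHeight)")`
2. Wait for `networkidle` or a selector matching newly loaded items
3. `markdown()` or `semantic_tree()` - compare output size against the previous pass
4. Repeat 1-3 until the output stops growing or a fixed cap (e.g., 10 iterations) is reached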
## Authentication Flows
- Navigate to the login page
- Use `interactive_elements` to find form fields
- Fill credentials from env vars or user prompt - never hardcode
- Submit the form
- Wait for redirect to complete (watch for multi-step redirects in OAuth/SSO flows - the URL may bounce through several domains before landing)
- Verify login succeeded: extract page content and check for user-specific elements (profile name, dashboard content) before proceeding
- Continue browsing the authenticated session
OAuth/SSO redirects: some login flows redirect through identity providers (Google, Okta, Auth0). Follow each redirect, fill credentials at the IdP page, and wait for the final redirect back to the target site. Don't assume login completes on the first page.
MFA prompts: if a TOTP/MFA prompt appears after credentials, you cannot proceed automatically. Inform the user that MFA is required and ask them to complete it manually, or request the TOTP code from the user/env var to fill in.
Session expiration: if extraction suddenly returns login pages or 401s mid-flow, the session has expired. Re-authenticate before continuing. For long-running scrapes, check session validity periodically by verifying a known authenticated-only element is still visible.
Session persistence by backend:
- Lightpanda MCP / Playwright MCP: session persists within the MCP connection
- Lightpanda CLI fetch: no persistence between calls (use `serve` mode for multi-step auth)
- agent-browser: session-based with the `--session` flag
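For multi-step auth from the CLI, serve mode keeps one browser process alive (a sketch; the host/port flags are assumed - check `lightpanda serve --help` on your build):

```bash
# Start a persistent browser; clients connecting to it share session state
lightpanda serve --host 127.0.0.1 --port 9222 &
```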
For session isolation, CSRF-sensitive actions, and multi-tenant account handling, read
references/authenticated-browsing.md.
## Missing Tools
If no browsing tools are detected, recommend the user set up Lightpanda MCP - it's the fastest path to full browsing capability with minimal overhead.
Lightpanda MCP setup (one-time, ~30 seconds):
```bash
# Install the binary (see references/tool-setup.md for other architectures)
curl -L -o lightpanda https://github.com/lightpanda-io/browser/releases/download/0.2.9/lightpanda-x86_64-linux
chmod +x lightpanda && mv lightpanda ~/.local/bin/
```
Add the MCP server to your Claude Code settings (`~/.claude/settings.json` or project
`.mcp.json`) - merge with existing config, don't overwrite:
```json
{
  "mcpServers": {
    "lightpanda": {
      "command": "lightpanda",
      "args": ["mcp"],
      "env": { "LIGHTPANDA_DISABLE_TELEMETRY": "true" }
    }
  }
}
```
Restart the session after adding the MCP config. The Lightpanda tools (`goto`, `markdown`,
`semantic_tree`, etc.) will appear in the available tool list.
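If the Claude Code CLI is available, you can confirm the server registered:

```bash
claude mcp list
```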
Read references/tool-setup.md for other platforms, architectures, and alternative backends.
## Reference Files
Read references/tool-setup.md when you need installation commands for a specific platform,
MCP tool parameter details (full tool tables with token costs), engine-specific CLI flags,
or known limitations of a backend. The main SKILL.md covers workflow and strategy; the
reference file covers tool-specific depth.
Read references/extraction-patterns.md for static, JavaScript-rendered, screenshot, table,
pagination, and attribution patterns.
Read references/authenticated-browsing.md before using saved sessions, cookies, or
account-specific pages.
## Output Contract
See skills/_shared/output-contract.md for the full contract.
- Skill name: BROWSE
- Deliverable bucket: audits
- Mode: conditional. When invoked to analyze, review, audit, or improve existing repo content, emit the full contract (boxed inline header, body summary inline plus per-finding detail in the deliverable file, boxed conclusion, conclusion table) and write the deliverable to `docs/local/audits/browse/<YYYY-MM-DD>-<slug>.md`. When invoked to answer a question, teach a concept, build a new artifact, or generate content, respond freely without the contract.
- Severity scale: `P0 | P1 | P2 | P3 | info` (see shared contract; only used in audit/review mode).
## Related Skills
- testing - E2E test automation with Playwright. This skill handles ad-hoc browsing and data extraction; testing handles structured test suites and assertions.
- mcp - MCP server development. This skill uses MCP browsing tools; mcp helps build them.
- networking - Network infrastructure. This skill browses over the network; networking configures it.
- ai-ml - RAG pipelines and web data collection. When scraping content specifically for embeddings or training data, ai-ml covers the pipeline; this skill covers the extraction.
## AI Self-Check
Before returning any browsing result, verify:
- Used the cheapest tool available for the task - no Playwright when WebFetch would have worked
- Did not dump full HTML into context when markdown or structured data was sufficient
- Waited for dynamic content before extracting from SPAs (`networkidle` or `--wait-selector`)
- Stripped boilerplate (nav, ads, footers) before returning content to the user
- Scoped extraction to the relevant section, not the whole page
- Did not hardcode credentials - used env vars, secret manager, or user prompt
- Cleared or isolated cookies/storage between unrelated accounts or tenants
- Used semantic roles before CSS selectors for interaction targets
- Used screenshots only when visual layout or rendered state mattered
- Recorded URL and access date for facts likely to change
- Did not automate destructive account actions unless the user named the exact action and target
- Re-extracted page state after any click or form submission before making decisions
- Escalated to the next tool tier on failure rather than retrying the same tool
- Current source checked: dated versions, CLI flags, API names, and support windows are verified against primary docs before repeating them
- Hidden state identified: local config, credentials, caches, contexts, branches, cluster targets, or previous runs are made explicit before acting
- Verification is real: final checks exercise the actual runtime, parser, service, or integration point instead of only linting prose or happy paths
- Robots and terms considered: scraping or automation respects access rules, auth boundaries, and rate limits
- Dynamic content verified: browser-rendered pages are checked with the real tool when static HTML may be incomplete
## Rules
- Cheapest tool first. Always try the lowest-token option before escalating. WebFetch before Lightpanda, Lightpanda before Playwright, markdown before full HTML.
- Never dump full HTML into context unless no other format works. Full HTML is 10-100x more expensive than markdown or semantic tree for the same information.
- Strip before extracting. Use `--strip-mode full` with Lightpanda CLI. Prefer semantic tree or markdown over raw HTML with MCP tools.
- Wait for dynamic content. Don't extract from a half-loaded SPA. Use `networkidle`, selector waits, or script waits.
- No hardcoded credentials. Auth flows must use environment variables, secret managers, or user prompts.
- Re-extract after interaction. Page state changes after clicks and form submissions. Always get a fresh view before making decisions based on page content.
- Semantic selectors first. Use roles, accessible names, labels, and stable text before brittle CSS selectors.
- Screenshots are for visuals. Take screenshots only when layout, rendered state, or visual evidence matters.
- Respect robots.txt and rate limits. Use `--obey-robots` with Lightpanda when scraping. Add a 1-2 second delay between requests when batch-fetching multiple pages from the same domain. Don't hammer sites with rapid sequential requests.
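A quick manual robots check before batch work (a sketch; real compliance means parsing the rules for your user agent, not just skimming the file):

```bash
# Inspect crawl rules for the target domain before looping over its pages
curl -sL "https://<domain>/robots.txt" | head -n 40
```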