Nimble Web Expert

Web extraction, search, and URL discovery using the Nimble CLI. Returns clean structured data from any website.

User request: $ARGUMENTS

Core principles

Route by intent first. Named site/domain → check for pre-built agent first (announce it out loud). Direct URL → nimble extract. Research/topic → nimble search. Discover/crawl URLs → nimble map or nimble crawl.
One command → present results → done. Run once with --transform "data.parsing" for agents. Show the data immediately as a table. Do NOT experiment, loop, or write Python to parse output.
Multiple inputs → always parallel. 2+ URLs/keywords/ASINs → &+wait. 6–20 → xargs -P. 20+ → Python asyncio script. See references/batch-patterns.md.
Escalate render tiers silently. Tier 1 → 2 → 3 → … without asking. Surface a decision only when all tiers fail and investigation tools are needed.
Never answer from training data. Live prices, current news, today's listings → always fetch via Nimble. If unavailable, say so.
AskUserQuestion at every meaningful choice. Header ≤12 chars, 2–4 options, label 1–5 words, recommended option first. Never present choices as numbered prose.
Save all outputs to .nimble/. Never leave extraction results in memory only.
If bash is denied, stop immediately. Show the command as text and wait. Never retry with dangerouslyDisableSandbox.

Skill ecosystem

Skill	Best for	Key commands
nimble-web-expert (this skill)	Real-time data — fetch any URL, search, map, crawl, run existing agents	`extract`, `search`, `map`, `crawl`, `agent run`
nimble-agent-builder	Build reusable agents — create, refine, publish named extraction templates	CLI: `generate`, `get-generation`, `publish`

Hand off to nimble-agent-builder only when all of these are true: the user has signalled a recurring/scheduled need, the pattern is repetitive (same site, same fields), and they've seen and approved the results. Don't ask after every extract — only when language clearly signals a recurring workflow ("I want to do this every day", "build me a pipeline", "make this reusable").

For agent refinement: "Agent updates are handled by nimble-agent-builder — it can refine the existing agent without rebuilding from scratch."

Interactive UX

Use AskUserQuestion at every meaningful choice — never guess, never ask in prose.
Ambiguous request (no URL, vague topic): ask before running — "What would you like to do?" → Search / Fetch URL / Discover URLs / Call API
Before running a search (if task maps to a specific focus mode): offer focus mode — General / News / Coding / Shopping / Academic / Social
After all tiers fail: check investigation tools (which browser-use, python3 -c "from playwright.sync_api...") and ask whether to investigate with browser-use, Playwright, or skip.
After presenting results, always close with: "Were these results what you needed?" → Looks great! / Mostly good / Not quite / Skip feedback

Prerequisites

Quick check:

nimble --version && echo "${NIMBLE_API_KEY:+API key: set}"

If CLI version and API key: set both print → proceed to Step 0.

If anything is missing, load rules/setup.md for one-time setup instructions (CLI install, API key, Docs MCP).

If bash is denied: Stop. Show the command as text. Do not substitute WebFetch for Nimble tasks.

Analyze & Route

User signal	Command	Notes
Names a specific site or domain	`nimble agent` → `nimble extract` if no agent	Always check for agent first — announce it
Provides a direct URL	`nimble extract`	Skip agent check
Research, topic, or vertical query	`nimble search`	Use focus modes for news, jobs, shopping, etc.
"Find URLs / sitemap / all pages"	`nimble map`	Returns URL list + metadata
"Crawl / archive a whole section"	`nimble crawl`	Async bulk extraction

Step 0 — Agent check (when a domain is named)

Pre-built agents return clean structured data with zero selector work. Always check first.

Always verbalize — never silently:

Announce: "Let me check if there's a pre-built Nimble agent for [site]..."
Report: "Found <agent_name> — using it now." or "No pre-built agent — falling back to extraction."

Lookup order:

~/.claude/skills/nimble-web-expert/learned/examples.json → agents[] array
references/nimble-agents/SKILL.md → baked-in table (50+ sites)
nimble agent list --limit 100 --search "<domain or vertical>" → show table, confirm with user
No match → proceed to extract/search

Run with --transform "data.parsing" — always:

nimble --transform "data.parsing" agent run --agent <name> --params '{"keyword": "..."}'

Do NOT run without --transform "data.parsing" and then parse raw output. The raw response contains html (useless), headers, and parsing (what you want). The transform flag extracts parsing in one shot.

For the full agent list (50+ sites), see references/nimble-agents/SKILL.md.

⚠️ google_search is for SEO/SERP rank analysis only — not general information retrieval. For finding information, use nimble search.

Workflow

Situation	Command	Reference
Site/domain → check agent first	`nimble agent list` → `nimble agent run`	`references/nimble-agents/SKILL.md`
Direct URL	`nimble extract`	`references/nimble-extract/SKILL.md`
Search the live web	`nimble search`	`references/nimble-search/SKILL.md`
Discover URLs on a site	`nimble map`	`references/nimble-map/SKILL.md`
Bulk crawl a section	`nimble crawl run`	`references/nimble-crawl/SKILL.md`
Batch agents (up to 1,000)	`nimble agent run-batch`	`references/nimble-agents/SKILL.md`
Batch extract (up to 1,000)	`nimble extract-batch`	`references/nimble-extract/SKILL.md`
Poll tasks / batches / results	`nimble tasks` / `nimble batches`	`references/nimble-tasks/SKILL.md`
Unknown selectors or XHR path	browser-use or Playwright investigation	`references/nimble-extract/browser-investigation.md`
Proven site patterns	copy a recipe	`references/recipes.md`
2+ inputs	parallel bash `&`+`wait` or generated script	`references/batch-patterns.md`

For the full extract waterfall (tiers, flags, browser actions, network capture), see references/nimble-extract/SKILL.md.

Response shapes

Command	Output
`nimble agent`	Structured data in `data.parsing` — array (SERP/list) or dict (PDP/product)
`nimble extract`	HTML, Markdown, or parsed JSON — depends on `--format` and `--parse`
`nimble search`	Structured results array (title, URL, description)
`nimble map`	URL list + metadata
`nimble crawl`	Async job — poll with `nimble crawl status <job_id>`

Agent runs always need --transform "data.parsing". If the agent name suggests a list (serp, search, plp), expect an array. If it suggests a single item (pdp, product, profile), expect a dict.

Output & Organization

mkdir -p .nimble   # save all outputs here

Naming: .nimble/<site>-<task>.md (e.g. .nimble/amazon-airpods.md, .nimble/yelp-sf-italian.json)

Working with saved files:

wc -l .nimble/page.md && head -100 .nimble/page.md
grep -n "price\|rating" .nimble/page.md | head -30

End every response with: Source: [URL] — fetched live via Nimble CLI

Self-Improvement

The skill maintains ~/.claude/skills/nimble-web-expert/learned/examples.json.

At task start: read the file, scan good[] for url_pattern matches → use documented command/tier as starting point. Scan bad[] → avoid documented pitfalls.
After presenting results: ask "Were these results what you needed?" → on positive feedback, append to good[] with url_pattern, task, command, tier, notes. On negative feedback, ask "What went wrong?" and append to bad[] with url_pattern, task, issue, avoid, better.
Keep entries concise — 5–10 per site. Only write on real feedback, never speculatively.

Guardrails

NEVER answer from training data for live prices, current news, or real-time data. If Nimble is unavailable, say so.
NEVER skip Step 0 silently. Even if certain there's no agent, announce the check before running extract/search/map.
NEVER retry the same render tier. If a tier returns empty or blocked, escalate — do not re-run.
NEVER substitute WebFetch for nimble CLI tasks. WebFetch is a fallback for fetching Nimble docs only.
NEVER load reference files speculatively. Only read a reference when the current task explicitly needs it.
Task agents MUST use run_in_background=False. See nimble-agent-builder delegation model for the why.
Hard retry limit. On error (not empty content): retry at most 2 times with different flags. After 2 errors, report and stop.
Hard 429 rule. On rate-limit error: stop immediately. Do not retry or switch tiers.

Reference files

Load only when needed:

File	Load when
`references/recipes.md`	Need a proven command for a common site (Amazon, Yelp, LinkedIn…)
`references/nimble-agents/SKILL.md`	Step 0 lookup — full agent table (50+ sites)
`references/nimble-extract/SKILL.md`	Extract flags, render tiers, browser actions, network capture, parser schemas
`references/nimble-search/SKILL.md`	Search flags, all 8 focus modes
`references/nimble-map/SKILL.md`	Map flags, response structure
`references/nimble-crawl/SKILL.md`	Full async crawl workflow
`references/nimble-tasks/SKILL.md`	Poll tasks/batches, fetch results — for async, batch, and crawl operations
`references/nimble-extract/browser-investigation.md`	Tier 6 — CSS selector/XHR discovery with browser-use or Playwright
`references/nimble-extract/parsing-schema.md`	Parser types, selectors, extractors, post-processors
`references/nimble-extract/browser-actions.md`	Full browser action types and parameters
`references/nimble-extract/network-capture.md`	Filter syntax, XHR mode, capture+parse patterns
`references/nimble-search/search-focus-modes.md`	Decision tree, mode details, combination strategies
`references/batch-patterns.md`	Parallel bash patterns for 2–5, 6–20, and 20+ inputs
`references/error-handling.md`	Error codes, known site issues, troubleshooting

nimble-web-expert

Nimble Web Expert

Core principles

Skill ecosystem

Interactive UX

Prerequisites

Analyze & Route

Step 0 — Agent check (when a domain is named)

Workflow

Response shapes

Output & Organization

Self-Improvement

Guardrails

Reference files

More from nimbleway/agent-skills

nimble-web-tools

nimble-agents

nimble-web-search

nimble-agent-builder

company-deep-dive

competitor-intel