web-search-pro
Web Search Pro 2.1
This skill is for agents that need more than one-shot web search.
Use it when the caller needs:
- live web search or current-events lookup
- news search with explainable routing
- official docs, API docs, or code lookup
- company, product, or competitor research
- site crawl, site map, or docs discovery
- a structured evidence pack that can be handed back to an upstream model
This skill is not a narrative report writer. Its job is to search, retrieve, structure, and expose evidence clearly enough that the upstream model can keep reasoning on top of it.
Use This Skill When
- the task starts with web search but may continue into extraction or research
- the agent needs to know why a provider was selected
- the agent may need federated search instead of a single provider
- no-key baseline behavior matters for the first run
- runtime diagnostics or capability discovery are part of the workflow
Do Not Use This Skill When
- the caller only wants the lightest possible single-shot web search wrapper
- the task expects a hosted scraping service
- the task expects the skill itself to write the final polished narrative report
- the caller needs an unlimited no-key search guarantee
Quick Start
The shortest successful path is:
- start with the no-key baseline
- add one premium provider only when stronger recall or freshness is needed
- then try docs, news, and research flows
Option A: No-key baseline
No API key is required for the first successful run.
Baseline roles:
- ddg: best-effort web search
- fetch: no-key extract / crawl / map fallback
node {baseDir}/scripts/doctor.mjs --json
node {baseDir}/scripts/bootstrap.mjs --json
node {baseDir}/scripts/search.mjs "OpenAI Responses API docs" --json
What these commands are for:
- doctor.mjs: is the runtime usable right now?
- bootstrap.mjs: what can the agent rely on right now?
- search.mjs: prove the baseline retrieval path succeeds before adding provider credentials
Option B: Add one premium provider
If only one premium provider is added, start with TAVILY_API_KEY.
Reason:
- one credential improves general web search
- one credential improves news search
- one credential improves extract quality
export TAVILY_API_KEY=tvly-xxxxx
node {baseDir}/scripts/doctor.mjs --json
node {baseDir}/scripts/search.mjs "latest OpenAI news" --type news --json
First successful searches
node {baseDir}/scripts/search.mjs "OpenClaw web search" --json
node {baseDir}/scripts/search.mjs "OpenAI Responses API docs" --preset docs --plan --json
node {baseDir}/scripts/extract.mjs "https://platform.openai.com/docs" --json
Then try docs, news, and research
node {baseDir}/scripts/search.mjs "OpenAI Responses API docs" --preset docs --json
node {baseDir}/scripts/search.mjs "latest OpenAI news" --type news --json
node {baseDir}/scripts/research.mjs "OpenClaw search skill landscape" --plan --json
Runtime Contract
The agent should treat these fields as the primary runtime contract.
Routing fields
- selectedProvider: The planner's primary route. It does not mean "the only provider used".
- routingSummary: Compact route explanation with selectionMode, confidence, topSignals, alternatives, blocked providers, and federation summary.
- routing.diagnostics: Full route diagnostics exposed by --explain-routing or --plan.
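An agent that needs to report why a provider was selected could read the routing fields along these lines. This is a minimal sketch, not the skill's own code: only the field names come from the contract above, while the assumption that routingSummary is an object with topSignals as an array, the helper name describeRoute, and the sample values are all hypothetical.

```javascript
// Sketch: summarize the routing decision from parsed --json output.
// Assumption: routingSummary is an object and topSignals is an array.
function describeRoute(result) {
  const summary = result.routingSummary ?? {};
  const signals = Array.isArray(summary.topSignals) ? summary.topSignals : [];
  return {
    // The primary route only; other providers may also have contributed.
    primary: result.selectedProvider ?? null,
    mode: summary.selectionMode ?? "unknown",
    confidence: summary.confidence ?? null,
    signals,
    // Full diagnostics are only exposed with --explain-routing or --plan.
    hasDiagnostics: result.routing?.diagnostics !== undefined,
  };
}

// Hypothetical payload, for illustration only.
const routeSample = {
  selectedProvider: "ddg",
  routingSummary: { selectionMode: "auto", confidence: 0.8, topSignals: ["news"] },
};
console.log(describeRoute(routeSample));
```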
Federation fields
- federated.providersUsed: Providers that actually returned results when fanout is active.
- federated.value.additionalProvidersUsed: Number of non-primary providers that really contributed.
- federated.value.resultsRecoveredByFanout: Final results that would disappear in primary-only mode.
- federated.value.resultsCorroboratedByFanout: Final results supported by both the primary and at least one fanout provider.
- federated.value.duplicateSavings: Exact or near-duplicate results removed by merge.
Cache and execution fields
- cached: Whether the result came from cache.
- cache: Cache age / TTL telemetry for agent decisions.
- renderLane: Runtime availability and policy summary for the browser-backed render lane.
- failed: Failed providers or failed retrieval units for the current command.
- meta: Command-level execution metadata and task input shaping.
Research fields
- topicType: Primary topic class for the research pack.
- topicSignals: Mixed-topic hints such as docs + latest.
- researchAxes: Why the research pack decomposed into a given set of subquestions.
- claimClusters: Evidence grouped by normalized claim.
- candidateFindings: Candidate conclusions with support profile and gap sensitivity.
- uncertainties: Remaining uncertainty and follow-up-sensitive gaps.
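Before reasoning over a research pack, an upstream model's wrapper may want to confirm that the expected fields are present. The sketch below checks only for the six top-level names listed above; it makes no assumption about their inner shape, and the helper name missingResearchFields is hypothetical.

```javascript
// Field names taken from the research fields listed above.
const RESEARCH_FIELDS = [
  "topicType",
  "topicSignals",
  "researchAxes",
  "claimClusters",
  "candidateFindings",
  "uncertainties",
];

// Returns the names of research fields that are absent from the pack.
function missingResearchFields(pack) {
  return RESEARCH_FIELDS.filter(
    (name) => pack == null || pack[name] === undefined
  );
}

// Hypothetical partial pack, for illustration only.
console.log(missingResearchFields({ topicType: "docs", uncertainties: [] }));
```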
Why Federated Search Matters
Federation is not just "more providers". It makes multi-provider gain visible so an agent can tell whether fanout improved the final result set.
Important gain metrics:
- federated.value.additionalProvidersUsed
- federated.value.resultsRecoveredByFanout
- federated.value.resultsCorroboratedByFanout
- federated.value.duplicateSavings
- routingSummary.federation.value
Interpretation:
- recovered results answer "what did fanout rescue?"
- corroborated results answer "what got stronger support?"
- duplicate savings answer "what noise did merge remove?"
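The interpretation above can be turned into a small check that tells an agent whether fanout changed the final result set. This is a sketch under assumptions: it treats each metric as either a count or an array of results (the contract does not say which), and both the helper name fanoutGain and the "helped" rule are illustrative rather than part of the skill.

```javascript
// Sketch: condense the federation gain metrics into one summary object.
// Assumption: each metric is a number or an array; anything else counts as 0.
function fanoutGain(result) {
  const value = result.federated?.value ?? {};
  const count = (metric) => {
    if (Array.isArray(metric)) return metric.length;
    return Number.isFinite(metric) ? metric : 0;
  };
  const gain = {
    additionalProviders: count(value.additionalProvidersUsed),
    recovered: count(value.resultsRecoveredByFanout), // what fanout rescued
    corroborated: count(value.resultsCorroboratedByFanout), // stronger support
    duplicatesRemoved: count(value.duplicateSavings), // noise removed by merge
  };
  // Illustrative rule: fanout helped if it rescued or corroborated a result.
  gain.helped = gain.recovered > 0 || gain.corroborated > 0;
  return gain;
}

// Hypothetical payload, for illustration only.
console.log(
  fanoutGain({
    federated: { value: { additionalProvidersUsed: 1, resultsRecoveredByFanout: 2 } },
  })
);
```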
Commands By Task
Search
node {baseDir}/scripts/search.mjs "query" --json
node {baseDir}/scripts/search.mjs "query" --plan --json
node {baseDir}/scripts/search.mjs "latest OpenAI news" --type news --json
node {baseDir}/scripts/search.mjs "OpenAI Responses API docs" --preset docs --plan --json
node {baseDir}/scripts/search.mjs "query" --engine serpapi --search-engine baidu --json
User-facing inputs:
- searchType: Current shipped values are web | news.
- intentPreset: Current shipped values are general | code | company | docs | research.
Important boundary:
- searchType and intentPreset shape routing input
- engine remains the explicit provider override
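A caller that assembles search commands programmatically might validate these inputs before spawning the script. The sketch below infers from the commands in this section that searchType maps to --type and intentPreset maps to --preset; that mapping, the helper name buildSearchArgs, and the option names are assumptions, not a documented API.

```javascript
// Shipped values as listed above.
const SEARCH_TYPES = ["web", "news"];
const INTENT_PRESETS = ["general", "code", "company", "docs", "research"];

// Sketch: build the argument list for `node <args...>`.
// Assumption: searchType -> --type and intentPreset -> --preset.
function buildSearchArgs(baseDir, query, options = {}) {
  const { searchType, intentPreset, engine, plan } = options;
  if (searchType && !SEARCH_TYPES.includes(searchType)) {
    throw new Error(`unsupported searchType: ${searchType}`);
  }
  if (intentPreset && !INTENT_PRESETS.includes(intentPreset)) {
    throw new Error(`unsupported intentPreset: ${intentPreset}`);
  }
  const args = [`${baseDir}/scripts/search.mjs`, query];
  if (searchType) args.push("--type", searchType);
  if (intentPreset) args.push("--preset", intentPreset);
  // engine is the explicit provider override, so it is passed through as-is.
  if (engine) args.push("--engine", engine);
  if (plan) args.push("--plan");
  args.push("--json");
  return args;
}

console.log(buildSearchArgs("/path/to/skill", "latest OpenAI news", { searchType: "news" }));
```

The returned array can be handed to execFile from node:child_process with "node" as the command, which avoids shell quoting problems in the query string.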
Extract and render
node {baseDir}/scripts/extract.mjs "https://example.com/article" --json
node {baseDir}/scripts/extract.mjs "https://example.com/article" --render --render-policy fallback --json
node {baseDir}/scripts/extract.mjs "https://example.com/article" --plan
node {baseDir}/scripts/render.mjs "https://example.com/article" --json
Crawl and map
node {baseDir}/scripts/crawl.mjs "https://example.com/docs" --depth 2 --max-pages 10 --json
node {baseDir}/scripts/map.mjs "https://example.com/docs" --depth 2 --max-pages 50 --json
Research
node {baseDir}/scripts/research.mjs "OpenClaw search skill landscape" --json
node {baseDir}/scripts/research.mjs "OpenClaw search skill landscape" --plan --json
Runtime inspection
node {baseDir}/scripts/capabilities.mjs --json
node {baseDir}/scripts/doctor.mjs --json
node {baseDir}/scripts/bootstrap.mjs --json
node {baseDir}/scripts/review.mjs --json
node {baseDir}/scripts/cache.mjs stats --json
node {baseDir}/scripts/health.mjs --json
Benchmarking
node {baseDir}/scripts/eval.mjs list --json
node {baseDir}/scripts/eval.mjs run --suite core --json
node {baseDir}/scripts/eval.mjs run --suite research --json
node {baseDir}/scripts/eval.mjs run --suite head-to-head --json
node {baseDir}/scripts/eval.mjs run --suite head-to-head-live --json
Research Pack Boundary
research.mjs is a model-facing evidence layer, not a final narrative answer layer.
The skill is responsible for:
- question decomposition
- retrieval planning
- evidence normalization
- source prioritization
- claim clustering
- compact candidate findings
- uncertainty exposure
The upstream model remains responsible for:
- final reasoning across the evidence pack
- narrative synthesis
- user-facing writing style
- final judgment when evidence is incomplete or conflicting
Runtime Config
Default config path:
{baseDir}/config.json
Override path:
WEB_SEARCH_PRO_CONFIG=/path/to/config.json
Config precedence:
CLI flags > process.env > config.json > built-in defaults
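The precedence order can be illustrated with a small resolver. This is a sketch, not the skill's loader: it assumes all four layers have already been normalized to the same flat key names, and the function name resolveSetting and the sample values are hypothetical.

```javascript
// Sketch: resolve one setting using
// CLI flags > process.env > config.json > built-in defaults.
// Assumption: every layer is a plain object keyed by the same setting names.
function resolveSetting(key, layers = {}) {
  const { cli = {}, env = {}, file = {}, defaults = {} } = layers;
  const ordered = [
    ["cli", cli],
    ["env", env],
    ["config.json", file],
    ["defaults", defaults],
  ];
  for (const [source, values] of ordered) {
    // The first layer that defines the key wins, even if the value is false.
    if (values[key] !== undefined) return { value: values[key], source };
  }
  return { value: undefined, source: null };
}

// Hypothetical layers: config.json enables render, a CLI flag disables it.
console.log(
  resolveSetting("render.enabled", {
    cli: { "render.enabled": false },
    file: { "render.enabled": true },
    defaults: { "render.enabled": false },
  })
);
```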
When the skill is run directly outside OpenClaw, provider keys must already exist in the shell environment. If OpenClaw launches the skill, its injected runtime environment is sufficient.
Key config areas:
- routing
- cache
- health
- fetch
- crawl
- render
Important routing config fields:
- routing.allowNoKeyBaseline
- routing.enableFederation
- routing.federationTriggers
- routing.maxFanoutProviders
- routing.maxPerProvider
- routing.mergePolicy
- routing.fallbackPolicy
Important render config fields:
- render.enabled
- render.policy
- render.budgetMs
- render.waitUntil
- render.blockTypes
- render.sameOriginOnly
Provider Upgrade Paths
No API key is required for the baseline. Optional provider credentials or endpoints unlock stronger coverage:
TAVILY_API_KEY=tvly-xxxxx
EXA_API_KEY=exa-xxxxx
QUERIT_API_KEY=xxxxx
SERPER_API_KEY=xxxxx
BRAVE_API_KEY=xxxxx
SERPAPI_API_KEY=xxxxx
YOU_API_KEY=xxxxx
SEARXNG_INSTANCE_URL=https://searx.example.com
# Perplexity / Sonar: choose one transport path
PERPLEXITY_API_KEY=xxxxx
OPENROUTER_API_KEY=xxxxx
OPENROUTER_BASE_URL=https://openrouter.ai/api/v1
KILOCODE_API_KEY=xxxxx
# Or use a custom OpenAI-compatible gateway
PERPLEXITY_GATEWAY_API_KEY=xxxxx
PERPLEXITY_BASE_URL=https://gateway.example.com/v1
PERPLEXITY_MODEL=perplexity/sonar-pro
Provider roles:
- Tavily: strongest premium default for general search, news, and extract
- Exa: semantic retrieval and extract fallback
- Querit: multilingual AI search with native geo and language filters
- Serper: Google-like search with strong news and locale coverage
- Brave: structured general web search
- SerpAPI: multi-engine routing including Baidu and Yandex
- You.com: LLM-ready web search with freshness and mixed web/news coverage
- SearXNG: self-hosted privacy-first metasearch fallback
- Perplexity / Sonar: answer-first grounded search via native or gateway transport
- DDG: best-effort no-key baseline search
- Fetch: no-key extract / crawl / map baseline
- Render: optional local browser lane
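To see which upgrade paths an environment has unlocked, a wrapper could map the variables above to providers. The sketch below is an approximation: the provider ids other than serpapi are hypothetical labels, and treating any one of the Perplexity / Sonar keys as sufficient is an assumption. In practice capabilities.mjs is the authoritative source.

```javascript
// Each provider lists the variables that can enable it; any one is enough.
// Assumption: ids are illustrative labels, not the skill's internal names.
const PROVIDER_ENV = {
  tavily: ["TAVILY_API_KEY"],
  exa: ["EXA_API_KEY"],
  querit: ["QUERIT_API_KEY"],
  serper: ["SERPER_API_KEY"],
  brave: ["BRAVE_API_KEY"],
  serpapi: ["SERPAPI_API_KEY"],
  you: ["YOU_API_KEY"],
  searxng: ["SEARXNG_INSTANCE_URL"],
  perplexity: [
    "PERPLEXITY_API_KEY",
    "OPENROUTER_API_KEY",
    "KILOCODE_API_KEY",
    "PERPLEXITY_GATEWAY_API_KEY",
  ],
};

// Returns provider ids whose credentials or endpoints are set and non-empty.
function configuredProviders(env) {
  const isSet = (name) => typeof env[name] === "string" && env[name].trim() !== "";
  return Object.entries(PROVIDER_ENV)
    .filter(([, names]) => names.some(isSet))
    .map(([id]) => id);
}

// Hypothetical environment, for illustration only.
console.log(configuredProviders({ TAVILY_API_KEY: "tvly-xxxxx", OPENROUTER_API_KEY: "xxxxx" }));
```

Passing process.env as the argument inspects the current shell environment.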
Validation And Review
The most useful validation surfaces for agents and maintainers are:
- capabilities.mjs: What this environment can truly do right now.
- doctor.mjs: Is the runtime ready, degraded, or blocked?
- bootstrap.mjs: What can an upstream agent safely assume?
- review.mjs: What are the current safety and compliance boundaries?
- health.mjs: Which providers are degraded or cooling down?
- eval.mjs: Has behavior regressed against core, research, or comparative suites?
The bundled head-to-head suite focuses on route-first comparisons against a local ../web-search-plus checkout.
The bundled head-to-head-live suite adds real networked comparisons for freshness and citation quality under shared provider credentials.
Review-Safe Guarantees
- metadata only declares node as the hard runtime requirement
- provider credentials are optional
- ddg is a best-effort no-key baseline, not a guaranteed high-recall provider
- Safe Fetch rejects non-HTTP(S), credential-bearing, local, private, and metadata targets
- redirect targets are revalidated
- Safe Fetch keeps JavaScript execution disabled
- browser render is optional and off by default
- browser render uses a local headless browser only when enabled
- browser render reports anti-bot or challenge interstitials as failures instead of silent success
- provider health distinguishes degraded from cooldown
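The Safe Fetch target rules can be illustrated with a literal-host check. This is a simplified sketch and not the skill's implementation: the function name isSafeFetchTarget is hypothetical, it inspects only the URL as written (no DNS resolution), and its IPv6 handling covers only loopback, unspecified, unique-local, and link-local literals.

```javascript
// Sketch: reject non-HTTP(S), credential-bearing, local, private, and
// metadata targets based on the URL text alone.
function isSafeFetchTarget(rawUrl) {
  let url;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") return false;
  if (url.username !== "" || url.password !== "") return false;

  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return false;

  if (host.startsWith("[")) {
    // IPv6 literal: the URL parser keeps the brackets and compresses zeros.
    const v6 = host.slice(1, -1);
    if (v6 === "::1" || v6 === "::") return false;
    if (/^f[cd][0-9a-f]{2}:/.test(v6)) return false; // fc00::/7 unique-local
    if (/^fe[89ab][0-9a-f]:/.test(v6)) return false; // fe80::/10 link-local
  }

  const v4 = host.match(/^(\d+)\.(\d+)\.(\d+)\.(\d+)$/);
  if (v4) {
    const a = Number(v4[1]);
    const b = Number(v4[2]);
    if (a === 0 || a === 10 || a === 127) return false;
    if (a === 169 && b === 254) return false; // link-local, incl. 169.254.169.254
    if (a === 172 && b >= 16 && b <= 31) return false;
    if (a === 192 && b === 168) return false;
  }
  return true;
}

console.log(isSafeFetchTarget("https://example.com/article"));
console.log(isSafeFetchTarget("http://169.254.169.254/latest/meta-data/"));
```

Since redirect targets are revalidated, a fetcher built on this idea would run the same check on every redirect hop rather than only on the initial URL.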