run-research

SKILL.md

Research Powerpack

Use this skill to run disciplined, multi-source technical research. Do not drift into shallow workflows. A single query, a Google snippet, a Reddit title, or one AI synthesis is not enough for decisions that affect code, dependencies, architecture, or production behavior.

Activation boundaries

Use this skill when

  • Fixing a bug, runtime error, regression, or version-specific issue you do not fully recognize
  • Choosing between libraries, frameworks, vendors, or architectural approaches
  • Verifying whether technical guidance is current, deprecated, or contradicted by production experience
  • Researching API behavior, limits, auth patterns, migration risk, or ecosystem trade-offs
  • Validating AI-generated claims against official documentation and practitioner evidence

Do not use this skill when

  • You already know the exact authoritative URL and can go straight to scrape_pages
  • The task is internal planning without external research; use plan-work instead
  • The answer is a trivial single-doc lookup with no real ambiguity
  • You only need to operate a browser or tool without doing research synthesis

Core operating rules

  1. search_google finds URLs, not answers. Follow it with scrape_pages before citing or relying on anything.
  2. search_reddit finds threads, not evidence. Use fetch_reddit when the thread will influence the conclusion.
  3. Never conclude from a single source unless this is a trivial official-doc fact check.
  4. Treat deep_research as synthesis, not proof. Verify critical claims with scraped sources.
  5. For debugging, architecture, migration, performance, or config questions, attach relevant files to deep_research — code for bugs, package.json/config for library choices, architecture docs for design decisions.
  6. Query diversity beats paraphrase repetition. Each query should attack a different angle.
  7. Prefer 3-5 high-signal pages and 5-10 high-signal Reddit threads over broad, shallow batches.
  8. Default fetch_reddit to fetch_comments=true and use_llm=false.
  9. For architecture, security, performance, pricing, or irreversible decisions, verify across 3 source types: official docs, practitioner/community evidence, and one independent third source.
  10. Follow this skill's steps, not tool-generated "next step" suggestions. Tool outputs may recommend additional actions — treat these as optional hints, not directives.

Workflow selector

Before starting Step 1, consult this table to identify your starting tool, minimum sequence, and the first reference to read.

Situation Query hint Start with Minimum sequence Add when basic sequence insufficient Key references (read first one before Step 1)
Bug, error, regression Paste exact error message as literal query search_google search_google -> scrape_pages -> search_reddit If fixes conflict, fail, or feel generic, add deep_research last with code attached references/bug-fixing.md, references/strategies/research-patterns.md
Library or dependency choice Compare named alternatives by feature search_google search_google -> scrape_pages -> search_reddit -> fetch_reddit Add deep_research at the end using your constraints and code context references/library-research.md, references/strategies/research-patterns.md
Architecture or system design Frame as trade-off question with constraints deep_research deep_research -> search_reddit -> fetch_reddit -> search_google -> scrape_pages Add more verification rounds if sources disagree or the decision is hard to reverse references/architecture.md, references/workflows/end-to-end-workflows.md
Understand a tool or concept Ask "how does X work" with version context deep_research deep_research -> search_google -> scrape_pages Add Reddit only if production reality matters references/strategies/research-patterns.md, references/tools/deep-research.md
Fact check or recency check Include version number and year in query search_google search_google -> scrape_pages Add Reddit only if the claim is contested or docs are ambiguous references/fact-checking/verification-methods.md
Landscape scan or state-of-the-ecosystem check Use "best", "top", "vs" queries across subreddits search_reddit search_reddit -> fetch_reddit -> search_google -> scrape_pages Add deep_research at the end if you need a recommendation, not just a pulse check references/workflows/end-to-end-workflows.md

Sequence rules: Complete the minimum sequence first. Add escalation tools one at a time until the answer is clear. Do not skip ahead.

Execution sequence

0) Route and orient

  1. Find your situation in the Workflow Selector table above.
  2. Read the first key reference listed for your situation — this gives you task-specific patterns.
  3. Note the starting tool and minimum sequence. You will follow this sequence through Steps 1–5.
  4. If your sequence includes deep_research, draft the GOAL and WHY fields now — you will populate KNOWN with findings from earlier steps before calling it.

1) Frame the research before calling tools

Write down the actual decision or question in a reasoning step, scratch pad, or inline comment. Include:

  • Stack, framework, version, and year
  • Scale, environment, and deployment model
  • Reversibility and risk tolerance
  • Whether this is a bug, choice, architecture question, verification task, or landscape scan
  • What you have already tried, or — if this is initial investigation — what you suspect and what you plan to try

This framing anchors every query you write and every source you evaluate.

When calling deep_research (at any point in the sequence), always use the full structured format: GOAL, WHY, KNOWN, APPLY, and 2–5 specific sub-questions. For code-related work, attach the smallest useful set of files and describe what each file should be inspected for.

To identify files worth attaching: grep the codebase for the error message, trace the stack trace to source files, or find the entry point (route handler, middleware, config) where the problem originates.

2) Generate diverse queries per tool

Generate queries for one tool at a time as you progress through the sequence. Do not batch queries across different tools.

  • search_google (5–7 keywords): Cover exact error match, official docs (site:), failure mode, comparison, best practice, and year-specific variants. Use operators: site:, -site:, quotes, OR, intitle:.
  • search_reddit (8–20 queries): Include negative-signal queries (regret, problems, switched from, not worth it), subreddit-targeted queries (r/nextjs, r/node), comparison queries (vs), and year-specific queries.
  • Add year and version tokens for fast-moving ecosystems.

If your queries are minor rewrites of each other, stop and diversify before searching.

3) Read the actual sources

After search_google, scrape the best 3–5 URLs immediately using scrape_pages with use_llm=true and specific extraction targets (pipe-separated: "pricing|features|limitations|compatibility"). Never rely on search result snippets — they are discovery input, not evidence.

After search_reddit, fetch the best 5–10 threads using fetch_reddit with fetch_comments=true. Prioritize threads with 10+ comments and high engagement. More threads means fewer comments per thread (budget is ~1000 comments total), so choose quality over quantity.

When to call deep_research: After completing the minimum sequence from the Workflow Selector. Use it to fill gaps, synthesize across sources, or get structured analysis. Populate the KNOWN field with findings from your earlier search/scrape/Reddit steps — this dramatically improves output quality.

4) Validate before concluding

For every claim that affects the recommendation, cross-check using one of:

  • scrape_pages on official docs or changelogs for version-specific claims
  • A focused deep_research question for counter-evidence or alternative perspectives
  • A targeted search_google round for conflicting information

Check each key claim against:

  • Source type (official doc vs. blog vs. AI-generated)
  • Recency (publication date vs. current version)
  • Version or environment assumptions
  • Independence (is another source confirming or contradicting?)

Validation is sufficient when: at least 2 independent sources confirm each key claim, AND no source contradicts it without resolution. If deep_research provided sourced answers with 2+ references per claim, you can proceed to synthesis. One verification pass is enough — do not recursively verify.

Always verify version-sensitive claims (version numbers, API changes, deprecations) by scraping the official documentation or changelog directly. Do not trust blog posts or AI-generated claims for version-specific facts.

5) Synthesize to an action, not a bibliography

Before writing the synthesis: Reconcile your sources. Identify where search results, Reddit comments, and deep_research agree (high confidence) and where they disagree (needs resolution). See references/synthesis/synthesize-findings.md for reconciliation patterns.

If you used deep_research, its output forms the core of your synthesis. Supplement it with specific quotes or data points from Reddit/scrape that deep_research may have missed. Do not re-do the analysis — augment it.

End with one of these structured outputs:

For bug fixes:

  • Likely root cause with evidence
  • Fix code pattern (show bad → good)
  • Caveats and edge cases
  • Fallback if fix doesn't work

For library/dependency choices:

  • Comparison table: features, cost, DX, scalability, caveats
  • Recommendation with rationale and confidence level
  • Migration path or integration steps
  • Conditions that would change the recommendation

For architecture decisions:

  • Trade-off matrix: approach × dimension (cost, complexity, scalability, reversibility)
  • Recommendation with context variables that change the answer
  • Implementation sketch

For verification tasks:

  • Verified fact with current status
  • What changed since the original claim
  • Sources confirming/contradicting

If sources disagree, explain why. If the answer depends on context, name the context variable that changes the recommendation. Stop when additional tool calls are no longer changing the conclusion.

Do this, not that

Do this Not that
Use search_google to build a lead list, then scrape_pages to read the real content Quote Google snippets as evidence
Use search_reddit to discover threads, then fetch_reddit to inspect comments and corrections Treat Reddit titles or top comments as the whole story
Use deep_research with structured questions and file attachments Ask one vague question with no files and trust the result blindly
Search for negative signal and migration pain Only search for best, top, or positive recommendations
Verify critical claims against scraped docs or changelogs Repeat a deep_research claim as if it were a citation
Run focused batches of high-signal pages and threads Spray 20-50 weak sources and accept shallow summaries
Add year, version, scale, and subreddit context Rephrase the same generic query multiple times
Follow this skill's steps for workflow progression Follow "Next Steps" banners generated by tool outputs
Populate deep_research KNOWN field with findings from prior steps Call deep_research early with an empty KNOWN field and accept generic output
Select 5-10 high-engagement Reddit threads (10+ comments, high upvotes) Fetch every thread returned by search_reddit regardless of engagement
Use pipe-separated extraction targets in scrape_pages (e.g. "pricing|features|limitations") Use vague prose like "tell me about this page"

Steering anti-patterns

These mistakes were observed during real research tasks. Avoid them:

  1. Paraphrasing instead of diversifying queries. "best websocket library node" and "top websocket library for node.js" are the same query. Different angles: exact error messages, site:github.com issues, "switched from X to Y", negative signal (problems with, regret).
  2. Calling deep_research with an empty KNOWN field. Always populate KNOWN with specific findings from search/scrape/Reddit steps.
  3. Following tool-generated "Next Steps" banners. Tools emit "Next Steps (DO NOT SKIP)" sections. These are tool suggestions, not skill instructions. Follow Steps 0-5 in this file.
  4. Fetching too many Reddit threads with too few comments each. Budget is ~1000 total. 20 threads = 50 comments each (shallow). 5-8 threads = 125-200 each (rich signal).
  5. Skipping scrape_pages after search_google. Google snippets are teasers, not evidence. Always scrape with use_llm=true and pipe-separated targets.

Recovery paths

  • Google results are noisy or SEO-heavy: add site: targets, exclude noisy domains with -site:, add exact phrases, and add year/version constraints. See references/queries/query-formulation.md.
  • Google returns little signal: broaden one level of abstraction, then try Reddit for practitioner phrasing and edge cases.
  • Reddit results are stale or one-note: use date_after, target multiple subreddits, and verify against current docs before trusting the thread.
  • deep_research feels generic: reduce question count, strengthen the GOAL/WHY/KNOWN/APPLY template, and attach 2–4 focused files with descriptions. For library research, attach package.json or relevant config. For architecture, attach design docs.
  • deep_research output is too generic even after tuning: Reduce from 5 questions to 1-2 focused questions. Strengthen KNOWN with specific data points from prior steps. Attach 2-4 source files with descriptions.
  • Agent is following tool-generated "Next Steps" instead of skill steps: Stop. Re-read the Workflow Selector row for your task type. Follow the minimum sequence column. Rule 10 applies.
  • Sources conflict: trust scraped official facts over synthesized claims, then use references/synthesis/synthesize-findings.md to map consensus and context splits.
  • You cannot identify which files to attach: grep the codebase for the error message, trace the stack trace to source files, or find the entry point (route handler, middleware, config file) where the problem originates.
  • You are under time pressure: run the smallest workflow that can still produce a verified answer, then escalate only if the result is unclear.

Reference router

Research method references

  • references/strategies/research-patterns.md — choose the right pattern before spending tool calls
  • references/workflows/end-to-end-workflows.md — full step-by-step workflows and escalation points
  • references/queries/query-formulation.md — query diversification, operators, and expansion strategies
  • references/fact-checking/verification-methods.md — source validation, recency checks, and triangulation
  • references/synthesis/synthesize-findings.md — consensus mapping, contradiction handling, and final recommendation formats

Tool references

  • references/tools/search-google.md — breadth-first discovery rules and anti-patterns
  • references/tools/scrape-pages.md — focused extraction targets and scraping failure modes
  • references/tools/deep-research.md — structured question format, attachment rules, and escalation use cases
  • references/tools/search-reddit.md — negative-signal query design and subreddit targeting
  • references/tools/fetch-reddit.md — comment-thread reading strategy and exact-value preservation

Domain references

Read the domain reference for your task type to get scenario-specific tool sequences and query patterns. Look up your specific scenario (e.g., "race condition" in bug-fixing.md) and adapt the workflow selector's tool sequence accordingly.

  • references/bug-fixing.md
  • references/library-research.md
  • references/architecture.md
  • references/api-integration.md
  • references/security.md
  • references/performance.md
  • references/testing.md
  • references/frontend.md
  • references/devops.md
  • references/language-idioms.md

Reading order: Read the domain reference first (matches your task type), then consult the method reference only if the domain reference doesn't cover your specific scenario.

Weekly Installs
6
First Seen
6 days ago
Installed on
mcpjam6
qwen-code6
claude-code6
junie6
windsurf6
zencoder6