# COG Auto Research Skill
## When to Invoke
- User asks a strategic question requiring deep research
- User says "research", "auto-research", "investigate", "strategic analysis", "deep dive into [topic]"
- User wants to understand market forces, competitive dynamics, technology trajectories, or strategic options
- User needs evidence-based analysis with real sources to support decision-making
Inspired by Karpathy's autoresearch — but for strategic thinking instead of ML training.
## Agent Mode Awareness

Check `agent_mode` in `00-inbox/MY-PROFILE.md` frontmatter:

- If `agent_mode: team` — use the full parallel agent execution strategy (5-7 agents). This skill benefits massively from team mode.
- If `agent_mode: solo` — run 2-3 sequential research passes with WebSearch/WebFetch and produce a lighter analysis without the full multi-thread structure.
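A minimal sketch of that check, assuming PyYAML and a conventional `---`-fenced frontmatter block; the helper name and the `solo` default are illustrative assumptions, not part of the skill spec:

```python
from pathlib import Path
import yaml  # PyYAML, an assumption; any frontmatter parser works

def read_agent_mode(profile_path: str = "00-inbox/MY-PROFILE.md") -> str:
    """Return the agent_mode from the profile frontmatter (default: solo)."""
    text = Path(profile_path).read_text(encoding="utf-8")
    if text.startswith("---"):
        # Frontmatter is everything between the first two "---" fences.
        _, frontmatter, _ = text.split("---", 2)
        meta = yaml.safe_load(frontmatter) or {}
        return meta.get("agent_mode", "solo")
    return "solo"  # no frontmatter: assume the lighter solo path
```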
**Command:** `/auto-research`

## Input
The user provides a strategic question or topic as the command argument. Examples:
- "If foundation models commoditize, what happens to LLM wrapper companies like Katalon/Scout?"
- "Future of the testing industry as AI capabilities expand"
- "Should we build vs buy vs partner for our AI layer?"
- "What are the strategic options for Scout if OpenAI launches a testing product?"
## Execution Strategy

### Phase 1: Question Decomposition (Orchestrator — 2 minutes)
Break the user's strategic question into 5-7 research threads that together will provide a comprehensive answer. Each thread should be:
- Independent — can be researched in parallel
- Specific — has a clear research objective
- Complementary — together they cover the full strategic landscape
Decomposition framework:
- Market forces — what macro trends drive this question?
- Historical precedent — has this pattern played out before in other industries?
- Player analysis — who are the key players and what are they doing?
- Technology trajectory — where is the underlying tech heading?
- Customer behavior — what do end-users actually want/do?
- Economic model — what are the unit economics and value capture dynamics?
- Emerging tech & architectures — what concepts, projects, or frameworks are still in development/discussion (pre-mainstream) that could be foundational? Research open-source projects, research papers, GitHub repos, Discord/forum discussions, conference talks, and early-stage tools that are relevant. Examples: novel agent architectures, new testing paradigms, experimental frameworks. These may not have polished docs — dig into READMEs, GitHub issues, Twitter/X threads, blog posts from builders, and academic preprints.
- Contrarian view — what's the strongest argument against the consensus?
Not all threads apply to every question. Pick the 5-7 most relevant. Thread 7 (Emerging tech) should ALWAYS be included — the user specifically wants to stay ahead of concepts that aren't mainstream yet.
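Before anything spawns, the decomposition can be held in a simple structure. A minimal sketch; `ResearchThread` and its field names are illustrative assumptions, not a required format:

```python
from dataclasses import dataclass

@dataclass
class ResearchThread:
    slug: str       # feeds the agent naming convention, e.g. "market-forces"
    lens: str       # which framework lens this thread covers
    objective: str  # the specific, independently researchable question

threads = [
    ResearchThread("market-forces", "Market forces",
                   "What macro trends drive this question?"),
    ResearchThread("emerging-tech", "Emerging tech & architectures",
                   "Which pre-mainstream concepts could be foundational?"),
    # ...pick 5-7 lenses total; emerging tech is always one of them
]
```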
Before spawning agents:
- Read relevant files from the vault for existing context:
  - `05-knowledge/` for existing frameworks and mental models
  - `04-projects/` for project-specific context if relevant
  - Recent braindumps for the user's existing thinking on this topic
- State the decomposition to the user so they can course-correct before agents launch
### Phase 2: Parallel Deep Research (Spawn 5-7 Agents Simultaneously)
CRITICAL: Launch ALL agents in a single message. Use `run_in_background: true` for all agents (a sketch of this fan-out follows the prompt template below).
Each agent gets a detailed prompt following this template:

```markdown
You are a strategic research analyst investigating a specific thread of a larger strategic question.
MAIN QUESTION: [user's original question]
YOUR THREAD: [specific research thread]
EXISTING CONTEXT: [any relevant vault context]
RESEARCH METHODOLOGY:
1. WebSearch for 8-12 high-quality sources (prioritize: research reports, expert analyses, company filings, academic papers, industry publications — NOT listicles or superficial blog posts)
2. For each source found, WebFetch to read the full content and extract key arguments, data points, and frameworks
3. Look for CONFLICTING viewpoints — don't just confirm one narrative
4. Identify specific data points, statistics, and concrete examples
5. Note the credibility and potential bias of each source
6. FOR EMERGING TECH THREADS: Go beyond polished sources. Search GitHub repos (README, issues, discussions), Twitter/X threads from builders, Discord/forum discussions, conference talk summaries, arXiv preprints, and early blog posts. The goal is to surface concepts that are pre-mainstream but technically promising. For each concept found, assess: maturity level, technical approach, relevance to the user's use case, and what it would take to adopt/integrate.
OUTPUT FORMAT (return ALL of this):
## Thread: [thread name]
### Key Findings (3-5 bullet points)
- Finding with source attribution
### Evidence & Data Points
- Specific statistics, market data, examples with sources
### Expert/Notable Perspectives
- Named perspectives from credible voices
### Implications for [user's context]
- What this means specifically for the user's situation
### Confidence Level
- HIGH / MEDIUM / LOW with reasoning
### Sources
- Numbered list of actual URLs consulted
```
Agent naming convention: `research-[thread-slug]` (e.g., `research-market-forces`, `research-historical-precedent`).
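The fan-out referenced above, as a sketch. `spawn_agent` and `AGENT_PROMPT_TEMPLATE` are hypothetical stand-ins for whatever task-spawning tool and prompt string the host runtime actually provides; treat this as shape, not API:

```python
def launch_research_agents(question: str, threads: list, context: str) -> list:
    """Launch every research agent in one pass; none blocks on another."""
    handles = []
    for t in threads:
        prompt = AGENT_PROMPT_TEMPLATE.format(  # assumed template string
            main_question=question,
            thread=t.objective,
            existing_context=context,
        )
        handles.append(spawn_agent(             # assumed runtime tool
            name=f"research-{t.slug}",          # e.g. research-market-forces
            prompt=prompt,
            run_in_background=True,
        ))
    return handles  # Phase 3 synthesizes once all handles complete
```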
### Phase 3: Synthesis (Orchestrator — after all agents complete)
Once all agents return, synthesize into a single strategic analysis document:
Document Structure:

```markdown
---
type: strategic-research
domain: [auto-detect from question]
date: YYYY-MM-DD
question: "[original question]"
threads: [list of research threads]
confidence: [overall confidence HIGH/MEDIUM/LOW]
tags:
- auto-research
- strategy
- [topic tags]
status: complete
---
# [Strategic Question as Title]
## Executive Summary
3-5 sentences capturing the core insight. Lead with the answer, not the process.
## The Strategic Landscape
Synthesized view across all research threads. Not a thread-by-thread dump — weave findings together into a coherent narrative.
## Key Forces at Play
The 3-4 most important dynamics shaping this question, with evidence from multiple threads.
## Scenarios
### Scenario A: [Most Likely] — X% confidence
What happens, timeline, implications
### Scenario B: [Optimistic/Alternative]
What happens, timeline, implications
### Scenario C: [Worst Case/Disruption]
What happens, timeline, implications
## Emerging Tech & Architectures to Watch
Concepts, projects, and frameworks that are still in development/discussion but could be foundational. For each:
- **What it is:** One-paragraph explanation
- **Maturity:** Pre-alpha / Alpha / Early adoption / Growing community
- **Technical approach:** How it works architecturally
- **Relevance to our use case:** Why it matters for us specifically
- **Adoption path:** What it would take to integrate/adopt — effort, risks, dependencies
- **Key links:** GitHub repo, paper, discussion thread
## Strategic Options
For each option:
- **Description:** What this means concretely
- **Pros:** With evidence
- **Cons:** With evidence
- **Prerequisites:** What needs to be true
- **Timeline:** When to decide/act
- **Emerging tech leverage:** Which emerging concepts from above could strengthen this option
## Recommended Actions
Prioritized, concrete, time-bound action items. Not vague "consider X" — specific "do X by Y because Z."
Include a separate "Tech Bets" subsection: which emerging projects to start experimenting with now, even if they're not production-ready.
## Contrarian View
The strongest argument against the consensus/recommended path. What could make all of this wrong?
## Confidence & Gaps
- What we're confident about and why
- What we couldn't determine and what additional research would help
- Key assumptions that should be monitored
## Sources
Consolidated, deduplicated list of all sources across threads.
```
### Phase 4: Save & Deliver
- Save the full analysis to `05-knowledge/research/YYYY-MM-DD-[slug].md` (see the filename sketch below)
- If the analysis is long (>3000 words), also create a brief one-page summary at `05-knowledge/research/YYYY-MM-DD-[slug]-summary.md`
- Present the Executive Summary + Recommended Actions to the user directly in chat
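A minimal sketch of the filename scheme; `slugify` is a local helper written here for illustration, and the 60-character cap is an arbitrary assumption:

```python
import re
from datetime import date

def slugify(question: str, max_len: int = 60) -> str:
    """Lowercase, hyphenate non-alphanumerics, and truncate the question."""
    slug = re.sub(r"[^a-z0-9]+", "-", question.lower()).strip("-")
    return slug[:max_len].rstrip("-")

question = "Should we build vs buy vs partner for our AI layer?"
path = f"05-knowledge/research/{date.today():%Y-%m-%d}-{slugify(question)}.md"
# e.g. 05-knowledge/research/2025-06-01-should-we-build-vs-buy-vs-partner-for-our-ai-layer.md
```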
## Quality Standards
- No hallucinated sources. Every claim must trace to a real WebSearch/WebFetch result.
- Recency matters. Prioritize sources from the last 6 months. Flag anything older (a date-check sketch follows this list).
- Bias awareness. Note when sources have obvious commercial incentives.
- Specificity over generality. "The testing tools market is $XX.XB and growing at YY% CAGR" beats "the market is growing."
- Actionability. The output should help the user make a decision, not just understand a topic.
- Intellectual honesty. If the research is inconclusive, say so. Don't manufacture false confidence.
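The recency rule above, sketched as a filter. This assumes each collected source record carries a `published` date field, which the research agents would have to extract; the field name and 183-day cutoff are illustrative:

```python
from datetime import date, timedelta

def flag_stale(sources: list[dict], max_age_days: int = 183) -> list[dict]:
    """Mark sources older than ~6 months so synthesis can flag them."""
    cutoff = date.today() - timedelta(days=max_age_days)
    for s in sources:
        published = s.get("published")  # assumed field; may be absent
        s["stale"] = published is not None and published < cutoff
    return sources
```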
## Example Decomposition
Question: "If generic LLM models get better over time, what's the future for LLM wrapper companies like Katalon or Scout?"
Threads:
- Foundation model trajectory — How fast are GPT/Claude/Gemini improving at code understanding, test generation, bug detection? What's the capability curve?
- Historical precedent: platform commoditization — What happened to companies built on top of AWS, iOS, Salesforce, etc. when the platform absorbed their features? Who survived and why?
- Testing industry structure — Current market map, value chain, where margin lives, what buyers actually pay for
- Wrapper company strategies — How are current AI wrapper companies (Jasper, Copy.ai, Cursor, etc.) adapting? What's working?
- Enterprise buying behavior — Do enterprises buy "AI" or do they buy "solutions"? What's the procurement reality?
- Emerging tech & architectures — What pre-mainstream concepts could reshape the landscape? (e.g., novel agent frameworks, new testing paradigms, computer-use agents, browser automation architectures). Search GitHub repos, arXiv, Twitter/X builder threads, Discord communities, conference talks.
- Defensibility analysis — What moats exist for testing-specific AI companies? Data, workflow, integration, brand, switching costs?
- Contrarian: wrappers win — Arguments for why vertical AI companies might actually INCREASE in value as models commoditize
## Runtime Expectations
- Phase 1: ~2 minutes (decomposition + user confirmation)
- Phase 2: ~5-10 minutes (parallel research, longest agent determines total time)
- Phase 3: ~3-5 minutes (synthesis)
- Total: ~10-15 minutes for a comprehensive strategic analysis
## Error Handling
- If a research thread returns low-quality results, note this in the synthesis rather than fabricating depth
- If WebSearch/WebFetch fails for a thread, retry once with alternative search terms, then document the gap (sketched after this list)
- The user may interrupt during Phase 2 to redirect or add threads
- The skill can be run multiple times on related questions — reference previous research files from `05-knowledge/research/`
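The retry rule as a sketch; `web_search` is a hypothetical stand-in for the WebSearch tool, and the gap-record format is an assumption:

```python
def search_with_fallback(query: str, alternative_query: str):
    """Retry a failed thread search once with alternative terms, then record the gap."""
    for attempt in (query, alternative_query):
        try:
            results = web_search(attempt)  # assumed WebSearch wrapper
            if results:
                return results
        except Exception:
            continue
    # Both attempts failed: document the gap instead of fabricating depth.
    return {"gap": f"No usable results for '{query}' or '{alternative_query}'"}
```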
## Fallback Behavior
This skill requires WebSearch and WebFetch tools. If these are unavailable:
- Fall back to vault-only analysis using existing `05-knowledge/` content
- Clearly state that no live web research was performed
- Recommend the user run the skill again when web tools are available