# Research Synthesis Workflow
This skill provides a systematic methodology for conducting research, synthesizing findings from multiple sources, and producing actionable knowledge artifacts.
## Core Competencies

- **Source Evaluation**: Assessing credibility, relevance, and bias
- **Information Extraction**: Systematic note-taking and annotation
- **Synthesis Methods**: Thematic analysis, meta-analysis, framework building
- **Knowledge Artifacts**: Reports, literature reviews, decision frameworks
## Research Workflow Overview

```
┌──────────────────────────────────────────────────────────────┐
│                 Research Synthesis Workflow                  │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│   1. SCOPE            2. GATHER           3. EXTRACT         │
│   ┌─────────┐         ┌─────────┐         ┌─────────┐        │
│   │ Define  │───────▶│ Find    │───────▶│ Capture │        │
│   │ Question│         │ Sources │         │ Insights│        │
│   └─────────┘         └─────────┘         └─────────┘        │
│        │                                       │             │
│        │       5. PRODUCE      4. SYNTHESIZE   │             │
│        │       ┌─────────┐       ┌─────────┐   │             │
│        └──────▶│ Create  │◀──────│ Connect │◀──┘             │
│                │ Artifact│       │ Themes  │                 │
│                └─────────┘       └─────────┘                 │
│                                                              │
└──────────────────────────────────────────────────────────────┘
```
## Phase 1: Scope Definition

### Research Question Framework
Transform vague topics into answerable questions:
| Type | Pattern | Example |
|---|---|---|
| Exploratory | What is X? How does X work? | What is vector search? |
| Comparative | How does X compare to Y? | PostgreSQL vs. Neo4j for graphs? |
| Evaluative | Is X effective for Y? | Is RAG effective for technical docs? |
| Causal | What causes X? What are effects of X? | What causes LLM hallucinations? |
| Prescriptive | How should we implement X? | How to design a RAG pipeline? |
### Scope Boundaries
Define explicitly:
- **In scope**: Topics to cover
- **Out of scope**: Adjacent topics to exclude
- **Depth**: Survey (broad) vs. deep-dive (narrow)
- **Time bounds**: Cut-off dates for sources
- **Source types**: Academic, industry, primary data
### Example Scope Document

```markdown
## Research Scope: Vector Database Selection

### Research Question
Which vector database best fits our production RAG system
requiring <50ms latency at 10M+ vectors?

### In Scope
- Pinecone, Weaviate, Milvus, Qdrant, pgvector
- Latency benchmarks at scale
- Cost analysis (cloud vs. self-hosted)
- Operational complexity

### Out of Scope
- General-purpose databases with vector extensions (other than pgvector)
- Sub-million vector use cases
- Academic/research-only systems

### Success Criteria
Recommendation with supporting evidence for 2-3 top candidates
```
## Phase 2: Source Gathering

### Source Quality Assessment
Evaluate each source on:
| Criterion | High Quality | Low Quality |
|---|---|---|
| Authority | Expert author, peer-reviewed | Anonymous, no credentials |
| Currency | Recent, updated | Outdated, no dates |
| Accuracy | Citations, verifiable | Unsupported claims |
| Purpose | Inform, educate | Sell, persuade |
| Coverage | Comprehensive | Superficial |
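
One simple way to turn this rubric into the 1-5 quality score used in the source template below is to count the criteria a source meets. A minimal sketch, assuming equal weighting of criteria (the rubric itself does not prescribe weights):

```python
# Minimal sketch: deriving a 1-5 quality score from the rubric above.
# Criterion names follow the table; equal weighting is an assumption.
CRITERIA = ["authority", "currency", "accuracy", "purpose", "coverage"]

def quality_score(assessment: dict[str, bool]) -> int:
    """Count criteria rated 'high quality'; floor at 1, ceiling at 5."""
    met = sum(assessment.get(criterion, False) for criterion in CRITERIA)
    return max(1, met)

# A recent, well-cited expert source with a promotional slant:
print(quality_score({"authority": True, "currency": True,
                     "accuracy": True, "purpose": False,
                     "coverage": True}))  # 4
```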
### Source Types and Uses

```
Primary Sources (original)
├── Research papers
├── Official documentation
├── Benchmark data
└── Expert interviews

Secondary Sources (analysis)
├── Review articles
├── Technical blogs
├── Industry reports
└── Book chapters

Tertiary Sources (summaries)
├── Wikipedia
├── Textbooks
└── Encyclopedias
```
### Search Strategies

**Keyword expansion:**
- Start: "vector database performance"
- Expand: "approximate nearest neighbor", "HNSW benchmark", "embedding search latency"

**Citation chaining:**
- Forward: Who cites this paper?
- Backward: What does this paper cite?

**Author tracking:**
- Find key researchers, follow their work
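
Citation chaining is, mechanically, graph traversal. A minimal sketch, assuming a hypothetical in-memory citation graph; in practice the edges would come from a citation index such as Semantic Scholar or Crossref:

```python
# Minimal sketch of citation chaining as graph traversal.
# `cites` maps each paper ID to the papers it references; the IDs
# below are hypothetical placeholders, not real lookups.
from collections import deque

cites = {
    "hnsw-2016": ["nsw-2014", "skiplist-1990"],
    "vectordb-survey-2023": ["hnsw-2016", "faiss-2017"],
    "faiss-2017": ["pq-2011"],
}

def backward_chain(paper: str, depth: int = 2) -> set[str]:
    """What does this paper cite, transitively, up to `depth` hops?"""
    found: set[str] = set()
    frontier = deque([(paper, 0)])
    while frontier:
        current, hops = frontier.popleft()
        if hops == depth:
            continue
        for ref in cites.get(current, []):
            if ref not in found:
                found.add(ref)
                frontier.append((ref, hops + 1))
    return found

def forward_chain(paper: str) -> set[str]:
    """Who cites this paper? Invert the edges and look up."""
    return {p for p, refs in cites.items() if paper in refs}

print(backward_chain("vectordb-survey-2023"))  # transitive references
print(forward_chain("hnsw-2016"))              # citing papers
```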
### Source Documentation

For each source, capture:

```markdown
## Source: [Title]
- **URL/DOI**:
- **Author(s)**:
- **Date**:
- **Type**: [paper/blog/docs/report]
- **Quality Score**: [1-5]
- **Relevance**: [high/medium/low]
- **Key Topics**:
- **Notes**:
```
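
If notes are kept programmatically, the same record maps to a small typed structure. A minimal sketch; the field names mirror the template above, and the `Literal` value sets are assumptions rather than a fixed schema:

```python
# Minimal sketch of the source record above as a typed structure.
# Field names mirror the template; allowed values are assumptions.
from dataclasses import dataclass, field
from typing import Literal

@dataclass
class Source:
    title: str
    url_or_doi: str
    authors: list[str]
    date: str  # ISO-style date, e.g. "2016-03"
    source_type: Literal["paper", "blog", "docs", "report"]
    quality_score: int  # 1 (low) to 5 (high)
    relevance: Literal["high", "medium", "low"]
    key_topics: list[str] = field(default_factory=list)
    notes: str = ""

src = Source(
    title="Efficient and robust approximate nearest neighbor search "
          "using Hierarchical Navigable Small World graphs",
    url_or_doi="https://arxiv.org/abs/1603.09320",
    authors=["Yu. A. Malkov", "D. A. Yashunin"],
    date="2016-03",
    source_type="paper",
    quality_score=5,
    relevance="high",
    key_topics=["ANN search", "graph indexes"],
)
```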
## Phase 3: Information Extraction

### Structured Note-Taking

Use consistent templates for extraction:

```markdown
## Claim: [Specific assertion]
- **Source**: [reference]
- **Evidence**: [supporting data/reasoning]
- **Strength**: [strong/moderate/weak]
- **My Assessment**: [agree/disagree/uncertain]
- **Related Claims**: [links to other notes]
```
### Evidence Classification
| Type | Description | Weight |
|---|---|---|
| Empirical | Measured data, experiments | High |
| Analytical | Logical derivation | Medium-High |
| Anecdotal | Case studies, examples | Medium |
| Expert Opinion | Authority statements | Medium |
| Theoretical | Model predictions | Medium-Low |
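
These weights can be made explicit when judging how well supported a claim is. A minimal sketch; the numeric values are illustrative assumptions standing in for the qualitative ratings above, not a calibrated scale:

```python
# Minimal sketch: scoring a claim by the evidence types behind it.
# Numeric weights are illustrative assumptions, not a calibrated scale.
EVIDENCE_WEIGHT = {
    "empirical": 1.0,
    "analytical": 0.8,
    "anecdotal": 0.6,
    "expert_opinion": 0.6,
    "theoretical": 0.4,
}

def claim_strength(evidence_types: list[str]) -> float:
    """Crude aggregate: average weight of the evidence behind a claim."""
    if not evidence_types:
        return 0.0
    return sum(EVIDENCE_WEIGHT[t] for t in evidence_types) / len(evidence_types)

# A claim backed by one benchmark and two blog anecdotes:
print(claim_strength(["empirical", "anecdotal", "anecdotal"]))  # ~0.73
```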
### Contradiction Tracking

When sources disagree:

```markdown
## Conflict: [Topic]

### Position A: [Claim]
- Sources: [list]
- Evidence: [summary]

### Position B: [Claim]
- Sources: [list]
- Evidence: [summary]

### Analysis
- Methodological differences:
- Context differences:
- Possible resolution:
- My conclusion:
```
## Phase 4: Synthesis

### Thematic Analysis

1. Code individual insights with tags
2. Cluster related codes into themes
3. Review themes for coherence
4. Define each theme clearly
5. Relate themes to research question
```
Codes                 Themes            Findings

├─ fast queries     ─┐
├─ low latency      ─┼── Performance ──┬── Theme 1: Performance
├─ high throughput  ─┘                 │   varies significantly
├─ managed service  ─┐                 │   by workload type
├─ self-hosted      ─┼── Deployment ───┼── Theme 2: Cloud vs.
├─ kubernetes       ─┘                 │   self-hosted tradeoff
├─ pricing tiers    ─┐                 │
├─ compute costs    ─┼── Economics ────┴── Theme 3: Total cost
├─ hidden costs     ─┘                     drives final choice
```
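
In code, steps 1-2 of thematic analysis reduce to a grouping operation. A minimal sketch using the codes from the diagram; the code-to-theme mapping is built by hand during theme review, and the insights here are illustrative, not real findings:

```python
# Minimal sketch of clustering coded insights into themes (steps 1-2).
# The code-to-theme mapping is built manually during theme review;
# the example insights are illustrative, not real findings.
from collections import defaultdict

CODE_TO_THEME = {
    "fast queries": "Performance", "low latency": "Performance",
    "high throughput": "Performance",
    "managed service": "Deployment", "self-hosted": "Deployment",
    "kubernetes": "Deployment",
    "pricing tiers": "Economics", "compute costs": "Economics",
    "hidden costs": "Economics",
}

# Each insight is (note text, codes assigned during extraction).
insights = [
    ("Source A reports low p99 latency at 10M vectors", ["low latency"]),
    ("Source B runs the database via a Kubernetes operator",
     ["kubernetes", "self-hosted"]),
    ("Source C notes egress fees absent from pricing tiers",
     ["pricing tiers", "hidden costs"]),
]

themes: dict[str, list[str]] = defaultdict(list)
for note, codes in insights:
    for code in codes:
        themes[CODE_TO_THEME[code]].append(note)

for theme, notes in themes.items():
    print(f"{theme}: {len(notes)} insight(s)")
# Performance: 1 insight(s)
# Deployment: 2 insight(s)
# Economics: 2 insight(s)
```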
### Framework Building
Create decision frameworks from synthesis:
```markdown
## Vector Database Selection Framework

### Decision Tree
1. Scale requirement?
   - <1M vectors → pgvector (simplicity)
   - 1M-100M vectors → Continue to 2
   - >100M vectors → Milvus/Weaviate (distributed)
2. Operational capacity?
   - Limited DevOps → Pinecone (managed)
   - Strong DevOps → Continue to 3
3. Cost sensitivity?
   - Budget constrained → Qdrant (open source)
   - Budget flexible → Evaluate all options

### Comparison Matrix
| Criterion    | Weight | Pinecone | Milvus   | Qdrant   |
|--------------|--------|----------|----------|----------|
| Latency      | 30%    | 4        | 5        | 4        |
| Scalability  | 25%    | 5        | 5        | 4        |
| Operations   | 20%    | 5        | 3        | 4        |
| Cost         | 15%    | 2        | 4        | 5        |
| Features     | 10%    | 4        | 5        | 4        |
| **Weighted** |        | **4.15** | **4.45** | **4.15** |
```
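
The weighted totals follow directly from the matrix: multiply each score by its criterion weight and sum (for Pinecone, 0.30*4 + 0.25*5 + 0.20*5 + 0.15*2 + 0.10*4 = 4.15). A minimal sketch that reproduces them:

```python
# Minimal sketch reproducing the weighted comparison matrix above.
weights = {"latency": 0.30, "scalability": 0.25, "operations": 0.20,
           "cost": 0.15, "features": 0.10}

scores = {
    "Pinecone": {"latency": 4, "scalability": 5, "operations": 5,
                 "cost": 2, "features": 4},
    "Milvus":   {"latency": 5, "scalability": 5, "operations": 3,
                 "cost": 4, "features": 5},
    "Qdrant":   {"latency": 4, "scalability": 4, "operations": 4,
                 "cost": 5, "features": 4},
}

for db, s in scores.items():
    total = sum(weights[criterion] * s[criterion] for criterion in weights)
    print(f"{db}: {total:.2f}")
# Pinecone: 4.15, Milvus: 4.45, Qdrant: 4.15
```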
## Phase 5: Knowledge Artifact Production

### Artifact Types
| Format | Purpose | Audience |
|---|---|---|
| Executive Summary | Quick decision support | Leadership |
| Technical Report | Detailed analysis | Engineers |
| Literature Review | Academic synthesis | Researchers |
| Decision Framework | Structured evaluation | Decision makers |
| Reference Guide | Quick lookup | Practitioners |
### Structure Templates

**Executive Summary (1-2 pages):**
- Context and question
- Key findings (3-5 bullets)
- Recommendation
- Risks and considerations
**Technical Report (5-20 pages):**
- Executive summary
- Background and scope
- Methodology
- Findings by theme
- Analysis and discussion
- Recommendations
- Appendices (data, sources)
### Quality Checklist

Before finalizing:

- [ ] Research question answered?
- [ ] All claims supported by evidence?
- [ ] Contradictions addressed?
- [ ] Limitations acknowledged?
- [ ] Actionable recommendations?
- [ ] Sources properly cited?
- [ ] Appropriate for audience?
## Best Practices

### Avoiding Bias
- Seek disconfirming evidence actively
- Include multiple perspectives
- Note your priors and update them
- Separate observation from interpretation
- Document methodology for transparency
### Managing Scope Creep
- Return to research question frequently
- Park interesting tangents in "Future Research"
- Time-box each phase
- Define "good enough" criteria upfront
### Iteration
Research is rarely linear:
- New sources may require scope adjustment
- Synthesis may reveal gaps requiring more gathering
- Artifacts may need multiple drafts
## References

- `references/evaluation-rubrics.md` - Source quality scoring guides
- `references/synthesis-methods.md` - Detailed synthesis techniques
- `references/artifact-templates.md` - Document templates and examples