Multi-AI Research & Analysis

Overview

Harnesses three AI systems (Claude via Task tool, Gemini CLI, Codex CLI) for comprehensive research and analysis with multi-perspective verification and iterative refinement.

Purpose: Produce analysis more thorough than any single AI could achieve through specialized roles, cross-validation, and systematic verification.

Key Innovation: Not just parallel execution - specialized research roles with cross-verification and iterative refinement until production-ready (quality ≥95/100, 100% citations, zero gaps).

The 3 AI Systems:

  1. Claude Subagents (via Task tool) - Documentation, codebase analysis, synthesis
  2. Gemini CLI - Web research, latest trends, community practices
  3. Codex CLI - GitHub patterns, code examples, deep reasoning

Quality Guarantees:

  • 100% coverage - All objectives addressed, zero gaps
  • 100% citations - Every claim sourced (file:line or URL)
  • Multi-perspective - 3 AI systems cross-validated
  • ≥95/100 quality - Verified through 3-pass system
  • Actionable - Specific recommendations with examples
  • Resumable - External memory enables multi-session work

When to Use

Use this skill for:

Security Analysis

  • Authentication/authorization assessment
  • Vulnerability identification
  • Best practice validation
  • OWASP Top 10 coverage
  • Penetration testing preparation

Architecture Analysis

  • System design review
  • Component mapping
  • Integration pattern analysis
  • Scalability assessment
  • Technical debt evaluation

Code Quality Analysis

  • Pattern detection
  • Code smell identification
  • Complexity metrics
  • Refactoring opportunities
  • Best practice adherence

Performance Analysis

  • Bottleneck identification
  • Algorithm complexity
  • Resource usage patterns
  • Optimization opportunities
  • Benchmark analysis

Research Synthesis

  • Multi-source research compilation
  • Best practice identification
  • Technology evaluation
  • Pattern discovery
  • Trend analysis

Comprehensive Reviews

  • Pre-production audit
  • System health check
  • Compliance verification
  • Documentation audit
  • Knowledge transfer

Quick Start

Option 1: Automated Script

# Run complete analysis automatically
bash .claude/skills/multi-ai-research/scripts/analyze.sh "Security analysis of authentication system"

This will:

  1. Create analysis plan
  2. Launch parallel research (Claude + Gemini + Codex)
  3. Perform deep analysis
  4. Synthesize and verify
  5. Iterate if needed
  6. Generate final report

Option 2: Interactive Mode

Ask Claude Code to use this skill:

"Use multi-ai-research to analyze [objective]"

Claude will:

  1. Create comprehensive analysis plan
  2. Coordinate all three AI systems
  3. Synthesize findings
  4. Verify quality
  5. Iterate until ≥95 quality
  6. Deliver final report

The 6-Phase Pipeline

Phase 1: Planning & Strategy

Duration: 5-10 minutes
Output: .analysis/ANALYSIS_PLAN.md

Claude creates comprehensive plan:

  • Defines objectives and scope
  • Plans file reading strategy (glob → grep → read)
  • Assigns tasks to AI systems
  • Sets verification criteria
  • Defines success thresholds

Phase 2: Parallel Research

Duration: 10-20 minutes
Output: .analysis/research/*.md

All three systems research simultaneously:

Claude Subagent:

  • Official documentation analysis
  • Codebase examination (progressive disclosure)
  • Architecture mapping
  • Pattern identification

Gemini CLI:

  • Web research (latest 2024-2025)
  • Community best practices
  • Industry trends
  • Common pitfalls

Codex CLI:

  • GitHub pattern analysis
  • Code examples from top repos
  • Implementation references
  • Testing strategies

Phase 3: Deep Analysis

Duration: 15-30 minutes
Output: .analysis/analysis/code-patterns.md

Claude Analysis Agent with extended thinking:

  • Progressive codebase analysis
  • Pattern recognition across sources
  • Architecture mapping
  • Metrics calculation
  • Risk assessment

Phase 4: Synthesis & Verification

Duration: 10-20 minutes
Outputs:

  • .analysis/SYNTHESIS_REPORT.md
  • .analysis/verification/cross-check.md

Synthesis (Claude with extended thinking):

  • Read all research findings
  • Identify themes across sources
  • Resolve contradictions
  • Create unified narrative
  • Full citations

Verification (Verification Subagent):

  • 3-pass verification (completeness, accuracy, quality)
  • Cross-source validation
  • Citation checking
  • Gap analysis
  • Quality scoring

Phase 5: Iteration (if needed)

Duration: 10-30 minutes
Output: .analysis/iterations/ITERATION_2.md

If quality <95 or gaps exist:

  • Targeted research for gaps
  • Quality improvements
  • Re-verification
  • Repeat until ≥95

Phase 6: Final Report

Duration: 5-10 minutes
Output: .analysis/ANALYSIS_FINAL.md

Comprehensive final report:

  • Executive summary
  • Complete findings
  • All sources synthesized
  • Prioritized recommendations
  • Implementation guidance
  • Full citations

Total Time: 45-90 minutes for comprehensive analysis


Analysis Types

Security Analysis

What it checks:

  • Authentication/authorization patterns
  • Input validation
  • Secret management
  • Injection vulnerabilities (SQL, XSS, etc.)
  • Dependency vulnerabilities
  • Rate limiting
  • Session security

Example:

Use multi-ai-research for "Security audit of authentication system"

Output:

  • Critical/High/Medium/Low priority issues
  • OWASP Top 10 coverage
  • Code examples with file:line
  • Specific remediation steps
  • Industry best practices comparison

Architecture Analysis

What it examines:

  • System components and boundaries
  • Integration patterns
  • Data flow
  • Dependency relationships
  • Scalability considerations
  • Design patterns used

Example:

Use multi-ai-research for "Architecture analysis of microservices system"

Output:

  • Component map with relationships
  • Integration pattern analysis
  • Scalability assessment
  • Technical debt identification
  • Refactoring recommendations

Code Quality Analysis

What it analyzes:

  • Code patterns and organization
  • Complexity metrics
  • Code smells
  • Best practice adherence
  • Test coverage
  • Documentation quality

Example:

Use multi-ai-research for "Code quality assessment for ./src"

Output:

  • Quality score with breakdown
  • Pattern analysis
  • Refactoring priorities
  • Specific code improvements
  • Complexity hotspots

Performance Analysis

What it identifies:

  • Algorithm complexity
  • Bottlenecks
  • Resource usage patterns
  • Database query efficiency
  • Network call patterns

Example:

Use multi-ai-research for "Performance bottleneck identification"

Output:

  • Bottleneck analysis with file:line
  • Optimization opportunities
  • Before/after estimations
  • Implementation guidance

Research Synthesis

What it compiles:

  • Official documentation
  • Web best practices
  • GitHub patterns
  • Industry standards
  • Community insights

Example:

Use multi-ai-research for "Research GraphQL federation patterns 2024-2025"

Output:

  • Multi-source synthesis
  • Consensus findings (all sources agree)
  • Multiple perspectives (sources differ)
  • Code examples
  • Implementation recommendations

How It Works

Progressive Disclosure

Never reads files blindly; always uses a three-level approach:

Level 1: Metadata (glob) - ~50 tokens

glob "**/*.{ts,js,py}"     # Understand structure
glob "**/*.md"             # Find documentation
glob "**/package.json"     # Check dependencies

Level 2: Patterns (grep) - ~5k tokens

grep "export class|interface" --glob "**/*.ts"
grep "TODO|FIXME|BUG" --glob "**/*"
grep "password|secret|token" --glob "**/*.ts"

Level 3: Reading (read) - ~50k tokens

read "src/auth/login.ts"   # Only critical files
read "docs/architecture.md"

Result: roughly a 90% reduction in unnecessary file reads
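
The same three-level discipline can be approximated with standard Unix tools. This is an illustrative sketch only (the demo tree stands in for a real repository; file names and patterns are hypothetical):

```shell
#!/bin/sh
# Three-level progressive disclosure approximated with standard Unix tools.
# The demo tree below stands in for a real repository.
demo=$(mktemp -d)
mkdir -p "$demo/src/auth"
printf 'export class Login {}\n// FIXME: add rate limiting\n' > "$demo/src/auth/login.ts"
printf '# placeholder docs\n' > "$demo/README.md"

# Level 1: metadata only - learn the tree's shape, read nothing
level1=$(find "$demo" -name '*.ts')

# Level 2: pattern hits - locate risk markers without full file reads
level2=$(grep -rl 'FIXME' "$demo")

# Level 3: targeted reads - open only the files that matched
head -n 40 "$level2"

echo "structure: $level1"
echo "hits: $level2"
rm -rf "$demo"
```

Each level narrows the candidate set before the next spends more tokens, which is where the savings come from.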

External Memory Architecture

All state saved to files, not context:

.analysis/
├── ANALYSIS_PLAN.md           # Strategy and assignments
├── research/
│   ├── claude-docs.md         # Claude research
│   ├── gemini-web.md          # Gemini research
│   └── codex-github.md        # Codex research
├── analysis/
│   ├── code-patterns.md       # Pattern analysis
│   └── architecture-map.md    # System map
├── verification/
│   └── cross-check.md         # Verification results
├── iterations/
│   ├── ITERATION_1.md         # First pass
│   └── ITERATION_2.md         # Gap fills
└── ANALYSIS_FINAL.md          # Complete report

Benefits:

  • Survives context window limits
  • Enables multi-session analysis
  • Resumable from any checkpoint
  • No information loss
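
Because every phase persists its output, a later session can decide where to resume simply by probing `.analysis/`. A minimal sketch, assuming the file layout above maps one-to-one onto phases (a temp directory stands in for a real `.analysis/`):

```shell
#!/bin/sh
# Pick a resume point by checking which .analysis/ artifacts already exist.
# Here only the plan has been written, so we resume at parallel research.
dir=$(mktemp -d)
touch "$dir/ANALYSIS_PLAN.md"

if   [ -f "$dir/ANALYSIS_FINAL.md" ];   then phase="done"
elif [ -f "$dir/SYNTHESIS_REPORT.md" ]; then phase="verification/iteration"
elif [ -d "$dir/research" ];            then phase="deep analysis"
elif [ -f "$dir/ANALYSIS_PLAN.md" ];    then phase="parallel research"
else                                         phase="planning"
fi
echo "resume at: $phase"
rm -rf "$dir"
```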

Cross-Validation Pattern

High Confidence (★★★★★): All 3 sources agree + code verification
Medium Confidence (★★★☆☆): 2/3 sources agree
Requires Investigation (★★☆☆☆): Sources conflict

Example:

## JWT Implementation (High Confidence ★★★★★)

**Claude**: "Uses JWT with HS256" (src/auth/jwt.ts:15)
**Gemini**: "HS256 is industry standard 2024" (URL)
**Codex**: "150+ repos use HS256 pattern" (GitHub)
**Code**: Verified at src/auth/jwt.ts:18-22

**Recommendation**: Implementation correct per standards
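
Mechanically, the high-confidence tier is just the intersection of the three research outputs. A toy sketch with inlined demo findings (real runs would read `.analysis/research/*.md`; the finding names are hypothetical):

```shell
#!/bin/sh
# Cross-validation sketch: a finding listed in all three research files
# (one finding per line) lands in the high-confidence tier.
dir=$(mktemp -d)
printf 'missing-rate-limiting\njwt-hs256\n'            > "$dir/claude-docs.md"
printf 'missing-rate-limiting\njwt-hs256\nweak-cors\n' > "$dir/gemini-web.md"
printf 'missing-rate-limiting\njwt-hs256\n'            > "$dir/codex-github.md"

# Count how many files mention each finding; keep those mentioned by all 3
consensus=$(sort "$dir"/*.md | uniq -c | awk '$1 == 3 {print $2}')
echo "high-confidence findings:"
echo "$consensus"
rm -rf "$dir"
```

Findings below the 3/3 threshold (like `weak-cors` here) drop to the lower tiers for further investigation.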

Quality Scoring

Comprehensive rubric (0-100):

  • Comprehensiveness (/20): All aspects covered
  • Accuracy (/20): All claims sourced and verified
  • Specificity (/20): File:line precision, not vague
  • Actionability (/20): Specific recommendations
  • Consistency (/20): No contradictions

Quality Gates:

  • ≥95: Production-ready
  • 85-94: Needs minor refinement
  • 75-84: Needs iteration
  • <75: Requires rework
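
The rubric and gates combine into a few lines of arithmetic. A hypothetical `score_gate` helper (the labels follow the gate table above):

```shell
#!/bin/sh
# Total the five /20 rubric dimensions and map the sum to a quality gate.
score_gate() {
  # $1..$5 = comprehensiveness, accuracy, specificity, actionability, consistency
  total=$(( $1 + $2 + $3 + $4 + $5 ))
  if   [ "$total" -ge 95 ]; then gate="production-ready"
  elif [ "$total" -ge 85 ]; then gate="needs minor refinement"
  elif [ "$total" -ge 75 ]; then gate="needs iteration"
  else                           gate="requires rework"
  fi
  echo "$total/100: $gate"
}

score_gate 20 19 20 19 19   # -> 97/100: production-ready
score_gate 18 17 16 17 16   # -> 84/100: needs iteration
```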

Iterative Refinement

Iteration 1 (Breadth): Broad coverage, identifies gaps
Iteration 2 (Depth): Fill gaps, improve quality
Iteration 3 (Polish): Final verification and finishing touches

Automatic iteration until:

  • Quality ≥95
  • Citation coverage = 100%
  • Critical gaps = 0
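
The convergence criteria reduce to a simple loop. This sketch simulates the verification pass with a stub; in reality the scores come from the verification subagent, and only the loop shape is faithful:

```shell
#!/bin/sh
# Iteration loop sketch: re-verify until every gate passes.
attempt=0
quality=80 citations=90 gaps=2          # pretend first-pass results

verify_pass() {                         # stub standing in for re-verification
  quality=$(( quality + 8 ))            # each pass improves quality (simulated)
  citations=100
  gaps=0
}

while [ "$quality" -lt 95 ] || [ "$citations" -lt 100 ] || [ "$gaps" -gt 0 ]; do
  attempt=$(( attempt + 1 ))
  verify_pass
done
echo "converged after $attempt iteration(s): quality=$quality"
```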

AI System Roles

Claude Subagents (via Task tool)

Research Agent (Haiku):

  • Progressive disclosure expert
  • Documentation analysis
  • Codebase examination
  • Pattern detection

Analysis Agent (Sonnet):

  • Extended thinking for synthesis
  • Multi-source integration
  • Pattern recognition
  • Architectural insights

Verification Agent (Haiku):

  • 3-pass verification
  • Citation checking
  • Gap analysis
  • Quality scoring

Gemini CLI

Strengths:

  • Native web search
  • Latest trends (2024-2025)
  • Community practices
  • Multimodal analysis (if needed)

Use for:

  • Best practice research
  • Industry standards
  • Latest vulnerabilities
  • Framework comparisons

Codex CLI

Strengths:

  • GitHub integration
  • Code pattern search
  • Deep reasoning (o3 model)
  • Implementation examples

Use for:

  • Code examples
  • Design patterns
  • Architecture reasoning
  • Testing strategies

Configuration

Prerequisites

Required:

  • Claude Code (with Task tool access)

Optional but Recommended:

  • Gemini CLI: npm install -g @google/gemini-cli
  • Codex CLI: npm install -g @openai/codex

Note: Skill works with Claude-only fallback if Gemini/Codex unavailable.
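
A quick way to see which mode you will get is to probe for the CLIs with POSIX `command -v` (the binary names match the install commands above):

```shell
#!/bin/sh
# Detect which optional CLIs are installed and note the fallback mode.
systems="claude"
command -v gemini >/dev/null 2>&1 && systems="$systems gemini"
command -v codex  >/dev/null 2>&1 && systems="$systems codex"

echo "active systems: $systems"
if [ "$systems" = "claude" ]; then
  echo "warning: Claude-only fallback (reduced perspectives)"
fi
```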

Gemini CLI Setup

# Install
npm install -g @google/gemini-cli

# Authenticate (OAuth - free)
gemini
# Follow browser authentication

# Test
gemini -p "test prompt"

Codex CLI Setup

# Install
npm install -g @openai/codex

# Authenticate (ChatGPT Plus/Pro account)
codex login
# Follow browser authentication

# Test
codex exec "test prompt"

Model Selection

Claude:

  • Haiku: Research & verification (fast, efficient)
  • Sonnet: Analysis & synthesis (balanced)
  • Opus: Complex reasoning (if needed)

Gemini:

  • gemini-2.5-flash: Quick research
  • gemini-2.5-pro: Complex analysis

Codex:

  • gpt-5.1-codex: Standard tasks
  • o3: Deep architectural reasoning
  • o4-mini: Quick operations

Examples

Example 1: Security Analysis

Objective: "Security audit of authentication system"

Phase 2 - Parallel Research:
├─ Claude: Analyzes src/auth/* for patterns
├─ Gemini: Researches "OAuth 2.0 security best practices 2024"
└─ Codex: Finds GitHub examples of secure auth

Phase 3 - Analysis:
└─ Claude: Identifies 3 critical, 5 high priority issues

Phase 4 - Synthesis:
└─ All agree: Missing rate limiting (CRITICAL)
   - Claude: No rate limit found in src/auth/login.ts
   - Gemini: OWASP recommends max 5 attempts/hour
   - Codex: 150+ repos use express-rate-limit
   - Recommendation: Implement with Redis backend

Final Report:
├─ Executive summary
├─ 8 issues (3 critical, 5 high) with fixes
├─ OWASP Top 10 coverage
├─ Specific code examples
└─ Priority implementation plan

Quality: 97/100 ✓

Example 2: Architecture Analysis

Objective: "Analyze microservices architecture"

Phase 2:
├─ Claude: Maps services via glob + grep
├─ Gemini: Researches microservices patterns 2024
└─ Codex: Finds service mesh examples

Phase 3:
└─ Claude: Identifies 7 services, 12 integration points

Phase 4:
└─ Synthesis: Service communication patterns
   - Consensus: REST for external, gRPC for internal
   - Trade-offs documented
   - Scaling strategies from Codex examples

Final Report:
├─ Component map (7 services, dependencies)
├─ Integration analysis (12 patterns)
├─ Scalability assessment
└─ Modernization recommendations

Quality: 96/100 ✓

Example 3: Research Synthesis

Objective: "Research state management patterns for React 2024"

Phase 2:
├─ Claude: Reviews React docs + examples
├─ Gemini: Web research "React state management 2024"
└─ Codex: Analyzes top 50 React repos

Phase 3:
└─ Pattern analysis: 5 major approaches identified

Phase 4:
└─ Synthesis by use case:
   - Small apps: Context (all sources agree)
   - Medium apps: Zustand (Gemini + Codex recommend)
   - Large apps: Redux Toolkit (battle-tested, Codex data)
   - Server state: TanStack Query (trending, Gemini research)

Final Report:
├─ Decision tree by project size
├─ Pros/cons with sources
├─ Migration strategies
└─ Code examples from Codex

Quality: 98/100 ✓

Best Practices

1. Be Specific with Objectives

❌ "Analyze the code"
✅ "Security analysis of authentication module for OWASP Top 10 compliance"

2. Trust the Verification

Multi-pass verification catches issues. If quality <95, iteration happens automatically.

3. Review External Memory

Check .analysis/ folder during execution to see progress.

4. Leverage Citations

Every claim has file:line or URL. Use for validation and deep dives.

5. Multi-Session Projects

Large projects can span sessions:

Session 1: Initial analysis → ITERATION_1.md
Session 2: Gap filling → ITERATION_2.md
Session 3: Final polish → ANALYSIS_FINAL.md

6. Check All Three Perspectives

High-value insights often come from comparing AI perspectives.


Troubleshooting

Low Quality Score (<95)

Cause: Gaps in coverage or missing citations
Solution: Automatic iteration 2 fills gaps
Check: .analysis/verification/cross-check.md for details

Missing Citations

Cause: Verification flags uncited claims
Solution: Iteration adds missing attributions
Prevention: All agents are instructed to cite sources

Gemini/Codex Unavailable

Fallback: Claude-only analysis with warning
Impact: Reduced perspectives but still comprehensive
Install: npm install -g @google/gemini-cli @openai/codex

Conflicting Information

Resolution: Synthesis phase investigates conflicts
Method: Check ground truth (actual code/docs)
Output: Documented reasoning for resolution


Related Skills

  • anthropic-expert: Anthropic product expertise
  • codex-cli: Codex integration patterns
  • gemini-cli: Gemini integration patterns
  • tri-ai-collaboration: General tri-AI workflows
  • analysis: Code/skill/process analysis

Quick Reference

Command Line

# Full automated analysis
bash .claude/skills/multi-ai-research/scripts/analyze.sh "objective"

# Interactive with Claude Code
# Just ask: "Use multi-ai-research for [objective]"

File Locations

File                            Purpose
.analysis/ANALYSIS_PLAN.md      Strategy and assignments
.analysis/research/             All AI research outputs
.analysis/SYNTHESIS_REPORT.md   Multi-source synthesis
.analysis/ANALYSIS_FINAL.md     Complete final report

Quality Metrics

Metric              Threshold   Meaning
Quality Score       ≥95/100     Production-ready
Citation Coverage   100%        All claims sourced
Completeness        ≥95%        All objectives met
Critical Gaps       0           No missing essentials

Analysis Time Estimates

Type           Time        Iterations
Security       45-60 min   1-2
Architecture   60-90 min   1-2
Code Quality   30-45 min   1
Performance    45-60 min   1-2
Research       30-60 min   1

multi-ai-research delivers production-ready analysis through systematic multi-AI collaboration, rigorous verification, and iterative refinement - ensuring nothing is missed and every claim is verified.
