multi-ai-research
Multi-AI Research & Analysis
Overview
Harnesses three AI systems (Claude via Task tool, Gemini CLI, Codex CLI) for comprehensive research and analysis with multi-perspective verification and iterative refinement.
Purpose: Produce analysis more thorough than any single AI could achieve through specialized roles, cross-validation, and systematic verification.
Key Innovation: Not just parallel execution - specialized research roles with cross-verification and iterative refinement until production-ready (quality ≥95/100, 100% citations, zero gaps).
The 3 AI Systems:
- Claude Subagents (via Task tool) - Documentation, codebase analysis, synthesis
- Gemini CLI - Web research, latest trends, community practices
- Codex CLI - GitHub patterns, code examples, deep reasoning
Quality Guarantees:
- ✓ 100% coverage - All objectives addressed, zero gaps
- ✓ 100% citations - Every claim sourced (file:line or URL)
- ✓ Multi-perspective - 3 AI systems cross-validated
- ✓ ≥95/100 quality - Verified through 3-pass system
- ✓ Actionable - Specific recommendations with examples
- ✓ Resumable - External memory enables multi-session work
When to Use
Use this skill for:
Security Analysis
- Authentication/authorization assessment
- Vulnerability identification
- Best practice validation
- OWASP Top 10 coverage
- Penetration testing preparation
Architecture Analysis
- System design review
- Component mapping
- Integration pattern analysis
- Scalability assessment
- Technical debt evaluation
Code Quality Analysis
- Pattern detection
- Code smell identification
- Complexity metrics
- Refactoring opportunities
- Best practice adherence
Performance Analysis
- Bottleneck identification
- Algorithm complexity
- Resource usage patterns
- Optimization opportunities
- Benchmark analysis
Research Synthesis
- Multi-source research compilation
- Best practice identification
- Technology evaluation
- Pattern discovery
- Trend analysis
Comprehensive Reviews
- Pre-production audit
- System health check
- Compliance verification
- Documentation audit
- Knowledge transfer
Quick Start
Option 1: Automated Script
# Run complete analysis automatically
bash .claude/skills/multi-ai-research/scripts/analyze.sh "Security analysis of authentication system"
This will:
- Create analysis plan
- Launch parallel research (Claude + Gemini + Codex)
- Perform deep analysis
- Synthesize and verify
- Iterate if needed
- Generate final report
Option 2: Interactive Mode
Ask Claude Code to use this skill:
"Use multi-ai-research to analyze [objective]"
Claude will:
- Create comprehensive analysis plan
- Coordinate all three AI systems
- Synthesize findings
- Verify quality
- Iterate until ≥95 quality
- Deliver final report
The 6-Phase Pipeline
Phase 1: Planning & Strategy
Duration: 5-10 minutes
Output: .analysis/ANALYSIS_PLAN.md
Claude creates comprehensive plan:
- Defines objectives and scope
- Plans file reading strategy (glob → grep → read)
- Assigns tasks to AI systems
- Sets verification criteria
- Defines success thresholds
Phase 2: Parallel Research
Duration: 10-20 minutes
Output: .analysis/research/*.md
All three systems research simultaneously:
Claude Subagent:
- Official documentation analysis
- Codebase examination (progressive disclosure)
- Architecture mapping
- Pattern identification
Gemini CLI:
- Web research (latest 2024-2025)
- Community best practices
- Industry trends
- Common pitfalls
Codex CLI:
- GitHub pattern analysis
- Code examples from top repos
- Implementation references
- Testing strategies
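The fan-out in this phase can be sketched in Python. This is a minimal illustration, not the skill's actual script: `run_cli`, `default_runners`, and `parallel_research` are hypothetical names, and only the `gemini -p` and `codex exec` invocations and the output file names come from this document. The Claude subagent is launched through the Task tool inside Claude Code, so it is left out here.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, List


def run_cli(argv: List[str], timeout: int = 1200) -> str:
    """Run one external CLI and return its stdout (raises on a non-zero exit)."""
    done = subprocess.run(argv, capture_output=True, text=True,
                          timeout=timeout, check=True)
    return done.stdout


def default_runners() -> Dict[str, Callable[[str], str]]:
    """Assumed mapping of output file stem -> research call."""
    return {
        "gemini-web": lambda objective: run_cli(["gemini", "-p", objective]),
        "codex-github": lambda objective: run_cli(["codex", "exec", objective]),
    }


def parallel_research(objective: str, out_dir: Path,
                      runners: Dict[str, Callable[[str], str]]) -> Dict[str, Path]:
    """Run every runner concurrently and save each result as <name>.md."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written: Dict[str, Path] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(runners))) as pool:
        futures = {name: pool.submit(fn, objective) for name, fn in runners.items()}
        for name, future in futures.items():
            path = out_dir / f"{name}.md"
            path.write_text(future.result(), encoding="utf-8")
            written[name] = path
    return written
```

Calling `parallel_research(objective, Path(".analysis/research"), default_runners())` would invoke both CLIs at once. Passing stub callables instead makes the function easy to exercise without either CLI installed.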
Phase 3: Deep Analysis
Duration: 15-30 minutes
Output: .analysis/analysis/code-patterns.md
Claude Analysis Agent with extended thinking:
- Progressive codebase analysis
- Pattern recognition across sources
- Architecture mapping
- Metrics calculation
- Risk assessment
Phase 4: Synthesis & Verification
Duration: 10-20 minutes
Outputs:
- .analysis/SYNTHESIS_REPORT.md
- .analysis/verification/cross-check.md
Synthesis (Claude with extended thinking):
- Read all research findings
- Identify themes across sources
- Resolve contradictions
- Create unified narrative
- Full citations
Verification (Verification Subagent):
- 3-pass verification (completeness, accuracy, quality)
- Cross-source validation
- Citation checking
- Gap analysis
- Quality scoring
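As a rough illustration of the citation-checking pass, the sketch below measures how many claims carry either a `file:line` reference or a URL. The regular expression and the `citation_coverage` helper are assumptions made for this example; the document does not specify how the Verification Subagent detects citations.

```python
import re
from typing import List, Tuple

# Assumed citation shapes: an http(s) URL, or path.ext:line with an optional -line range.
CITATION = re.compile(
    r"https?://\S+"
    r"|[\w./-]+\.\w+:\d+(?:-\d+)?"
)


def citation_coverage(claims: List[str]) -> Tuple[float, List[str]]:
    """Return (percent of claims with a citation, list of uncited claims)."""
    uncited = [claim for claim in claims if not CITATION.search(claim)]
    if not claims:
        return 100.0, []
    covered = len(claims) - len(uncited)
    return 100.0 * covered / len(claims), uncited
```

A coverage below 100% together with the returned list of uncited claims gives the iteration phase a concrete set of gaps to fill.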
Phase 5: Iteration (if needed)
Duration: 10-30 minutes
Output: .analysis/iterations/ITERATION_2.md
If quality <95 or gaps exist:
- Targeted research for gaps
- Quality improvements
- Re-verification
- Repeat until ≥95
Phase 6: Final Report
Duration: 5-10 minutes
Output: .analysis/ANALYSIS_FINAL.md
Comprehensive final report:
- Executive summary
- Complete findings
- All sources synthesized
- Prioritized recommendations
- Implementation guidance
- Full citations
Total Time: 45-90 minutes for comprehensive analysis
Analysis Types
Security Analysis
What it checks:
- Authentication/authorization patterns
- Input validation
- Secret management
- Injection vulnerabilities (SQL, XSS, etc.)
- Dependency vulnerabilities
- Rate limiting
- Session security
Example:
Use multi-ai-research for "Security audit of authentication system"
Output:
- Critical/High/Medium/Low priority issues
- OWASP Top 10 coverage
- Code examples with file:line
- Specific remediation steps
- Industry best practices comparison
Architecture Analysis
What it examines:
- System components and boundaries
- Integration patterns
- Data flow
- Dependency relationships
- Scalability considerations
- Design patterns used
Example:
Use multi-ai-research for "Architecture analysis of microservices system"
Output:
- Component map with relationships
- Integration pattern analysis
- Scalability assessment
- Technical debt identification
- Refactoring recommendations
Code Quality Analysis
What it analyzes:
- Code patterns and organization
- Complexity metrics
- Code smells
- Best practice adherence
- Test coverage
- Documentation quality
Example:
Use multi-ai-research for "Code quality assessment for ./src"
Output:
- Quality score with breakdown
- Pattern analysis
- Refactoring priorities
- Specific code improvements
- Complexity hotspots
Performance Analysis
What it identifies:
- Algorithm complexity
- Bottlenecks
- Resource usage patterns
- Database query efficiency
- Network call patterns
Example:
Use multi-ai-research for "Performance bottleneck identification"
Output:
- Bottleneck analysis with file:line
- Optimization opportunities
- Before/after estimations
- Implementation guidance
Research Synthesis
What it compiles:
- Official documentation
- Web best practices
- GitHub patterns
- Industry standards
- Community insights
Example:
Use multi-ai-research for "Research GraphQL federation patterns 2024-2025"
Output:
- Multi-source synthesis
- Consensus findings (all sources agree)
- Multiple perspectives (sources differ)
- Code examples
- Implementation recommendations
How It Works
Progressive Disclosure
Never reads files blindly. Always uses a 3-level approach:
Level 1: Metadata (glob) - ~50 tokens
glob "**/*.{ts,js,py}" # Understand structure
glob "**/*.md" # Find documentation
glob "**/package.json" # Check dependencies
Level 2: Patterns (grep) - ~5k tokens
grep "export class|interface" --glob "**/*.ts"
grep "TODO|FIXME|BUG" --glob "**/*"
grep "password|secret|token" --glob "**/*.ts"
Level 3: Reading (read) - ~50k tokens
read "src/auth/login.ts" # Only critical files
read "docs/architecture.md"
Result: 90%+ reduction in unnecessary file reads
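The three levels can be mimicked outside Claude Code with the Python standard library. This is a sketch under stated assumptions: the function names are hypothetical, and ranking files by hit count is one plausible way to decide which files count as "critical", not something the skill prescribes. Note that `pathlib` globbing has no brace expansion, so each extension is passed as its own pattern.

```python
import re
from pathlib import Path
from typing import Dict, List


def level1_metadata(root: Path, patterns: List[str]) -> List[Path]:
    """Level 1: list candidate files without opening any of them."""
    found = set()
    for pattern in patterns:
        found.update(p for p in root.glob(pattern) if p.is_file())
    return sorted(found)


def level2_patterns(files: List[Path], regex: str) -> Dict[Path, List[int]]:
    """Level 2: keep only the line numbers that match, not the file contents."""
    compiled = re.compile(regex)
    hits: Dict[Path, List[int]] = {}
    for path in files:
        with path.open(encoding="utf-8", errors="replace") as handle:
            lines = [n for n, line in enumerate(handle, start=1)
                     if compiled.search(line)]
        if lines:
            hits[path] = lines
    return hits


def level3_read(hits: Dict[Path, List[int]], limit: int = 5) -> Dict[Path, str]:
    """Level 3: fully read only the files with the most matches (assumed ranking)."""
    ranked = sorted(hits, key=lambda p: (-len(hits[p]), str(p)))
    return {p: p.read_text(encoding="utf-8", errors="replace")
            for p in ranked[:limit]}
```

For example, `level2_patterns(level1_metadata(Path("."), ["**/*.ts"]), "password|secret|token")` narrows the candidate set before anything is read in full.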
External Memory Architecture
All state saved to files, not context:
.analysis/
├── ANALYSIS_PLAN.md # Strategy and assignments
├── research/
│ ├── claude-docs.md # Claude research
│ ├── gemini-web.md # Gemini research
│ └── codex-github.md # Codex research
├── analysis/
│ ├── code-patterns.md # Pattern analysis
│ └── architecture-map.md # System map
├── verification/
│ └── cross-check.md # Verification results
├── iterations/
│ ├── ITERATION_1.md # First pass
│ └── ITERATION_2.md # Gap fills
└── ANALYSIS_FINAL.md # Complete report
Benefits:
- Survives context window limits
- Enables multi-session analysis
- Resumable from any checkpoint
- No information loss
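One way to make "resumable from any checkpoint" concrete is to look for the first missing file in the layout above. The sketch below is an assumption about how a resume check could work, not the skill's own logic: `CHECKPOINTS` and `next_phase` are hypothetical, `claude-docs.md` stands in as the research marker because it is the one research file that exists even in Claude-only runs, and the optional iteration files are ignored.

```python
from pathlib import Path

# Checkpoint files in pipeline order, taken from the .analysis/ layout above.
CHECKPOINTS = [
    ("planning", "ANALYSIS_PLAN.md"),
    ("research", "research/claude-docs.md"),
    ("analysis", "analysis/code-patterns.md"),
    ("verification", "verification/cross-check.md"),
    ("final", "ANALYSIS_FINAL.md"),
]


def next_phase(analysis_dir: Path) -> str:
    """Return the first phase whose checkpoint is missing or empty, else 'done'."""
    for phase, relative in CHECKPOINTS:
        checkpoint = analysis_dir / relative
        if not checkpoint.is_file() or checkpoint.stat().st_size == 0:
            return phase
    return "done"
```

A new session could call `next_phase(Path(".analysis"))` and pick up at the returned phase instead of starting over.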
Cross-Validation Pattern
- High Confidence (★★★★★): All 3 sources agree + code verification
- Medium Confidence (★★★☆☆): 2/3 sources agree
- Requires Investigation (★★☆☆☆): Sources conflict
Example:
## JWT Implementation (High Confidence ★★★★★)
**Claude**: "Uses JWT with HS256" (src/auth/jwt.ts:15)
**Gemini**: "HS256 is industry standard 2024" (URL)
**Codex**: "150+ repos use HS256 pattern" (GitHub)
**Code**: Verified at src/auth/jwt.ts:18-22
**Recommendation**: Implementation correct per standards
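The confidence levels can be expressed as a small classifier. This sketch assumes the number of agreeing sources has already been determined. The document does not say how to rate three agreeing sources without code verification, so treating that case as medium confidence is an assumption made here. `Confidence` and `classify` are hypothetical names.

```python
from enum import Enum


class Confidence(Enum):
    HIGH = "★★★★★"
    MEDIUM = "★★★☆☆"
    INVESTIGATE = "★★☆☆☆"


def classify(agreeing_sources: int, code_verified: bool) -> Confidence:
    """Map agreement among the 3 AI systems to a confidence level."""
    if not 0 <= agreeing_sources <= 3:
        raise ValueError("agreeing_sources must be between 0 and 3")
    if agreeing_sources == 3 and code_verified:
        return Confidence.HIGH
    if agreeing_sources >= 2:
        # 2/3 agree, or 3/3 agree without code verification (assumption).
        return Confidence.MEDIUM
    return Confidence.INVESTIGATE
```

In the JWT example above, three agreeing sources plus verification in `src/auth/jwt.ts` would map to `Confidence.HIGH`.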
Quality Scoring
Comprehensive rubric (0-100):
- Comprehensiveness (/20): All aspects covered
- Accuracy (/20): All claims sourced and verified
- Specificity (/20): File:line precision, not vague
- Actionability (/20): Specific recommendations
- Consistency (/20): No contradictions
Quality Gates:
- ≥95: Production-ready
- 85-94: Needs minor refinement
- 75-84: Needs iteration
- <75: Requires rework
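The rubric and gates reduce to simple arithmetic, shown here as a sketch. The dimension names and thresholds come from the lists above; the function names and the dictionary input format are assumptions for illustration.

```python
from typing import Dict

DIMENSIONS = ("comprehensiveness", "accuracy", "specificity",
              "actionability", "consistency")
MAX_PER_DIMENSION = 20


def total_score(scores: Dict[str, int]) -> int:
    """Sum the five rubric dimensions after validating names and ranges."""
    if set(scores) != set(DIMENSIONS):
        raise ValueError(f"expected exactly these dimensions: {DIMENSIONS}")
    for name, value in scores.items():
        if not 0 <= value <= MAX_PER_DIMENSION:
            raise ValueError(f"{name} must be between 0 and {MAX_PER_DIMENSION}")
    return sum(scores.values())


def quality_gate(score: int) -> str:
    """Translate a 0-100 score into the gate it falls under."""
    if score >= 95:
        return "Production-ready"
    if score >= 85:
        return "Needs minor refinement"
    if score >= 75:
        return "Needs iteration"
    return "Requires rework"
```

Because each dimension is capped at 20, a report can lose at most 5 points in total across all dimensions and still reach the production-ready gate.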
Iterative Refinement
- Iteration 1 (Breadth): Broad coverage, identifies gaps
- Iteration 2 (Depth): Fill gaps, improve quality
- Iteration 3 (Polish): Final verification, perfection
Automatic iteration until:
- Quality ≥95
- Citation coverage = 100%
- Critical gaps = 0
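The stopping rule can be sketched as a loop over verification results. `Verification` and `refine` are hypothetical names. The cap of three rounds is an assumption that mirrors the three named iterations, since the document does not state a hard limit.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Verification:
    quality: int              # 0-100 rubric score
    citation_coverage: float  # percent of claims with a source
    critical_gaps: int

    def passes(self) -> bool:
        """True when all three exit criteria listed above are met."""
        return (self.quality >= 95
                and self.citation_coverage >= 100.0
                and self.critical_gaps == 0)


def refine(run_iteration: Callable[[int], Verification],
           max_iterations: int = 3) -> Tuple[int, Verification]:
    """Run iterations until the result passes or the (assumed) cap is hit."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    for number in range(1, max_iterations + 1):
        result = run_iteration(number)
        if result.passes():
            break
    return number, result
```

The caller supplies `run_iteration`, which would perform the targeted research and re-verification for that round and return the new scores.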
AI System Roles
Claude Subagents (via Task tool)
Research Agent (Haiku):
- Progressive disclosure expert
- Documentation analysis
- Codebase examination
- Pattern detection
Analysis Agent (Sonnet):
- Extended thinking for synthesis
- Multi-source integration
- Pattern recognition
- Architectural insights
Verification Agent (Haiku):
- 3-pass verification
- Citation checking
- Gap analysis
- Quality scoring
Gemini CLI
Strengths:
- Native web search
- Latest trends (2024-2025)
- Community practices
- Multimodal analysis (if needed)
Use for:
- Best practice research
- Industry standards
- Latest vulnerabilities
- Framework comparisons
Codex CLI
Strengths:
- GitHub integration
- Code pattern search
- Deep reasoning (o3 model)
- Implementation examples
Use for:
- Code examples
- Design patterns
- Architecture reasoning
- Testing strategies
Configuration
Prerequisites
Required:
- Claude Code (with Task tool access)
Optional but Recommended:
- Gemini CLI: npm install -g @google/gemini-cli
- Codex CLI: npm install -g @openai/codex
Note: Skill works with Claude-only fallback if Gemini/Codex unavailable.
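A fallback check along these lines could decide which systems take part in a run. This is a sketch, not the skill's actual detection code: `available_systems` is a hypothetical helper, and it assumes the CLIs are installed under the executable names `gemini` and `codex` used in the setup commands in this document.

```python
import shutil
from typing import Callable, List, Optional


def available_systems(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the AI systems usable for this run; Claude is always included."""
    systems = ["claude"]
    for executable in ("gemini", "codex"):
        if which(executable):  # None when the executable is not on PATH
            systems.append(executable)
    return systems
```

If the result is just `["claude"]`, the run proceeds in Claude-only mode and should surface the warning about reduced perspectives.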
Gemini CLI Setup
# Install
npm install -g @google/gemini-cli
# Authenticate (OAuth - free)
gemini
# Follow browser authentication
# Test
gemini -p "test prompt"
Codex CLI Setup
# Install
npm install -g @openai/codex
# Authenticate (ChatGPT Plus/Pro account)
codex login
# Follow browser authentication
# Test
codex exec "test prompt"
Model Selection
Claude:
- Haiku: Research & verification (fast, efficient)
- Sonnet: Analysis & synthesis (balanced)
- Opus: Complex reasoning (if needed)
Gemini:
- gemini-2.5-flash: Quick research
- gemini-2.5-pro: Complex analysis
Codex:
- gpt-5.1-codex: Standard tasks
- o3: Deep architectural reasoning
- o4-mini: Quick operations
Examples
Example 1: Security Analysis
Objective: "Security audit of authentication system"
Phase 2 - Parallel Research:
├─ Claude: Analyzes src/auth/* for patterns
├─ Gemini: Researches "OAuth 2.0 security best practices 2024"
└─ Codex: Finds GitHub examples of secure auth
Phase 3 - Analysis:
└─ Claude: Identifies 3 critical, 5 high priority issues
Phase 4 - Synthesis:
└─ All agree: Missing rate limiting (CRITICAL)
- Claude: No rate limit found in src/auth/login.ts
- Gemini: OWASP recommends max 5 attempts/hour
- Codex: 150+ repos use express-rate-limit
- Recommendation: Implement with Redis backend
Final Report:
├─ Executive summary
├─ 8 issues (3 critical, 5 high) with fixes
├─ OWASP Top 10 coverage
├─ Specific code examples
└─ Priority implementation plan
Quality: 97/100 ✓
Example 2: Architecture Analysis
Objective: "Analyze microservices architecture"
Phase 2:
├─ Claude: Maps services via glob + grep
├─ Gemini: Researches microservices patterns 2024
└─ Codex: Finds service mesh examples
Phase 3:
└─ Claude: Identifies 7 services, 12 integration points
Phase 4:
└─ Synthesis: Service communication patterns
- Consensus: REST for external, gRPC for internal
- Trade-offs documented
- Scaling strategies from Codex examples
Final Report:
├─ Component map (7 services, dependencies)
├─ Integration analysis (12 patterns)
├─ Scalability assessment
└─ Modernization recommendations
Quality: 96/100 ✓
Example 3: Research Synthesis
Objective: "Research state management patterns for React 2024"
Phase 2:
├─ Claude: Reviews React docs + examples
├─ Gemini: Web research "React state management 2024"
└─ Codex: Analyzes top 50 React repos
Phase 3:
└─ Pattern analysis: 5 major approaches identified
Phase 4:
└─ Synthesis by use case:
- Small apps: Context (all sources agree)
- Medium apps: Zustand (Gemini + Codex recommend)
- Large apps: Redux Toolkit (battle-tested, Codex data)
- Server state: TanStack Query (trending, Gemini research)
Final Report:
├─ Decision tree by project size
├─ Pros/cons with sources
├─ Migration strategies
└─ Code examples from Codex
Quality: 98/100 ✓
Best Practices
1. Be Specific with Objectives
❌ "Analyze the code"
✅ "Security analysis of authentication module for OWASP Top 10 compliance"
2. Trust the Verification
Multi-pass verification catches issues. If quality <95, iteration happens automatically.
3. Review External Memory
Check .analysis/ folder during execution to see progress.
4. Leverage Citations
Every claim has file:line or URL. Use for validation and deep dives.
5. Multi-Session Projects
Large projects can span sessions:
Session 1: Initial analysis → ITERATION_1.md
Session 2: Gap filling → ITERATION_2.md
Session 3: Final polish → ANALYSIS_FINAL.md
6. Check All Three Perspectives
High-value insights often come from comparing AI perspectives.
Troubleshooting
Low Quality Score (<95)
Cause: Gaps in coverage or missing citations
Solution: Iteration 2 runs automatically to fill the gaps
Check: .analysis/verification/cross-check.md for details
Missing Citations
Cause: Verification flags uncited claims
Solution: Iteration adds missing attributions
Prevention: All agents trained to cite sources
Gemini/Codex Unavailable
Fallback: Claude-only analysis with warning
Impact: Reduced perspectives but still comprehensive
Install: npm install -g @google/gemini-cli @openai/codex
Conflicting Information
Resolution: Synthesis phase investigates conflicts
Method: Check ground truth (actual code/docs)
Output: Documented reasoning for resolution
Related Skills
- anthropic-expert: Anthropic product expertise
- codex-cli: Codex integration patterns
- gemini-cli: Gemini integration patterns
- tri-ai-collaboration: General tri-AI workflows
- analysis: Code/skill/process analysis
Quick Reference
Command Line
# Full automated analysis
bash .claude/skills/multi-ai-research/scripts/analyze.sh "objective"
# Interactive with Claude Code
# Just ask: "Use multi-ai-research for [objective]"
File Locations
| File | Purpose |
|---|---|
| .analysis/ANALYSIS_PLAN.md | Strategy and assignments |
| .analysis/research/ | All AI research outputs |
| .analysis/SYNTHESIS_REPORT.md | Multi-source synthesis |
| .analysis/ANALYSIS_FINAL.md | Complete final report |
Quality Metrics
| Metric | Threshold | Meaning |
|---|---|---|
| Quality Score | ≥95/100 | Production-ready |
| Citation Coverage | 100% | All claims sourced |
| Completeness | ≥95% | All objectives met |
| Critical Gaps | 0 | No missing essentials |
Analysis Time Estimates
| Type | Time | Iterations |
|---|---|---|
| Security | 45-60 min | 1-2 |
| Architecture | 60-90 min | 1-2 |
| Code Quality | 30-45 min | 1 |
| Performance | 45-60 min | 1-2 |
| Research | 30-60 min | 1 |
multi-ai-research delivers production-ready analysis through systematic multi-AI collaboration, rigorous verification, and iterative refinement - ensuring nothing is missed and every claim is verified.