network-meta-analysis-appraisal
Network Meta-Analysis Comprehensive Appraisal
Overview
This skill enables systematic, reproducible appraisal of network meta-analysis (NMA) papers through:
- Automated PDF intelligence - Extract text, tables, and statistical content from NMA PDFs
- Semantic evidence matching - Map 200+ checklist criteria to PDF content using AI similarity
- Triple-validation methodology - Two independent concurrent appraisals + meta-review consensus
- Comprehensive frameworks - PRISMA-NMA, NICE DSU TSD 7, ISPOR-AMCP-NPC, CINeMA integration
- Professional reports - Generate markdown checklists and structured YAML outputs
The skill transforms a complex, time-intensive manual process (~6-8 hours) into a systematic, partially-automated workflow (~2-3 hours).
When to Use This Skill
Apply this skill when:
- Conducting peer review for journal submissions containing NMA
- Evaluating evidence for clinical guideline development
- Assessing NMA for health technology assessment (HTA)
- Reviewing NMA for reimbursement/formulary decisions
- Training on systematic NMA critical appraisal methodology
- Comparing Bayesian vs Frequentist NMA approaches
Workflow: PDF to Appraisal Report
Follow this sequential 5-step workflow for comprehensive appraisal:
Step 1: Setup & Prerequisites
Install Required Libraries:
cd scripts/
pip install -r requirements.txt
# Download semantic model (first time only)
python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('all-MiniLM-L6-v2')"
Verify Checklist Availability:
Confirm all 8 checklist sections are in references/checklist_sections/:
- SECTION I - STUDY RELEVANCE and APPLICABILITY.md
- SECTION II - REPORTING TRANSPARENCY and COMPLETENESS - PRISMA-NMA.md
- SECTION III - METHODOLOGICAL RIGOR - NICE DSU TSD 7.md
- SECTION IV - CREDIBILITY ASSESSMENT - ISPOR-AMCP-NPC.md
- SECTION V - CERTAINTY OF EVIDENCE - CINeMA Framework.md
- SECTION VI - SYNTHESIS and OVERALL JUDGMENT.md
- SECTION VII - APPRAISER INFORMATION.md
- SECTION VIII - APPENDICES.md
Select Framework Scope:
Choose based on appraisal purpose (see references/frameworks_overview.md for details):
- comprehensive: All 4 frameworks (~200 items, 4-6 hours)
- reporting: PRISMA-NMA only (~90 items, 2-3 hours)
- methodology: NICE + CINeMA (~30 items, 2-3 hours)
- decision: Relevance + ISPOR + CINeMA (~30 items, 2-3 hours)
Step 2: Extract PDF Content
Run pdf_intelligence.py to extract structured content from the NMA paper:
python scripts/pdf_intelligence.py path/to/nma_paper.pdf --output pdf_extraction.json
What This Does:
- Extracts text with section detection (abstract, methods, results, discussion)
- Parses tables using multiple libraries (Camelot, pdfplumber)
- Extracts metadata (title, page count, etc.)
- Calculates extraction quality scores
Outputs:
pdf_extraction.json - Structured PDF content for evidence matching
Quality Check:
- Verify extraction_quality scores ≥ 0.6 for text_coverage and sections_detected
- Low scores indicate poor PDF quality and may require manual supplementation
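This quality gate can be checked programmatically. The sketch below assumes the metric names (text_coverage, sections_detected) match pdf_intelligence.py's output keys:

```python
def metrics_needing_review(quality, threshold=0.6):
    """Return the quality metrics that fall below the acceptance threshold."""
    # Metric names assumed to match pdf_intelligence.py's output; adjust if they differ.
    return [m for m in ("text_coverage", "sections_detected")
            if quality.get(m, 0.0) < threshold]

# Example against a hypothetical extraction_quality block:
sample = {"text_coverage": 0.82, "sections_detected": 0.55}
print(metrics_needing_review(sample))  # sections_detected falls below 0.6
```

If any metric is flagged, plan to supplement the extraction manually before Step 3.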
Step 3: Match Evidence to Checklist Criteria
Prepare Checklist Criteria JSON: Extract checklist items from markdown sections into machine-readable format:
import json
import re
from pathlib import Path

# Example: extract criteria from Section II
# Assumes checklist table rows of the form: | 4.1 | Does the title identify ... | ... |
criteria = []
section_file = Path("references/checklist_sections/SECTION II - REPORTING TRANSPARENCY and COMPLETENESS - PRISMA-NMA.md")
for line in section_file.read_text().splitlines():
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    if len(cells) >= 2 and re.fullmatch(r"\d+(\.\d+)*", cells[0]):
        criteria.append({"id": cells[0], "text": cells[1]})
# Format: [{"id": "4.1", "text": "Does the title identify the study as a systematic review and network meta-analysis?"}, ...]
Path("checklist_criteria.json").write_text(json.dumps(criteria, indent=2))
Run Semantic Evidence Matching:
python scripts/semantic_search.py pdf_extraction.json checklist_criteria.json --output evidence_matches.json
What This Does:
- Encodes each checklist criterion as a semantic vector
- Searches PDF sections for matching paragraphs
- Calculates similarity scores (0.0-1.0)
- Assigns confidence levels (high/moderate/low/unable)
Outputs:
evidence_matches.json - Evidence mapped to each criterion with confidence scores
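The similarity-to-confidence banding can be illustrated with a small sketch. The numeric cut-offs below are illustrative assumptions (0.75 and 0.45 echo thresholds mentioned elsewhere in this document; 0.60 is a guess) — semantic_search.py defines the authoritative values:

```python
def confidence_level(similarity):
    """Map a 0.0-1.0 similarity score to a confidence band."""
    # Cut-offs are illustrative assumptions; semantic_search.py holds the real ones.
    if similarity >= 0.75:
        return "high"
    if similarity >= 0.60:
        return "moderate"
    if similarity >= 0.45:
        return "low"
    return "unable"

# Hypothetical shape of one entry in evidence_matches.json:
match = {
    "criterion_id": "4.1",
    "similarity": 0.82,
    "confidence": confidence_level(0.82),
    "evidence": "The title explicitly states...",
}
```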
Step 4: Conduct Triple-Validation Appraisal
Manual Appraisal with Evidence Support:
For each checklist section:
1. Load evidence matches for that section's criteria
2. Review PDF content highlighted by semantic search
3. Apply triple-validation methodology (see references/triple_validation_methodology.md):
Appraiser #1 (Critical Reviewer):
- Evidence threshold: 0.75 (high)
- Stance: Skeptical, conservative
- For each item: Assign rating (✓/⚠/✗/N/A) based on evidence quality
Appraiser #2 (Methodologist):
- Evidence threshold: 0.70 (moderate)
- Stance: Technical rigor emphasis
- For each item: Assign rating independently
4. Meta-Review Concordance Analysis:
- Compare ratings between appraisers
- Calculate agreement levels (perfect/minor/major discordance)
- Apply resolution strategy (evidence-weighted by default)
- Flag major discordances for manual review
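The concordance classification in the meta-review step can be sketched as follows. The severity ordering of the ratings is an assumption here; references/triple_validation_methodology.md defines the authoritative scale:

```python
RATING_ORDER = {"✗": 0, "⚠": 1, "✓": 2}  # assumed severity ordering

def concordance(rating_1, rating_2):
    """Classify agreement between two appraisers' ratings on one item."""
    if rating_1 == rating_2:
        return "perfect"
    if "N/A" in (rating_1, rating_2):
        return "major"  # one appraiser judged the item inapplicable
    gap = abs(RATING_ORDER[rating_1] - RATING_ORDER[rating_2])
    return "minor" if gap == 1 else "major"
```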
Structure Appraisal Results:
{
"pdf_metadata": {...},
"appraisal": {
"sections": [
{
"id": "section_ii",
"name": "REPORTING TRANSPARENCY & COMPLETENESS",
"items": [
{
"id": "4.1",
"criterion": "Title identification...",
"rating": "✓",
"confidence": "high",
"evidence": "The title explicitly states...",
"source": "methods section",
"appraiser_1_rating": "✓",
"appraiser_2_rating": "✓",
"concordance": "perfect"
},
...
]
},
...
]
}
}
Save as appraisal_results.json.
Step 5: Generate Reports
Create Markdown and YAML Reports:
python scripts/report_generator.py appraisal_results.json --format both --output-dir ./reports
Outputs:
- reports/nma_appraisal_report.md - Human-readable checklist with ratings, evidence, concordance
- reports/nma_appraisal_report.yaml - Machine-readable structured data
Report Contents:
- Executive summary with overall quality ratings
- Detailed checklist tables (all 8 sections)
- Concordance analysis summary
- Recommendations for decision-makers and authors
- Evidence citations and confidence scores
Quality Validation:
- Review major discordance items flagged in concordance analysis
- Verify evidence confidence ≥ moderate for ≥50% of items
- Check overall agreement rate ≥ 65%
- Manually review any critical items with low confidence
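These gates can be checked programmatically against the items in appraisal_results.json. The field names mirror the structure shown in Step 4; counting only perfect concordance toward the agreement rate is an assumption for this sketch:

```python
def validate_appraisal(items):
    """Check the quality gates over a flat list of appraisal items."""
    total = len(items)
    confident = sum(i["confidence"] in ("high", "moderate") for i in items)
    agreed = sum(i["concordance"] == "perfect" for i in items)
    return {
        "confidence_ok": confident / total >= 0.5,   # ≥50% at moderate+ confidence
        "agreement_ok": agreed / total >= 0.65,      # ≥65% overall agreement
        "flagged": [i["id"] for i in items if i["concordance"] == "major"],
    }
```

Items listed under "flagged" are the major discordances that require manual review.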
Methodological Decision Points
Bayesian vs Frequentist Detection
The skill automatically detects statistical approach by scanning for keywords:
Bayesian Indicators: MCMC, posterior, prior, credible interval, WinBUGS, JAGS, Stan, burn-in, convergence diagnostic
Frequentist Indicators: confidence interval, p-value, I², τ², netmeta, prediction interval
Apply appropriate checklist items based on detected approach:
- Item 18.3 (Bayesian specifications) - only if Bayesian detected
- Items on heterogeneity metrics (I², τ²) - primarily Frequentist
- Convergence diagnostics - only Bayesian
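A minimal keyword scan for this detection might look like the following. The decision rule (a simple whole-word indicator count) is an assumption about how the skill classifies, not the scripts' actual logic:

```python
import re

BAYESIAN_KW = ["mcmc", "posterior", "prior", "credible interval", "winbugs",
               "jags", "stan", "burn-in", "convergence diagnostic"]
FREQUENTIST_KW = ["confidence interval", "p-value", "netmeta", "prediction interval"]

def detect_approach(text):
    """Classify the statistical approach by counting whole-word keyword hits."""
    t = text.lower()
    # Word boundaries (with an optional plural "s") avoid matching e.g. "stan" in "standard".
    count = lambda kws: sum(
        bool(re.search(r"\b" + re.escape(kw) + r"s?\b", t)) for kw in kws
    )
    bayes, freq = count(BAYESIAN_KW), count(FREQUENTIST_KW)
    if bayes > freq:
        return "bayesian"
    if freq > bayes:
        return "frequentist"
    return "unclear"

print(detect_approach("Posterior medians with 95% credible intervals from MCMC sampling"))
# prints "bayesian"
```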
Handling Missing Evidence
When semantic search returns low confidence (<0.45):
- Manually search PDF for the criterion
- Check supplementary materials (if accessible)
- If truly absent, rate as ⚠ or ✗ depending on item criticality
- Document "No evidence found in main text" in evidence field
Resolution Strategy Selection
Choose concordance resolution strategy based on appraisal purpose:
- Evidence-weighted (default): Most objective, prefers stronger evidence
- Conservative: For high-stakes decisions (regulatory submissions)
- Optimistic: For formative assessments or educational purposes
See references/triple_validation_methodology.md for detailed guidance.
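The three strategies could resolve a disagreement along these lines; the tie-breaking details are assumptions for illustration, and the methodology reference remains authoritative:

```python
CONF_ORDER = {"unable": 0, "low": 1, "moderate": 2, "high": 3}
SEVERITY = {"✗": 0, "⚠": 1, "✓": 2}

def resolve(r1, c1, r2, c2, strategy="evidence-weighted"):
    """Resolve a rating disagreement between two appraisers."""
    if r1 == r2:
        return r1
    if strategy == "conservative":
        return min(r1, r2, key=SEVERITY.get)   # prefer the harsher rating
    if strategy == "optimistic":
        return max(r1, r2, key=SEVERITY.get)   # prefer the more favourable rating
    # evidence-weighted: side with the appraiser backed by stronger evidence
    return r1 if CONF_ORDER[c1] >= CONF_ORDER[c2] else r2
```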
Resources
scripts/
Production-ready Python scripts for automated tasks:
- pdf_intelligence.py - Multi-library PDF extraction (PyMuPDF, pdfplumber, Camelot)
- semantic_search.py - AI-powered evidence-to-criterion matching
- report_generator.py - Markdown + YAML report generation
- requirements.txt - Python dependencies
Usage: Scripts can be run standalone via CLI or orchestrated programmatically.
references/
Comprehensive documentation for appraisal methodology:
- checklist_sections/ - All 8 integrated checklist sections (PRISMA/NICE/ISPOR/CINeMA)
- frameworks_overview.md - Framework selection guide, rating scales, key references
- triple_validation_methodology.md - Appraiser roles, concordance analysis, resolution strategies
Usage: Load relevant references when conducting specific appraisal steps or interpreting results.
Best Practices
- Always run pdf_intelligence.py first - Extraction quality affects all downstream steps
- Review low-confidence matches manually - Semantic search is not perfect
- Document resolution rationale - For major discordances, explain meta-review decision
- Maintain appraiser independence - Conduct Appraiser #1 and #2 evaluations without cross-reference
- Validate critical items - Manually verify evidence for high-impact methodological criteria
- Use appropriate framework scope - Comprehensive for peer review, targeted for specific assessments
Limitations
- PDF quality dependent: Poor scans or complex layouts reduce extraction accuracy
- Semantic matching not perfect: May miss evidence phrased in unexpected ways
- No external validation: Cannot verify PROSPERO registration or check author COI databases
- Language: Optimized for English-language papers
- Human oversight required: Final appraisal should be reviewed by domain expert