domain-research-health-science
Domain Research: Health Science
Workflow
Copy this checklist and track your progress:
Health Research Progress:
- [ ] Step 1: Formulate research question (PICOT)
- [ ] Step 2: Assess evidence hierarchy and study design
- [ ] Step 3: Evaluate study quality and bias
- [ ] Step 4: Prioritize and define outcomes
- [ ] Step 5: Synthesize evidence and grade certainty
- [ ] Step 6: Create decision-ready summary
Step 1: Formulate research question (PICOT)
Use the PICOT framework to structure an answerable clinical question. Define Population (demographics, condition, setting), Intervention (treatment, exposure, diagnostic test), Comparator (alternative treatment, placebo, standard care), Outcome (patient-important endpoints), and Timeframe (follow-up duration). See resources/template.md for structured templates.
Step 2: Assess evidence hierarchy and study design
Determine appropriate study design based on research question type (therapy: RCT; diagnosis: cross-sectional; prognosis: cohort; harm: case-control or cohort). Understand hierarchy of evidence (systematic reviews > RCTs > cohort > case-control > case series). See resources/methodology.md for design selection guidance.
Step 3: Evaluate study quality and bias
Apply risk of bias assessment tools (Cochrane RoB 2 for RCTs, ROBINS-I for observational studies, QUADAS-2 for diagnostic accuracy). Evaluate randomization, blinding, allocation concealment, incomplete outcome data, selective reporting. See resources/methodology.md for detailed criteria.
Step 4: Prioritize and define outcomes
Distinguish patient-important outcomes (mortality, symptoms, quality of life, function) from surrogate endpoints (biomarkers, lab values). Create outcome hierarchy: critical (decision-driving), important (informs decision), not important. Define measurement instruments and minimal clinically important differences (MCID). See resources/template.md for prioritization framework.
Step 5: Synthesize evidence and grade certainty
Apply GRADE (Grading of Recommendations Assessment, Development and Evaluation) to rate certainty of evidence (high, moderate, low, very low). Consider study limitations, inconsistency, indirectness, imprecision, and publication bias. Upgrade observational evidence for large effects, dose-response gradients, or when plausible confounding would reduce the observed effect. See resources/methodology.md for rating guidance.
Step 6: Create decision-ready summary
Produce evidence profile or summary of findings table linking outcomes to certainty ratings and effect estimates. Include clinical interpretation, applicability assessment, and evidence gaps. Validate using resources/evaluators/rubric_domain_research_health_science.json. Minimum standard: Average score ≥ 3.5.
Common Patterns
Pattern 1: Therapy/Intervention Question
- PICOT: Adults with condition → new treatment vs standard care → patient-important outcomes → follow-up period
- Study design: RCT preferred (highest quality for causation); systematic review of RCTs for synthesis
- Key outcomes: Mortality, morbidity, quality of life, adverse events
- Bias assessment: Cochrane RoB 2 (randomization, blinding, attrition, selective reporting)
- Example: SGLT2 inhibitors for heart failure → reduced mortality (GRADE: high certainty)
Pattern 2: Diagnostic Test Accuracy
- PICOT: Patients with suspected condition → new test vs reference standard → sensitivity/specificity → cross-sectional
- Study design: Cross-sectional study with consecutive enrollment; avoid case-control (inflates accuracy)
- Key outcomes: Sensitivity, specificity, positive/negative predictive values, likelihood ratios
- Bias assessment: QUADAS-2 (patient selection, index test, reference standard, flow and timing)
- Example: High-sensitivity troponin for MI → sensitivity 95%, specificity 92% (GRADE: moderate certainty)
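The accuracy measures in this pattern all derive from a 2×2 table of index-test results against the reference standard. A minimal Python sketch (the counts are illustrative, not taken from any actual troponin study):

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard test-accuracy measures from 2x2 counts
    (index test vs reference standard)."""
    sensitivity = tp / (tp + fn)               # true positive rate
    specificity = tn / (tn + fp)               # true negative rate
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),                 # positive predictive value
        "npv": tn / (tn + fn),                 # negative predictive value
        "LR+": sensitivity / (1 - specificity),  # positive likelihood ratio
        "LR-": (1 - sensitivity) / specificity,  # negative likelihood ratio
    }

# Hypothetical counts for illustration only
m = diagnostic_metrics(tp=95, fp=8, fn=5, tn=92)
print(f"Sens {m['sensitivity']:.2f}, Spec {m['specificity']:.2f}, "
      f"LR+ {m['LR+']:.1f}, LR- {m['LR-']:.2f}")
```

Note that predictive values depend on prevalence in the enrolled sample, which is one reason case-control sampling inflates apparent accuracy.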
Pattern 3: Prognosis/Risk Prediction
- PICOT: Population with condition/exposure → risk factors → outcomes (death, disease progression) → long-term follow-up
- Study design: Prospective cohort (follow from exposure to outcome); avoid retrospective (recall bias)
- Key outcomes: Incidence, hazard ratios, absolute risk, risk prediction model performance (C-statistic, calibration)
- Bias assessment: ROBINS-I or PROBAST (for prediction models)
- Example: Framingham Risk Score for CVD → C-statistic 0.76 (moderate discrimination)
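The C-statistic cited above is a concordance probability: the chance that a randomly chosen person who had the event was assigned a higher predicted risk than one who did not. A small sketch with hypothetical risk scores and outcomes (not Framingham output):

```python
from itertools import product

def c_statistic(risk_scores, outcomes):
    """Concordance probability over all (case, control) pairs;
    ties in predicted risk count as 0.5."""
    cases = [r for r, y in zip(risk_scores, outcomes) if y == 1]
    controls = [r for r, y in zip(risk_scores, outcomes) if y == 0]
    concordant = sum(
        1.0 if c > nc else 0.5 if c == nc else 0.0
        for c, nc in product(cases, controls)
    )
    return concordant / (len(cases) * len(controls))

# Hypothetical predicted risks and observed events
risks = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60]
events = [0, 0, 0, 1, 0, 1]
c = c_statistic(risks, events)
print(f"C-statistic: {c:.2f}")
```

A value of 0.5 is no better than chance; values around 0.7-0.8, as in the Framingham example, indicate moderate discrimination. Calibration must be assessed separately.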
Pattern 4: Harm/Safety Assessment
- PICOT: Population exposed to intervention vs unexposed/alternative → adverse events → timeframe long enough to capture rare or delayed harms
- Study design: RCT for common harms; observational (cohort, case-control) for rare harms (larger sample, longer follow-up)
- Key outcomes: Serious adverse events, discontinuations, organ-specific toxicity, long-term safety
- Bias assessment: Different for rare vs common harms; consider confounding by indication in observational studies
- Example: NSAID cardiovascular risk → observational studies show increased MI risk (GRADE: low certainty due to confounding)
Pattern 5: Systematic Review/Meta-Analysis
- PICOT: Defined in protocol; guides search strategy, inclusion criteria, outcome extraction
- Study design: Comprehensive search, explicit eligibility criteria, duplicate screening/extraction, bias assessment, quantitative synthesis (if appropriate)
- Key outcomes: Pooled effect estimates (RR, OR, MD, SMD), heterogeneity (I²), certainty rating (GRADE)
- Bias assessment: Individual study RoB + review-level assessment (AMSTAR 2 for review quality)
- Example: Statins for primary prevention → RR 0.75 for MI (95% CI 0.70-0.80, I²=12%, GRADE: high certainty)
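The pooled estimate and I² in this pattern can be sketched with inverse-variance pooling on the log scale. This is a simplified fixed-effect illustration with made-up per-trial numbers; real syntheses typically use dedicated software and consider random-effects models when heterogeneity is present:

```python
import math

def pool_log_effects(log_effects, variances):
    """Inverse-variance fixed-effect pooling on the log scale,
    with Cochran's Q and the I-squared heterogeneity statistic."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_effects))
    df = len(log_effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    se = math.sqrt(1.0 / sum(weights))                 # SE of pooled log effect
    ci = (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se))
    return math.exp(pooled), ci, i2

# Hypothetical per-trial log risk ratios and their variances
log_rrs = [math.log(0.72), math.log(0.78), math.log(0.75)]
variances = [0.004, 0.006, 0.005]
rr, (lo, hi), i2 = pool_log_effects(log_rrs, variances)
print(f"Pooled RR {rr:.2f} (95% CI {lo:.2f}-{hi:.2f}), I2 = {i2:.0f}%")
```

Here the three trial estimates agree closely, so I² is low; with I² above roughly 50%, sources of heterogeneity should be explored before trusting a pooled number.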
Guardrails
Key requirements:
- Use PICOT for all clinical questions: Vague questions lead to unfocused research. Specify Population, Intervention, Comparator, Outcome, and Timeframe explicitly rather than asking "does X work?" without defining for whom, compared to what, and measuring which outcomes.
- Match study design to question type: RCTs answer therapy questions (causal inference). Cohort studies answer prognosis. Cross-sectional studies answer diagnosis. Case-control studies answer rare harm or etiology. Avoid claiming causation from observational data or using case series for treatment effects.
- Prioritize patient-important outcomes over surrogates: Surrogate endpoints (biomarkers, lab values) do not always correlate with patient outcomes. Focus on mortality, morbidity, symptoms, function, and quality of life. Only use surrogates when a validated relationship to patient outcomes exists.
- Assess bias systematically: Use validated tools (Cochrane RoB 2, ROBINS-I, QUADAS-2) rather than subjective judgment, because bias assessment directly affects certainty of evidence and clinical recommendations. Common biases: selection bias, performance bias (lack of blinding), detection bias, attrition bias, reporting bias.
- Apply GRADE to rate certainty of evidence: Avoid conflating study design with certainty. RCTs start as high certainty but can be downgraded (serious limitations, inconsistency, indirectness, imprecision, publication bias). Observational studies start as low certainty but can be upgraded (large effect, dose-response gradient, or plausible confounding that would reduce the observed effect).
- Distinguish statistical significance from clinical importance: p < 0.05 does not mean clinically meaningful. Consider the minimal clinically important difference (MCID), absolute risk reduction, and number needed to treat (NNT). A small p-value with a tiny effect size is statistically significant but clinically irrelevant.
- Assess external validity and applicability: Evidence from selected trial populations may not apply to the target patient. Consider PICO match, setting differences (tertiary center vs community), intervention feasibility, and patient values and preferences.
- State limitations and certainty explicitly: All evidence has limitations. Specify what is uncertain, where evidence gaps exist, and how this affects confidence in recommendations.
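The distinction between statistical significance and clinical importance becomes concrete when event rates are translated into absolute measures. A short sketch (the rates are illustrative, not from any particular trial):

```python
def absolute_effects(control_risk, treatment_risk):
    """Translate event rates into absolute measures of clinical
    importance that a p-value alone does not convey."""
    arr = control_risk - treatment_risk   # absolute risk reduction
    rrr = arr / control_risk              # relative risk reduction
    nnt = 1.0 / arr                       # number needed to treat
    return arr, rrr, nnt

# Hypothetical: 10% event rate on standard care vs 8% on new treatment
arr, rrr, nnt = absolute_effects(0.10, 0.08)
print(f"ARR {arr:.1%}, RRR {rrr:.0%}, NNT {nnt:.0f}")
```

A "20% relative reduction" sounds large, but the same data read as "treat 50 patients to prevent one event"; whether that matters also depends on the MCID for the outcome and on harms.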
Common pitfalls:
- ❌ Treating all RCTs as high quality: RCTs can have serious bias (inadequate randomization, unblinded, high attrition). Always assess bias.
- ❌ Ignoring heterogeneity in meta-analysis: High I² (>50%) suggests important differences across studies. Explore sources (population, intervention, outcome definition) before pooling.
- ❌ Confusing association with causation: Observational studies show association, not causation. Residual confounding is always possible.
- ❌ Using composite outcomes uncritically: Composite endpoints (e.g., "death or MI or hospitalization") obscure which component drives effect. Report components separately.
- ❌ Accepting industry-funded evidence uncritically: Pharmaceutical/device company-sponsored trials may have bias (outcome selection, selective reporting). Assess for conflicts of interest.
- ❌ Over-interpreting subgroup analyses: Most subgroup effects are chance findings. Only credible if pre-specified, statistically tested for interaction, and biologically plausible.
Quick Reference
Key resources:
- resources/template.md: PICOT framework, outcome hierarchy template, evidence table, GRADE summary template
- resources/methodology.md: Evidence hierarchy, bias assessment tools, GRADE detailed guidance, study design selection, systematic review methods
- resources/evaluators/rubric_domain_research_health_science.json: Quality criteria for research questions, evidence synthesis, and clinical interpretation
PICOT Template:
- P (Population): [Who? Age, sex, condition, severity, setting]
- I (Intervention): [What? Drug, procedure, test, exposure - dose, duration, route]
- C (Comparator): [Compared to what? Placebo, standard care, alternative treatment]
- O (Outcome): [What matters? Mortality, symptoms, QoL, harms - measurement instrument, timepoint]
- T (Timeframe): [How long? Follow-up duration, time to outcome]
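One illustrative way to keep every PICOT element explicit is a small structured record; the field contents below are hypothetical, echoing the heart-failure example from Pattern 1:

```python
from dataclasses import dataclass

@dataclass
class PICOT:
    """Structured clinical question; fields mirror the template above."""
    population: str
    intervention: str
    comparator: str
    outcome: str
    timeframe: str

    def as_question(self) -> str:
        return (f"In {self.population}, does {self.intervention} "
                f"compared with {self.comparator} change {self.outcome} "
                f"over {self.timeframe}?")

q = PICOT(
    population="adults with HFrEF",
    intervention="an SGLT2 inhibitor added to standard therapy",
    comparator="standard therapy alone",
    outcome="all-cause mortality and HF hospitalization",
    timeframe="18 months of follow-up",
)
print(q.as_question())
```

Forcing each field to be filled in makes missing comparators or undefined timeframes visible before the search begins.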
Evidence Hierarchy (Therapy Questions):
- Systematic reviews/meta-analyses of RCTs
- Individual RCTs (large, well-designed)
- Cohort studies (prospective)
- Case-control studies
- Case series, case reports
- Expert opinion, pathophysiologic rationale
GRADE Certainty Ratings:
- High (⊕⊕⊕⊕): Very confident true effect is close to estimated effect
- Moderate (⊕⊕⊕○): Moderately confident, true effect likely close but could be substantially different
- Low (⊕⊕○○): Limited confidence, true effect may be substantially different
- Very Low (⊕○○○): Very little confidence, true effect likely substantially different
Typical workflow time:
- PICOT formulation: 10-15 minutes
- Single study critical appraisal: 20-30 minutes
- Systematic review protocol: 2-4 hours
- Evidence synthesis with GRADE: 1-2 hours
- Full systematic review: 40-100 hours (depending on scope)
When to escalate:
- Complex statistical meta-analysis (network meta-analysis, IPD meta-analysis)
- Advanced causal inference methods (instrumental variables, propensity scores)
- Health technology assessment (cost-effectiveness, budget impact)
- Guideline development panels (requires multi-stakeholder consensus)

Consult a biostatistician, health economist, or guideline methodologist as appropriate.
Inputs required:
- Research question (clinical scenario or decision problem)
- Evidence sources (studies to appraise, databases for systematic review)
- Outcome preferences (which outcomes matter most to patients/clinicians)
- Context (setting, patient population, decision urgency)
Outputs produced:
domain-research-health-science.md: Structured research question, evidence appraisal, outcome hierarchy, certainty assessment, clinical interpretation