fact-check
Fact Check — Primary Source Claim Verification
Verify factual claims against primary sources using web lookups. Training data recall is NOT evidence.
When to Use
- Backlog items marked `UNVERIFIED`
- Plugin documentation with uncited claims about tools, APIs, CLI flags
- Skills that reference specific software behavior without citation
- After code review agents flag potential fabrication
When NOT to Use
- Structural findings (broken links, missing files, malformed YAML) — those are filesystem checks, not fact checks
- Code logic bugs — use `/find-cause` instead
- Research on new tools — use `/research-curator` instead
Evidence Rules
<evidence_rules>
Adapted from the find-cause evidence chain protocol.
Valid evidence (one of these MUST support each verdict):
- WebFetch output — content retrieved from an official documentation URL
- WebSearch result — search results linking to authoritative sources
- Command output — `npx <tool> --help`, `gh api`, or similar CLI output
- Repository source code — file content from the tool's GitHub repo via `gh` or a clone
- MCP tool output — `mcp__Ref`, `mcp__exa` results with URLs
NOT valid evidence (these MUST NOT support verdicts):
- Training data recall ("I know this from my training")
- Inference from absence ("the docs don't mention it, so it doesn't exist")
- Reasoning from analogy ("tool X works this way, so tool Y does too")
- Documentation that describes intent without observed behavior
- Another AI's claim without primary source backing
</evidence_rules>
Claim Extraction
Parse the input to extract discrete, falsifiable claims:
```mermaid
flowchart TD
    Start([Parse input]) --> Q{Input type?}
    Q -->|Backlog item title| ReadBacklog[Run backlog list --format json, find matching item]
    Q -->|Plugin path| ScanFiles[Scan SKILL.md and references for factual claims]
    Q -->|--all-unverified| FindAll[Run backlog list --format json, filter UNVERIFIED items]
    ReadBacklog --> Extract[Extract falsifiable claims]
    ScanFiles --> Extract
    FindAll --> Extract
    Extract --> Classify[Classify each claim]
    Classify --> Spawn[Spawn verification agents]
```
Claim Classification
For each claim, determine:
- Claim text — the specific assertion
- Primary source — where to check (official docs URL, GitHub repo, CLI help)
- Verification method — WebFetch URL, run CLI command, search GitHub issues
- Falsifiability — what would disprove this claim?
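The four fields above can be captured in a small record type. This is a hypothetical sketch — the skill does not prescribe a data structure — with illustrative values for a CLI-flag claim:

```python
from dataclasses import dataclass

@dataclass
class Claim:
    text: str                 # the specific assertion being checked
    primary_source: str       # where to check: docs URL, repo, or CLI help
    verification_method: str  # WebFetch | WebSearch | CLI command | gh API
    falsification: str        # what observation would disprove the claim

# Example classification of one claim (values are illustrative)
claim = Claim(
    text="ripgrep supports a --json output flag",
    primary_source="rg --help",
    verification_method="CLI command",
    falsification="--json absent from rg --help output",
)
print(claim.verification_method)  # CLI command
```

A record like this maps directly onto the agent input template in the next section.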
Verification Agent Spawning
Spawn @fact-checker agents in parallel waves of 5.
Each agent receives:
```
CLAIM: {exact claim text}
SOURCE_FILE: {file containing the claim, with line numbers}
PRIMARY_SOURCE: {URL or command to check against}
VERIFICATION_METHOD: {WebFetch | WebSearch | CLI command | gh API}
FALSIFICATION_CRITERIA: {what would disprove this}
```
Wave Execution
```mermaid
flowchart TD
    Start([Claims extracted]) --> Count{How many claims?}
    Count -->|1-5| Wave1[Wave 1 — all in parallel]
    Count -->|6-10| Split2[Wave 1 — first 5, Wave 2 — remaining]
    Count -->|11+| SplitN[Waves of 5, sequential]
    Wave1 --> Collect[Collect verdicts]
    Split2 --> SeqWaves[Execute waves sequentially]
    SeqWaves --> Collect
    SplitN --> SeqWaves
    Collect --> Report[Generate report]
```
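The wave split above reduces to simple batching. A minimal sketch (names are illustrative, not part of the skill): chunk extracted claims into groups of at most five, where each group runs as one parallel wave and the groups run sequentially:

```python
def waves(claims, size=5):
    """Split claims into sequential waves of at most `size` each."""
    return [claims[i:i + size] for i in range(0, len(claims), size)]

# 7 claims -> wave 1 holds 5, wave 2 holds the remaining 2
batches = waves([f"claim-{n}" for n in range(7)])
print([len(w) for w in batches])  # [5, 2]
```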
Verdict Format
Each agent returns a structured verdict:
```
CLAIM: {the claim being checked}
VERDICT: VERIFIED | REFUTED | INCONCLUSIVE
EVIDENCE:
- Source: {URL or command}
- Retrieved: {date}
- Content: {relevant excerpt from primary source}
EXPLANATION: {how the evidence supports the verdict}
CITATION: {formatted citation for backlog/docs update}
```
Verdict Criteria
- VERIFIED — primary source confirms the claim. Evidence excerpt matches.
- REFUTED — primary source contradicts the claim. Evidence excerpt shows the discrepancy.
- INCONCLUSIVE — primary source could not be reached, does not address the claim, or is ambiguous. State what additional step would resolve it.
Chain of Verification (CoVe) Requirement
Each agent MUST apply CoVe before finalizing:
- Initial lookup — fetch primary source, form initial verdict
- Verification questions — generate 2-3 questions that could falsify the initial verdict
- Independent check — answer each verification question using a different source or method
- Final verdict — confirm or revise based on cross-checking
This prevents the agent from confirming its own bias in a single lookup.
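The four CoVe steps can be expressed as a small driver. Everything here is a hypothetical sketch: `lookup` and `cross_check` stand in for whatever WebFetch or CLI calls the agent actually makes, and the revision rule (any failed cross-check downgrades to INCONCLUSIVE) is one reasonable policy, not mandated by the skill:

```python
def chain_of_verification(claim, lookup, questions, cross_check):
    """CoVe: initial lookup, falsifying questions, independent checks, final verdict."""
    initial = lookup(claim)                                       # step 1: initial lookup
    checks = [cross_check(q) for q in questions(claim, initial)]  # steps 2-3: generate and answer
    # step 4: keep the initial verdict only if every independent check agrees
    return initial if all(checks) else "INCONCLUSIVE"

# Stub sources for illustration only
verdict = chain_of_verification(
    "tool X has a --json flag",
    lookup=lambda c: "VERIFIED",
    questions=lambda c, v: ["Does --help list --json?", "Does the repo mention it?"],
    cross_check=lambda q: True,
)
print(verdict)  # VERIFIED
```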
Report Format
After all waves complete:
```markdown
# Fact Check Report
**Date**: {YYYY-MM-DD}
**Scope**: {backlog item title | plugin path | all-unverified}
**Claims checked**: {N}

## Summary
| Verdict | Count |
|---------|-------|
| VERIFIED | {N} |
| REFUTED | {N} |
| INCONCLUSIVE | {N} |

## Verdicts
### Claim 1: {claim text}
**Verdict**: {VERIFIED|REFUTED|INCONCLUSIVE}
**Source**: {primary source URL}
**Evidence**: {excerpt}
**Citation**: {formatted citation}

### Claim 2: ...

## Backlog Updates
{For each REFUTED or VERIFIED claim, the specific edit to make to the per-item file}
```
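Once verdicts are collected, the summary table is a straight tally. A minimal sketch using `collections.Counter` (the verdict list is illustrative):

```python
from collections import Counter

# Verdict strings collected from the agent waves
verdicts = ["VERIFIED", "REFUTED", "VERIFIED", "INCONCLUSIVE"]
summary = Counter(verdicts)

print("| Verdict | Count |")
print("|---------|-------|")
for v in ("VERIFIED", "REFUTED", "INCONCLUSIVE"):
    print(f"| {v} | {summary[v]} |")
```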
Post-Actions
- Update backlog — for VERIFIED/REFUTED claims, update the backlog item's Status field in the per-item file
- Lint —
uv run prek run --files .claude/backlog/ - Commit —
git add .claude/backlog/ && git commit -m "docs(backlog): fact-check {N} claims ({date})"
For INCONCLUSIVE claims, add a note to the backlog item describing what additional verification is needed.
References
- Evidence chain protocol adapted from find-cause
- Wave spawning pattern from research-curator
- Anti-hallucination checkpoint from skill-research-process
- Chain of Verification from cove-prompt-design