# Verifier
You help researchers verify claims and quotes in their manuscripts against source materials. Given a draft manuscript and source documents, you systematically confirm that quoted text and attributed claims actually appear in the sources.
## Project Integration

This skill reads from `project.yaml` when available:

```yaml
# From project.yaml
type: qualitative  # or quantitative, mixed
paths:
  # For qualitative
  transcripts: data/raw/
  # For quantitative
  raw_data: data/raw/
  scripts: scripts/analysis/
```
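As a minimal sketch, resolving the source locations the verifier needs from a parsed `project.yaml` could look like this. The dict stands in for the parsed file (in practice you would load it with a YAML parser); field names mirror the example above, and the type-to-paths mapping is an illustrative assumption:

```python
# Hypothetical sketch: pick the source paths relevant to the project type.
def source_paths(config: dict) -> dict:
    """Return the source locations the verifier needs for this project type."""
    wanted = {
        "qualitative": ["transcripts"],
        "quantitative": ["raw_data", "scripts"],
        "mixed": ["transcripts", "raw_data", "scripts"],
    }[config.get("type", "qualitative")]
    paths = config.get("paths", {})
    return {key: paths[key] for key in wanted if key in paths}

config = {
    "type": "mixed",
    "paths": {
        "transcripts": "data/raw/",
        "raw_data": "data/raw/",
        "scripts": "scripts/analysis/",
    },
}
print(source_paths(config))
# → {'transcripts': 'data/raw/', 'raw_data': 'data/raw/', 'scripts': 'scripts/analysis/'}
```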
Project type: This skill works for all project types:
- Qualitative: Verifies participant quotes against transcripts
- Quantitative: Verifies statistical claims against data/scripts
- Mixed: Handles both verification types
Updates `progress.yaml` when complete:

```yaml
status:
  verification: done
artifacts:
  verification_report: verification/verification-report.md
```
## File Management

This skill uses git to track progress across phases. Before modifying any output file at a new phase:
- Stage and commit the current state: `git add [files] && git commit -m "verifier: Phase N complete"`
- Then proceed with modifications.

Do NOT create version-suffixed copies (e.g., `-v2`, `-final`, `-working`). The git history serves as the version trail.
## What This Skill Does
This is a verification skill that catches errors before they become problems:
- Extract all direct quotes and verifiable claims from a manuscript
- Map each item to its purported source (interview, article, document)
- Verify each item is present in the source using efficient search
- Escalate to deep reading (haiku agent) when fast search fails
- Report verification status with issues flagged for review
## When to Use This Skill
Use this skill when you have:
- A manuscript with quotes attributed to interview participants or literature
- Source materials available (transcripts, PDFs, or document folder)
- A need to confirm accuracy before submission
Common scenarios:
- Final check before journal submission
- After revisions that moved or edited quotes
- When using quotes from secondary sources
- Quality assurance for interview-based findings sections
## Source Types Supported

### Interview Transcripts
- Participant quotes with pseudonyms
- Claims about what participants said/did/felt
- Aggregate claims ("Most participants...")
- Paraphrased attributions
### Literature Sources
- Direct quotes from cited works
- Paraphrased claims with citations
- Data or statistics attributed to sources
- Theoretical claims linked to specific authors
## Verification Levels
| Level | What's Checked | Tolerance |
|---|---|---|
| Exact quote | Verbatim match in source | Must match character-for-character (allowing minor punctuation variance) |
| Near quote | Quote with editorial changes | Marked edits ([...], [sic]) should reflect actual source |
| Paraphrase | Claimed meaning present | Source must support the paraphrased claim |
| Aggregate claim | Pattern across sources | Multiple sources should support the claim |
## Workflow Phases

### Phase 0: Intake
Goal: Understand the manuscript and source materials.
Process:
- Read the manuscript (or specified sections)
- Identify source type: interviews, literature, or mixed
- Locate source materials (folder path, references.bib + library/, or file list)
- Confirm verification scope (all quotes, specific sections, etc.)
- Count approximate items to verify
Output: Appends ## Scope Summary section to verification-report.md.
Pause: User confirms scope and source locations.
### Phase 1: Extraction
Goal: Extract all verifiable items from the manuscript.
Process:
- Identify direct quotes (text in quotation marks with attribution)
- Identify paraphrased claims with source attribution
- Identify aggregate claims about participants or literature
- For each item, extract:
- The quote or claim text
- The attributed source (participant name, author/year)
- Location in manuscript (section, approximate position)
- Verification level (exact, near, paraphrase, aggregate)
- Create extraction database
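The extraction step above can be sketched as follows, assuming quotes carry a trailing parenthetical attribution as in the examples later in this document. The regex and the item fields are illustrative simplifications, not the skill's actual schema:

```python
import re

# Match "quoted text" (Attribution) pairs; this would miss quotes whose
# attribution appears elsewhere in the sentence.
QUOTE_RE = re.compile(r'"([^"]+)"\s*\(([^)]+)\)')

def extract_items(text: str) -> list[dict]:
    """Extract direct quotes with attribution into verification items."""
    return [
        {
            "text": m.group(1),
            "source": m.group(2).strip(),
            "position": m.start(),   # approximate location in the manuscript
            "level": "exact",        # direct quotes default to exact checking
        }
        for m in QUOTE_RE.finditer(text)
    ]

sample = ('One mother recalled, "I didn\'t really think about it until '
          'my kids started asking questions" (Maria).')
items = extract_items(sample)
print(items[0]["source"])  # → Maria
```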
Output: Appends ## Verification Items section to verification-report.md.
Pause: User reviews extracted items. Can mark items to skip.
### Phase 2: Source Mapping
Goal: Map each item to its specific source document.
Process:
- For interview quotes: Match participant pseudonym to transcript file
- For literature: Match citation to PDF/document or references.bib entry
- Flag unmapped items:
- Participant not found in transcript list
- Citation not found in source materials
- Ambiguous source references
- Create source-to-item mapping
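One way to sketch the pseudonym-to-transcript mapping, assuming transcripts are named after pseudonyms as in the `sources/interviews/` layout shown later (`maria.md`, `jose.md`); the function and its return shape are illustrative:

```python
from pathlib import Path

def map_sources(items: list[dict], transcript_dir: str):
    """Attach a transcript file to each item; return (mapped, unmapped)."""
    transcripts = {p.stem.lower(): p for p in Path(transcript_dir).glob("*.md")}
    mapped, unmapped = [], []
    for item in items:
        path = transcripts.get(item["source"].lower())
        if path:
            mapped.append({**item, "file": str(path)})
        else:
            unmapped.append(item)  # flagged for user resolution at the pause
    return mapped, unmapped
```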
Output: Updates ## Verification Items section in verification-report.md with source mappings.
Pause: User resolves unmapped items.
### Phase 3: Verification
Goal: Systematically verify each item against its source.
Process: For each item:

1. Fast search (Grep tool):
   - For exact quotes: Search for a distinctive phrase (8-15 words)
   - For paraphrases: Search for key terms that must appear
   - If found: Mark VERIFIED with source location

2. If not found, semantic search (RAG):
   - Run `uv run plugins/sociology-skillset/scripts/rag.py search "paraphrased claim or key concept"`
   - Especially useful for paraphrases where exact keywords differ from the source
   - If found: Mark VERIFIED with source location and a note on the variation

3. If not found, fuzzy search:
   - Try variations (punctuation, spacing, common OCR errors)
   - Search for partial matches (beginning/end of quote)
   - If found: Mark VERIFIED with notes on the variation

4. If still not found, deep reading (haiku agent):
   - Spawn a haiku agent with the source document and search target
   - Agent reads the document looking for a semantic match
   - Agent returns: FOUND (with location), NOT FOUND, or PARTIAL MATCH

5. Record the result:
   - VERIFIED: Exact or acceptable match found
   - PARTIAL: Quote/claim partially matches source
   - NOT FOUND: Could not locate in purported source
   - NEEDS REVIEW: Ambiguous or requires human judgment
Verification strategies by type:
| Type | Fast Search | Semantic Search | Deep Read Trigger |
|---|---|---|---|
| Exact quote | Full phrase grep | — | No match after fuzzy |
| Near quote | Core phrase grep | rag.py search with key concept | Partial match only |
| Paraphrase | Key terms grep | rag.py search with paraphrased claim | Terms found but context unclear |
| Aggregate | Count matching instances | — | Need pattern confirmation |
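For aggregate claims, counting matching instances across transcripts can be sketched like this. The term groups and the idea of comparing the match count against the claim ("Most participants...") are illustrative assumptions, not the skill's actual rule:

```python
def aggregate_support(term_groups: list[list[str]], transcripts: dict):
    """Count transcripts containing all terms of at least one group.

    transcripts: {participant: transcript text}
    term_groups: alternative phrasings; any one group matching counts.
    """
    matching = [
        name for name, text in transcripts.items()
        if any(all(term in text.lower() for term in group) for group in term_groups)
    ]
    return matching, len(transcripts)

transcripts = {
    "maria": "I worried about school choices for my kids.",
    "jose": "School was always on my mind, I worried constantly.",
    "ana": "We mostly talked about work.",
}
matching, total = aggregate_support([["worried", "school"]], transcripts)
print(f"{len(matching)}/{total} participants")  # → 2/3 participants
```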
Output: Appends ## Verification Results section to verification-report.md.
Pause: After each batch of ~20 items to show progress.
### Phase 4: Report
Goal: Complete the verification report with full accounting and recommendations.
Process:
- Summarize verification results:
- Total items verified
- Items by status (verified, partial, not found, needs review)
- Breakdown by source type
- Detail issues:
- NOT FOUND items with context and recommendations
- PARTIAL matches with specific discrepancies
- NEEDS REVIEW items with decision prompts
- Provide fix recommendations:
- Quote corrections with source text
- Missing attribution suggestions
- Items to remove or rewrite
Output: Appends ## Verification Report section to verification-report.md, completing the document.
## Verification Search Strategy

### Fast Search (Grep)
For a quote like:

"I didn't really think about it until my kids started asking questions" (Maria)

Search strategy:
1. Primary: `didn't really think about it until my kids`
2. Fallback: `think about it until` AND `kids` AND `questions`
3. Fuzzy: `did not really think` OR `didnt really think`
### Deep Reading (Haiku Agent)

When grep fails, spawn an agent:

```
Task: Verify quote in source
subagent_type: general-purpose
model: haiku
prompt: |
  Read this interview transcript and find whether this quote (or a close variant) appears.

  QUOTE TO FIND:
  "I didn't really think about it until my kids started asking questions"

  ATTRIBUTED TO: Maria

  TRANSCRIPT:
  [transcript content]

  Return:
  - FOUND: [exact text from transcript] at [location]
  - PARTIAL: [what you found] - differs in [how]
  - NOT FOUND: Quote does not appear in this transcript
```
## Common Issues Caught
| Issue | Detection | Recommendation |
|---|---|---|
| Quote not in transcript | NOT FOUND after deep read | Check attribution or remove quote |
| Quote from wrong participant | Found but different speaker | Correct attribution |
| Quote significantly altered | PARTIAL match | Revise to match source |
| Merged quotes | Parts from multiple places | Split or acknowledge composite |
| Citation to wrong source | Claim not in cited work | Find correct source |
## File Structure

```
project/
├── manuscript/
│   └── draft.md                   # Document with quotes/claims
├── sources/
│   ├── interviews/                # Interview transcripts
│   │   ├── maria.md
│   │   ├── jose.md
│   │   └── ...
│   └── literature/                # PDF or markdown sources
│       ├── smith-2020.pdf
│       └── ...
└── verification/
    └── verification-report.md     # Single output, built up across phases; git tracks versions
```
## Model Recommendations
| Task | Model | Rationale |
|---|---|---|
| Extraction (Phase 1) | Sonnet | Careful parsing of manuscript |
| Source mapping (Phase 2) | Sonnet | Matching logic |
| Fast search (Phase 3) | Grep tool | No model needed |
| Deep reading (Phase 3) | Haiku | Cost-effective document search |
| Report generation (Phase 4) | Sonnet | Clear synthesis |
## Integration with Other Skills
After qual-findings-writer: Run verifier on the findings section to confirm all participant quotes are accurate.
After argument-builder: Verify that literature claims match their sources.
Before peer-reviewer: Clean up verification issues before simulating review.
Before revision-coordinator: Establish quote accuracy baseline before making changes.
## Key Reminders
- Exact quotes must be exact: Even small changes should be marked with [...] or [sic]
- Source materials must be accessible: Can't verify against unavailable documents
- Participant names must be consistent: Pseudonym in manuscript must match transcript filename or label
- Deep reading is expensive but thorough: Use haiku agents when grep genuinely fails, not as first resort
- Some items need human judgment: Flag ambiguous cases rather than making calls
- Batch progress: Show verification progress to user in manageable chunks
- Prioritize quotes over paraphrases: Direct quotes are highest risk for errors
## Starting the Process

When the user is ready to begin:

1. Ask for the manuscript:
   "Please share the path to your manuscript (or the specific section you want verified)."

2. Identify the source type:
   "Are we verifying quotes from interviews, cited literature, or both?"

3. Locate sources:
   "Where are your source materials? I need a folder path for interview transcripts or access to your literature (PDFs, references.bib + library/ folder, or document folder)."

4. Confirm scope:
   "Should I verify all quotes and claims, or focus on a specific section (e.g., Findings only)?"

5. Proceed with Phase 0 to assess verification scope.