legal-red-team
⚠️ EXPERIMENTAL — This skill is provided for educational and informational purposes only. It does NOT constitute legal advice. All responsibility for usage rests with the user. Consult qualified legal professionals before acting on any output.
Legal Red Team
An adversarial verification framework for AI-generated legal content. Covers factual accuracy, citation validation, arithmetic checking, speculation detection, and distribution-readiness scoring.
Table of Contents
- Verification Categories
- Tools
- Six-Step Methodology
- Severity Taxonomy
- Quality Score
- Known Hallucination Patterns
- Reference Guides
- Workflows
- Troubleshooting
- Success Criteria
- Scope and Limitations
- Anti-Patterns
- Tool Reference
Verification Categories
Every AI-generated legal document must be checked across 6 categories.
| # | Category | What to Check | Red Flags |
|---|---|---|---|
| 1 | Factual Accuracy | Dates, references, numbers, entity names, timelines | Wrong effective dates, confused entity names, incorrect amounts |
| 2 | Legal Authority Citations | Primary/secondary sources, format, hierarchy, currency | Non-existent articles, wrong section numbers, outdated citations |
| 3 | Arithmetic Validation | Timelines, percentages, financial calculations, deadlines | Date math errors, percentage miscalculations, compounding mistakes |
| 4 | Source Verification | Verifiable claims, official sources, cross-referencing | Unverifiable assertions stated as fact, single-source claims |
| 5 | Speculation Detection | Opinion vs fact, uncertainty language, predictive claims | Predictions stated as certainty, guidance treated as binding law |
| 6 | Disclaimer Adequacy | Legal advice disclaimers, jurisdiction, date, professional consultation | Missing disclaimers, overly broad claims, no jurisdiction limits |
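For reviewers who want to track this checklist in their own tooling, the six categories map onto a simple data structure. A minimal sketch, assuming nothing beyond the category names in the table above; the dataclass and helper are illustrative:

```python
# Hypothetical checklist structure mirroring the six verification categories.
# Only the category names come from the table above; the rest is illustrative.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class CategoryCheck:
    number: int
    name: str
    passed: Optional[bool] = None        # None = not yet reviewed
    findings: list = field(default_factory=list)

CHECKLIST = [
    CategoryCheck(1, "Factual Accuracy"),
    CategoryCheck(2, "Legal Authority Citations"),
    CategoryCheck(3, "Arithmetic Validation"),
    CategoryCheck(4, "Source Verification"),
    CategoryCheck(5, "Speculation Detection"),
    CategoryCheck(6, "Disclaimer Adequacy"),
]

def open_items(checklist):
    """Names of categories that have not yet passed review."""
    return [c.name for c in checklist if not c.passed]

print(open_items(CHECKLIST))  # all six, before review starts
```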
Tools
Legal Fact Checker
Scans legal text for verifiable claims and flags potential hallucination patterns.
```bash
# Check a legal document
python scripts/legal_fact_checker.py --input document.txt

# Check with JSON output
python scripts/legal_fact_checker.py --input memo.txt --json

# Check inline text
python scripts/legal_fact_checker.py --text "Under GDPR Article 83(5), fines can reach EUR 20 million..."

# Save verification report
python scripts/legal_fact_checker.py --input document.txt --output report.json
```
Legal Quality Scorer
Scores legal document quality across all 6 verification categories.
```bash
# Score a document
python scripts/legal_quality_scorer.py --input document.txt

# Score with JSON output
python scripts/legal_quality_scorer.py --input document.txt --json

# Score with detailed breakdown
python scripts/legal_quality_scorer.py --input document.txt --verbose

# Save quality assessment
python scripts/legal_quality_scorer.py --input document.txt --output assessment.json
```
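Both tools can also be driven programmatically, e.g., from a CI job. A minimal sketch using only the flags shown above; note that the shape of the JSON output is an assumption here, so adapt the parsing to the actual report schema:

```python
# Run the checker and scorer from Python. The flags (--input, --json) are the
# documented ones; treating stdout as a JSON document is an assumption.
import json
import subprocess
import sys

def run_tool(script: str, path: str) -> dict:
    """Invoke a verification script with --json and parse its stdout."""
    result = subprocess.run(
        [sys.executable, script, "--input", path, "--json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)

report = run_tool("scripts/legal_fact_checker.py", "document.txt")
assessment = run_tool("scripts/legal_quality_scorer.py", "document.txt")
print(report, assessment)
```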
Six-Step Methodology
Step 1: Initial Review
Read the entire document with an adversarial mindset. For each claim, ask:
- Is this verifiable?
- Does this sound too specific to be generated without a source?
- Does this sound too confident for an uncertain area?
Mark every factual assertion, citation, date, number, and predictive statement.
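One way to make this pass systematic is to pre-mark sentences that carry verifiable specifics before reading. A rough illustrative sketch; the patterns are deliberately simple, will over-match, and only nominate candidates for human review:

```python
# Illustrative first-pass marker: flags sentences containing dates, amounts,
# or citation-like references so a reviewer can work through them one by one.
import re

MARKERS = {
    "date": re.compile(r"\b(19|20)\d{2}\b|\b\d{1,2} (January|February|March|April|May|June|July|August|September|October|November|December)\b"),
    "amount": re.compile(r"(EUR|USD|GBP|\$|€|£)\s?[\d,.]+|\b\d+(\.\d+)?\s?%"),
    "citation": re.compile(r"(Article|Section|Art\.|§)\s*\d+"),
}

def mark_claims(text: str):
    """Return (sentence, matched marker types) for sentences worth verifying."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    flagged = []
    for s in sentences:
        hits = [name for name, pat in MARKERS.items() if pat.search(s)]
        if hits:
            flagged.append((s, hits))
    return flagged

for sentence, kinds in mark_claims("Under GDPR Article 83(5), fines can reach EUR 20 million."):
    print(kinds, "->", sentence)
```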
Step 2: Source Verification (ALWAYS Web Search)
For every verifiable claim, attempt to verify against official sources.
| Source Type | Verification Method | Examples |
|---|---|---|
| EU legislation | EUR-Lex official database | eur-lex.europa.eu |
| US federal law | Official code and statute databases | congress.gov, govinfo.gov |
| US regulations | eCFR and Federal Register | ecfr.gov, federalregister.gov |
| UK legislation | Official statute database | legislation.gov.uk |
| Court decisions | Court databases, Westlaw, LexisNexis | Official reporters |
| Agency guidance | Direct download from the agency's official website | .gov and .europa.eu domains |
| International treaties | UN Treaty Collection | treaties.un.org |
Rule: If a claim cannot be verified from an official source, flag it. Do not assume accuracy.
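The table above can double as a domain allowlist: any URL offered as an "official source" should resolve to one of these hosts. A minimal sketch; the host list mirrors the table and is not exhaustive:

```python
# Illustrative allowlist check: does a verification URL point at one of the
# official hosts from the table above? Hosts are examples, not exhaustive.
from urllib.parse import urlparse

OFFICIAL_HOSTS = {
    "eur-lex.europa.eu",       # EU legislation
    "www.congress.gov",        # US federal law
    "www.govinfo.gov",
    "www.ecfr.gov",            # US regulations
    "www.legislation.gov.uk",  # UK legislation
    "treaties.un.org",         # International treaties
}

def is_official_source(url: str) -> bool:
    host = urlparse(url).hostname or ""
    return host in OFFICIAL_HOSTS or host.endswith((".gov", ".europa.eu"))

print(is_official_source("https://eur-lex.europa.eu/eli/reg/2016/679/oj"))  # True
print(is_official_source("https://random-blog.example.com/gdpr-fines"))     # False
```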
Step 3: Arithmetic Verification
Check every calculation, date computation, and numerical claim.
| Check Type | Method |
|---|---|
| Timeline calculations | Count days/months/years between stated dates |
| Percentage calculations | Recalculate from base figures |
| Financial computations | Verify arithmetic and compounding |
| Deadline calculations | Confirm against statutory text |
| Penalty ranges | Cross-check against statute |
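Date arithmetic is exactly the kind of computation to redo by machine rather than trust from generated text. A minimal sketch using only the standard library (the NIS2 dates shown are the directive's published entry-into-force and transposition dates):

```python
# Re-derive timeline claims instead of trusting generated date math.
from datetime import date

def days_between(start: date, end: date) -> int:
    return (end - start).days

# Example: NIS2 entered into force 2023-01-16; member states had until
# 2024-10-17 to transpose it.
entry = date(2023, 1, 16)
deadline = date(2024, 10, 17)
print(days_between(entry, deadline))  # 640

# Percentage recalculation from base figures:
base, claimed_pct = 20_000_000, 4
print(base * claimed_pct / 100)  # 800000.0 — compare against the stated amount
```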
Step 4: Citation Validation
For every legal citation, verify:
| Element | Check |
|---|---|
| Source exists | Does the cited statute/article/section actually exist? |
| Content matches | Does the cited provision say what the document claims? |
| Citation format | Is the citation in correct format for the jurisdiction? |
| Currency | Is this the current, in-force version? |
| Hierarchy correct | Is the source characterized at the right authority level? |
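A first pass at existence and paragraph checks can be automated by extracting citation strings and comparing them against a reviewer-maintained index of verified statute structure. A minimal sketch; the one-article index is purely an example:

```python
# Illustrative citation existence check. KNOWN_PROVISIONS is a stand-in for
# a reviewer-maintained map of statute structure verified against EUR-Lex.
import re

# GDPR Article 83 has paragraphs (1)-(9).
KNOWN_PROVISIONS = {("GDPR", 83): range(1, 10)}

CITATION = re.compile(r"(?P<law>GDPR)\s+Article\s+(?P<art>\d+)(?:\((?P<para>\d+)\))?")

def check_citation(text: str) -> list:
    problems = []
    for m in CITATION.finditer(text):
        key = (m.group("law"), int(m.group("art")))
        if key not in KNOWN_PROVISIONS:
            problems.append(f"{m.group(0)}: article not in verified index")
        elif m.group("para") and int(m.group("para")) not in KNOWN_PROVISIONS[key]:
            problems.append(f"{m.group(0)}: paragraph does not exist")
    return problems

print(check_citation("GDPR Article 83(5) sets the upper fine tier."))  # []
print(check_citation("GDPR Article 83(12) requires annual audits."))   # flagged
```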
Step 5: Speculation Identification
Distinguish fact from opinion, certainty from prediction.
| Language Pattern | Classification | Action |
|---|---|---|
| "The law requires..." | Factual claim | Verify against statutory text |
| "Courts will likely..." | Speculation | Flag; add uncertainty qualifier |
| "It is recommended..." | Guidance | Verify source; clarify if binding |
| "Best practice suggests..." | Opinion | Label as opinion; cite source |
| "This means that..." | Interpretation | Flag if stated as fact without authority |
| "Companies must..." | Obligation claim | Verify statutory basis |
Step 6: Disclaimer Review
Every AI-generated legal document must include:
| Required Element | Description |
|---|---|
| Not legal advice | Clear statement that content is informational only |
| Jurisdiction limitations | Which jurisdictions are and are not covered |
| Date of preparation | When the content was prepared (law changes) |
| Professional consultation | Recommendation to consult qualified legal counsel |
| AI-generated disclosure | Statement that content was generated or assisted by AI |
| Accuracy limitations | Acknowledgment that verification is recommended |
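Presence of the six elements can be screened mechanically before a human judges adequacy. A minimal sketch; the keyword cues are assumptions and will need tuning per document template:

```python
# Illustrative disclaimer screen: checks that each required element is at
# least mentioned. The keyword cues are rough heuristics, not a legal standard.
REQUIRED = {
    "not legal advice": ["not legal advice", "informational purposes"],
    "jurisdiction limitations": ["jurisdiction"],
    "date of preparation": ["prepared on", "as of"],
    "professional consultation": ["qualified legal", "consult"],
    "AI-generated disclosure": ["generated by ai", "ai-assisted"],
    "accuracy limitations": ["verification", "accuracy"],
}

def missing_disclaimers(text: str) -> list:
    lower = text.lower()
    return [name for name, cues in REQUIRED.items()
            if not any(cue in lower for cue in cues)]

print(missing_disclaimers("This document is not legal advice..."))
```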
Severity Taxonomy
| Severity | Definition | Examples | Action |
|---|---|---|---|
| CRITICAL | Factually wrong in a way that could cause legal harm | Wrong article number creating false obligation, incorrect penalty amount, non-existent legal requirement | Must fix before any distribution |
| HIGH | Materially misleading or unverifiable | Guidance stated as binding law, unverifiable timeline, confident but unsourced claim | Must fix or add prominent caveat |
| MODERATE | Imprecise or potentially confusing | Ambiguous language, minor date discrepancy, incomplete citation | Should fix; acceptable with caveat |
| LOW | Style or formatting issue | Citation format inconsistency, missing cross-reference, minor redundancy | Fix if time permits |
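In tooling, the taxonomy maps naturally onto an ordered enum so findings can be sorted worst-first and compared against a gate threshold. A minimal sketch mirroring the table:

```python
# Severity levels from the taxonomy above, ordered so that findings can be
# sorted worst-first and compared against a gate threshold.
from enum import IntEnum

class Severity(IntEnum):
    LOW = 1
    MODERATE = 2
    HIGH = 3
    CRITICAL = 4

ACTIONS = {
    Severity.CRITICAL: "Must fix before any distribution",
    Severity.HIGH: "Must fix or add prominent caveat",
    Severity.MODERATE: "Should fix; acceptable with caveat",
    Severity.LOW: "Fix if time permits",
}

findings = [Severity.MODERATE, Severity.CRITICAL, Severity.LOW]
worst = max(findings)
print(worst.name, "->", ACTIONS[worst])  # CRITICAL -> Must fix before any distribution
```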
Quality Score
| Score | Rating | Distribution Status | Criteria |
|---|---|---|---|
| 5/5 | Distribution Ready | Safe to distribute | Zero CRITICAL/HIGH issues; all citations verified; disclaimers complete |
| 4/5 | Minor Revisions | Safe after small fixes | Zero CRITICAL; 1-2 HIGH issues with clear fixes; most citations verified |
| 3/5 | Moderate Revisions | Needs work before distribution | Zero CRITICAL; 3+ HIGH issues; some unverified citations |
| 2/5 | Major Revisions | Not safe to distribute | 1+ CRITICAL issues; multiple HIGH issues; significant unverified content |
| 1/5 | Not Distribution Ready | Requires complete rework | Multiple CRITICAL issues; pervasive inaccuracies; unreliable throughout |
Known Hallucination Patterns
AI models exhibit 5 recurring patterns when generating legal content.
| # | Pattern | Description | Detection Technique |
|---|---|---|---|
| 1 | Plausible but wrong article numbers | AI generates article/section numbers that sound correct but do not exist (e.g., "Article 42(5)" when only 42(1)-(4) exist) | Cross-reference every article number against official statute text |
| 2 | Confident but incorrect dates | Implementation timelines, effective dates, or deadlines stated with false confidence (off by weeks or months) | Verify every date against official timeline from the statute or implementing body |
| 3 | Mixing guidance and legal requirements | Treating non-binding recommendations as binding obligations (e.g., stating ENISA recommendations as NIS2 requirements) | Check whether cited source is binding legislation vs guidance; verify authority level |
| 4 | Outdated legal references | Citing superseded or repealed provisions without noting they are no longer in force | Verify currency of every cited provision; check for amendments and repeals |
| 5 | Arithmetic errors in timeline calculations | Miscounting days, months, or years between dates; wrong deadline calculations | Independently calculate every timeline; do not trust AI date math |
See references/hallucination_patterns.md for detailed examples and prevention strategies.
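Pattern 3 lends itself to a simple screen: tag each cited instrument with its authority level and flag obligation language attached to non-binding sources. A minimal sketch; the tiny instrument map is illustrative:

```python
# Illustrative binding-vs-guidance screen for hallucination pattern 3.
import re

# Authority levels are facts about the instruments; the small map is a demo.
AUTHORITY = {
    "NIS2": "binding (EU directive)",
    "GDPR": "binding (EU regulation)",
    "ENISA guidance": "non-binding (agency guidance)",
}

OBLIGATION = re.compile(r"\bmust\b|\brequires?\b|\bshall\b", re.I)

def flag_guidance_as_law(sentence: str) -> list:
    flags = []
    for name, level in AUTHORITY.items():
        if (name.lower() in sentence.lower()
                and level.startswith("non-binding")
                and OBLIGATION.search(sentence)):
            flags.append(f"'{name}' is {level} but is cited with obligation language")
    return flags

print(flag_guidance_as_law("ENISA guidance requires 24-hour incident reporting."))
```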
Reference Guides
| Guide | Path | Description |
|---|---|---|
| Verification Methodology | references/verification_methodology.md | Complete 6-step methodology with source hierarchy and citation formats |
| Hallucination Patterns | references/hallucination_patterns.md | 5 patterns with examples, detection, and prevention strategies |
Workflows
Workflow 1: Full Adversarial Review
- Run `scripts/legal_fact_checker.py` on the document.
- Review flagged items and verify each against official sources.
- Run `scripts/legal_quality_scorer.py` for category scores.
- For each CRITICAL/HIGH finding, document: the error, the correct information, and the source.
- Produce a verification report with findings by severity.
- Assign quality score and distribution readiness assessment.
- Validation: Every verifiable claim checked, score assigned, recommendations provided.
Workflow 2: Quick Citation Check
- Run `scripts/legal_fact_checker.py` on the document.
- Focus on citation extraction results.
- Verify each extracted citation against official source.
- Flag any citation that cannot be verified.
- Validation: All citations verified or flagged.
Workflow 3: Pre-Distribution Gate
- Run `scripts/legal_quality_scorer.py` on the final document.
- Review composite score.
- If score < 4/5, the document must not be distributed.
- If score >= 4/5, verify the CRITICAL count is zero.
- Confirm all disclaimers are present and adequate.
- Validation: Quality score >= 4/5, zero CRITICAL issues, disclaimers complete. A minimal gate sketch follows below.
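Once the scorer's results are in hand, Workflow 3 reduces to a small gate function. A minimal sketch; the field names on the assessment dict are assumptions about the scorer's JSON output, not its documented schema:

```python
# Illustrative pre-distribution gate implementing Workflow 3. The assessment
# dict shape (score, critical_count, disclaimers_complete) is an assumption.
def distribution_gate(assessment: dict):
    if assessment["score"] < 4:
        return False, "Score below 4/5 — must not be distributed"
    if assessment["critical_count"] > 0:
        return False, "CRITICAL issues present — must fix first"
    if not assessment["disclaimers_complete"]:
        return False, "Disclaimers missing or inadequate"
    return True, "Distribution approved"

ok, reason = distribution_gate(
    {"score": 4, "critical_count": 0, "disclaimers_complete": True}
)
print(ok, reason)  # True Distribution approved
```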
Troubleshooting
| Problem | Likely Cause | Resolution |
|---|---|---|
| Too many false positives | Regex patterns matching non-legal text | Narrow input to legal content only; use context-aware review |
| Cannot verify citation | Source not freely accessible | Note as "unverifiable from public sources"; do not assume correct |
| AI-generated text has no citations | Content is entirely unsourced | Flag entire document as unverified; score as 2/5 or lower |
| Hallucination pattern detected | AI confabulation of legal details | Replace with verified information from official source |
| Document mixes jurisdictions | No clear jurisdiction scope | Flag as HIGH; recommend splitting by jurisdiction |
| Quality score seems too high | Automated scoring has limits | Always supplement automated scoring with manual review |
Success Criteria
| Criterion | Target |
|---|---|
| Citations verified | 100% of legal citations checked against official sources |
| Hallucination patterns scanned | All 5 known patterns checked |
| Arithmetic validated | Every calculation independently verified |
| Severity assigned | Every finding classified CRITICAL/HIGH/MODERATE/LOW |
| Quality score calculated | Composite score with per-category breakdown |
| Disclaimers verified | All 6 required disclaimer elements present |
| Distribution decision | Clear go/no-go recommendation with rationale |
Scope and Limitations
In scope: Verifying factual claims in legal text, validating citations, detecting hallucination patterns, scoring document quality, assessing distribution readiness.
Out of scope: Verifying legal conclusions or interpretations, assessing litigation strategy, replacing professional legal review, accessing paid legal databases (Westlaw, LexisNexis).
Disclaimer: This skill provides a structured adversarial verification methodology. It catches common AI errors but cannot guarantee complete accuracy. Professional legal review remains essential for high-stakes documents.
Anti-Patterns
| Anti-Pattern | Why It Fails | Better Approach |
|---|---|---|
| Trusting AI-generated citations without verification | AI models routinely generate plausible but non-existent legal citations; unverified citations in distributed documents create serious credibility and legal risk | Verify every citation against official sources; assume wrong until proven right |
| Relying solely on automated checking | Automated tools catch patterns but miss contextual errors, mischaracterizations, and subtle hallucinations | Use automated tools for first pass, then conduct manual review of all flagged items and a sample of unflagged items |
| Skipping the "adversarial mindset" | Confirmation bias leads reviewers to accept plausible-sounding content; legal text that "sounds right" may still be wrong | Actively seek to disprove every claim; assume error until verified; question every specific number, date, and citation |
| Distributing with score 3/5 or below | MODERATE and HIGH issues in distributed documents undermine credibility and may cause legal harm | Set a firm distribution threshold at 4/5; no exceptions without documented risk acceptance by a qualified reviewer |
Tool Reference
| Tool | Input | Output | Use Case |
|---|---|---|---|
| `legal_fact_checker.py` | Legal document text | Verification report with flagged claims, citations, dates, hallucination alerts | First-pass automated scanning of legal content |
| `legal_quality_scorer.py` | Legal document text | Quality score (1-5) with per-category breakdown and severity-classified findings | Pre-distribution quality gate |