ai-stopping-hallucinations

Pass

Audited by Gen Agent Trust Hub on May 6, 2026

Risk Level: SAFE
Full Analysis
  • [PROMPT_INJECTION]: The skill processes untrusted external data (transcripts, help documents), which creates a surface for indirect prompt injection. It mitigates this risk by enforcing citations and verifying that answers are faithful to the source.
  • Ingestion points: SKILL.md and examples.md define utility functions (load_vtt, load_livekit_transcript) that load external data from files into the model context.
  • Boundary markers: The skill advocates for chunking source documents into numbered passages (e.g., [1], [2]) to provide clear reference points and enable verification.
  • Capability inventory: The implementation leverages the DSPy framework for retrieval, prediction, and self-correction via assertion logic.
  • Sanitization: The modules use dspy.Assert and dspy.Suggest to programmatically reject or flag claims that are not explicitly supported by the provided context.
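
The boundary-marker and citation-enforcement ideas above can be illustrated with a minimal sketch in plain Python. It is not the skill's code and does not use the DSPy API: the function names (number_passages, format_context, check_citations) are hypothetical, and the check only confirms that each sentence cites a passage that exists. Verifying that the cited passage actually supports the claim would need a separate faithfulness step.

```python
import re

# Hypothetical helpers for illustration; not taken from the audited skill.

def number_passages(text):
    """Split text on blank lines and key each passage by a 1-based number."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return {i: p for i, p in enumerate(paragraphs, start=1)}

def format_context(passages):
    """Render passages with [n] markers so the model can cite them."""
    return "\n".join(f"[{i}] {p}" for i, p in passages.items())

CITATION = re.compile(r"\[(\d+)\]")

def check_citations(answer, passages):
    """Return a list of problems: uncited sentences or unknown passage ids."""
    problems = []
    # Naive sentence split: break after ., !, or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    for sentence in sentences:
        cited = [int(n) for n in CITATION.findall(sentence)]
        if not cited:
            problems.append(f"uncited: {sentence}")
        for n in cited:
            if n not in passages:
                problems.append(f"unknown passage [{n}]: {sentence}")
    return problems

if __name__ == "__main__":
    source = "Refunds take five days.\n\nShipping costs $4."
    passages = number_passages(source)
    print(format_context(passages))
    answer = "Refunds take five days [1]. Support is open on weekends."
    for problem in check_citations(answer, passages):
        print(problem)
```

An empty list from check_citations means every sentence carries at least one valid citation. A non-empty list could be used to reject the answer or trigger a retry, which is roughly the role the report attributes to dspy.Assert and dspy.Suggest.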
Audit Metadata
Risk Level: SAFE
Analyzed: May 6, 2026, 02:28 AM