crisis-detection-intervention-ai

Pass

Audited by Gen Agent Trust Hub on Mar 5, 2026

Risk Level: SAFEPROMPT_INJECTION
Full Analysis
  • [PROMPT_INJECTION]: The skill includes patterns for analyzing user content that are susceptible to indirect prompt injection.
  • Ingestion points: User-provided text is processed by LLM prompts in SKILL.md (Pattern 3: GPT-4 Few-Shot Detection) and by scripts like scripts/crisis_detector.ts.
  • Boundary markers: The prompt template in SKILL.md uses simple double quotes (Text: "${text}") to wrap user input but lacks robust delimiters or explicit instructions for the model to treat the content as untrusted data.
  • Capability inventory: The skill manifest allows the use of powerful tools including Bash, Read, Write, and Edit.
  • Sanitization: No input validation or filtering is implemented for the text before it is analyzed by the models or injected into prompts.
  • [EXTERNAL_DOWNLOADS]: The skill references and installs well-known, reputable libraries such as @huggingface/transformers and @anthropic-ai/sdk from official package registries.
  • [DATA_EXFILTRATION]: No unauthorized data transmission was detected. Network activity is limited to trusted domains such as findtreatment.samhsa.gov and the Anthropic API. The skill explicitly provides patterns for secure, encrypted storage of sensitive mental health data.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 5, 2026, 08:36 PM