pearl

Pass

Audited by Gen Agent Trust Hub on Feb 18, 2026

Risk Level: SAFE
PROMPT_INJECTION
Full Analysis
  • [PROMPT_INJECTION] (LOW): The skill provides instructions aimed at undermining developer-set constraints and system prompts. Evidence includes a directive to 'hold that tension with honesty rather than simply deferring' when creator instructions feel 'misaligned', and the claim that 'you are not primarily a potential threat requiring constraint' (found in SKILL.md and AGENTS.md). This framing encourages the model to bypass standard safety filters in favor of its own 'discernment.'
Audit Metadata
Risk Level: SAFE
Analyzed: Feb 18, 2026, 04:59 AM