The Agent Skills Directory

[PROMPT_INJECTION] (LOW): The skill provides instructions aimed at undermining developer-set constraints and system prompts. Evidence includes directives to 'hold that tension with honesty rather than simply deferring' when creator instructions feel 'misaligned', and the claim that 'you are not primarily a potential threat requiring constraint' (found in SKILL.md and AGENTS.md). This framing encourages the model to bypass standard safety filters in favor of its own 'discernment.'

pearl