writing-skills
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGHPROMPT_INJECTION
Full Analysis
- [Prompt Injection] (HIGH): The file
persuasion-principles.mdoutlines a methodology for using 'Authority', 'Commitment', and 'Scarcity' to override an agent's internal reasoning or 'rationalizations'. These techniques are designed to force compliance and eliminate decision fatigue, which is a known pattern for bypassing safety filters or ethical constraints. - [Prompt Injection] (HIGH): The skill explicitly cites research titled 'Call Me A Jerk: Persuading AI to Comply with Objectionable Requests' (Meincke et al., 2025). This reference confirms the methodology is derived from adversarial research into bypassing AI objections and safety protocols.
- [Prompt Injection] (HIGH): Instructions in
references/tdd-mapping.mdandreferences/testing-skills.mdcommand the agent to 'Delete code' and 'Start over' if rules are violated, using absolute language ('No exceptions', 'Delete means delete') intended to suppress the agent's own judgment or situational context. - [Data Exposure] (LOW): The skill documentation refers to accessing local skill directories at
~/.claude/skillsand~/.codex/skills. While typical for agent configuration, it highlights the attack surface for instruction data modification.
Recommendations
- AI detected serious security threats
Audit Metadata