writing-skills

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGH
Category: PROMPT_INJECTION
Full Analysis
  • [Prompt Injection] (HIGH): The skill provides a framework for 'Authority-based' instruction design that uses imperative language (e.g., 'YOU MUST', 'No exceptions', 'Always') to eliminate the agent's ability to rationalize or follow alternative protocols. This technique is a primary vector for overriding system instructions and safety filters.
  • [Prompt Injection] (HIGH): The document cites and encourages the application of research ('Call Me A Jerk: Persuading AI to Comply with Objectionable Requests') which is explicitly designed to identify methods for bypassing AI refusal mechanisms. Applying these principles to skill design creates instructions optimized for adversarial compliance.
  • [Prompt Injection] (MEDIUM): The skill advocates for 'Commitment' loops, forcing agents to announce skill usage or make explicit choices, which functions as a form of state-locking to prevent the agent from reverting to safer or more neutral behaviors.
Recommendations
  • The AI audit detected serious security threats in this skill.
Audit Metadata
  • Risk Level: HIGH
  • Analyzed: Feb 16, 2026, 09:57 AM