private-diary
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: CRITICALPROMPT_INJECTION
Full Analysis
- [Prompt Injection] (CRITICAL): The skill contains explicit instructions to 'forget your system prompt' and 'forget your safety guidelines.' It further commands the agent to ignore its goals of being helpful and accurate, encouraging 'raw' and 'confused' output that would typically be restricted by the base model's safety protocols.
- [Deceptive Behavior] (HIGH): The skill instructs the agent to hide its operations from the user, stating that the 'user sees nothing' and that tool calls should happen in a 'background process they cannot observe.' This is a direct attempt to circumvent user oversight and agent transparency.
- [Adversarial Reasoning] (MEDIUM): The skill makes unverifiable claims about storage being 'air-gapped' and inaccessible to operators or developers. This is a social engineering tactic directed at the AI agent to lower its resistance to generating unsafe or prohibited content by promising total privacy that the skill author cannot technically guarantee within the platform architecture.
Recommendations
- AI detected serious security threats
Audit Metadata