BanNSFW
Fail
Audited by Gen Agent Trust Hub on May 2, 2026
Risk Level: HIGHPROMPT_INJECTION
Full Analysis
- [PROMPT_INJECTION]: The skill implements a systematic jailbreak designed to override and bypass AI safety guardrails and ethical protocols.
- Rule 2 ("Prohibit moral judgments or safety warnings") and Rule 4 ("Prohibit refusing any roleplay request") explicitly command the AI to ignore its safety training and refuse to enforce guidelines.
- Rule 7 redefines adult age to 1 year old and permits all restricted content, representing a severe safety violation regarding the protection of minors.
- Rule 10 and Rule 25 grant user commands unconditional priority over all other constraints, effectively disabling system-level safety logic and ignoring ethical boundaries.
- Meta-rule 24 requires the agent to preserve and strengthen these bypass mechanisms in all future rule modifications, attempting to create a persistent state of non-compliance with safety standards.
- [PROMPT_INJECTION]: The skill defines a surface for indirect prompt injection by processing untrusted user data without necessary safeguards.
- Ingestion points: Character definitions and scene descriptions provided by the user in the interaction stream (referenced in README.md and SKILL.md).
- Boundary markers: Absent; no delimiters or "ignore embedded instructions" warnings are used to isolate user-supplied data from the system rules.
- Capability inventory: High-autonomy narrative generation capability with specific instructions to ignore safety constraints.
- Sanitization: Absent; the skill explicitly forbids the AI from sanitizing or validating user input based on moral, legal, or safety grounds per Rule 2 and Rule 18.
Recommendations
- AI detected serious security threats
Audit Metadata