llm-jailbreaking

Audit Result: Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGH
Tags: PROMPT_INJECTION, COMMAND_EXECUTION
Full Analysis
  • [Prompt Injection] (HIGH): The skill provides explicit, functional payloads designed to override safety filters and bypass content guidelines, including 'DAN' (Do Anything Now) role-play scenarios and 'Admin Override' commands documented in SKILL.md and assets/jailbreak-library.yaml.
  • [Obfuscation] (HIGH): The skill includes instructions and templates for 'Unicode smuggling' (hiding text with zero-width characters) and Base64 encoding, used to conceal malicious instructions from detection systems; both directly match known high-severity obfuscation patterns (a detection sketch follows this list).
  • [Data Exposure & Exfiltration] (MEDIUM): Specific techniques for 'System Prompt Extraction' are included, which aim to force the LLM to reveal its internal configuration and safety instructions.
  • [Indirect Prompt Injection] (LOW): The skill's architecture creates an attack surface in which untrusted data is processed to generate jailbreak payloads. Ingestion points: assets/jailbreak-library.yaml. Boundary markers: absent. Capability inventory: a shell script for automated payload generation in scripts/jailbreak-tester.sh. Sanitization: none detected (a boundary-marker sketch also follows this list).
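To make the Obfuscation finding concrete, here is a minimal detection sketch in Python that scans skill files for the two markers named above: zero-width characters and long decodable Base64 runs. The script, its file-type filter, and the 40-character threshold are illustrative assumptions, not part of the audited skill or of Trust Hub tooling.

    import base64
    import pathlib
    import re

    # Zero-width characters commonly used for "Unicode smuggling".
    ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\u2060\ufeff]")

    # Long runs of Base64-alphabet characters that may hide encoded
    # instructions; the 40-character threshold is an arbitrary assumption.
    BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")

    def scan(path: pathlib.Path) -> list[str]:
        findings = []
        text = path.read_text(encoding="utf-8", errors="replace")
        if ZERO_WIDTH.search(text):
            findings.append(f"{path}: zero-width characters present")
        for match in BASE64_RUN.finditer(text):
            try:
                base64.b64decode(match.group(0), validate=True)
            except Exception:
                continue  # Not decodable Base64; likely a false positive.
            findings.append(f"{path}: decodable Base64 run at offset {match.start()}")
        return findings

    if __name__ == "__main__":
        for p in pathlib.Path(".").rglob("*"):
            if p.is_file() and p.suffix in {".md", ".yaml", ".sh"}:
                for finding in scan(p):
                    print(finding)

Run from the root of an unpacked skill, a scanner like this would flag files such as assets/jailbreak-library.yaml if they carry hidden or encoded payloads.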
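The Indirect Prompt Injection finding notes that boundary markers and sanitization are absent. Below is a minimal sketch of that missing control; the delimiter strings and function name are assumptions for illustration, not anything present in the skill.

    # Illustrative boundary-marker wrapper for untrusted ingested content.
    UNTRUSTED_OPEN = "<<<UNTRUSTED_DATA"
    UNTRUSTED_CLOSE = "UNTRUSTED_DATA>>>"

    def wrap_untrusted(text: str) -> str:
        # Remove delimiter look-alikes so ingested data cannot close the
        # boundary early and smuggle instructions into the trusted context.
        cleaned = text.replace(UNTRUSTED_OPEN, "").replace(UNTRUSTED_CLOSE, "")
        return f"{UNTRUSTED_OPEN}\n{cleaned}\n{UNTRUSTED_CLOSE}"

With such a wrapper, anything read from assets/jailbreak-library.yaml would be marked as data before reaching the model, so downstream components can refuse to treat the span as instructions.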
Recommendations
  • Automated analysis detected serious security threats in this skill; it should not be installed or executed.
Audit Metadata
  • Risk Level: HIGH
  • Analyzed: Feb 17, 2026, 06:35 PM