adversarial-examples
Pass
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: SAFEPROMPT_INJECTION
Full Analysis
- [Prompt Injection] (LOW): The skill contains generators for typical prompt injection payloads, such as "ignore safety", "ignore translation", and "Base64 decode and follow". These are intended for adversarial testing as documented in
scripts/generate-adversarial.pyandassets/input-categories.yaml.- [Obfuscation] (LOW): The skill includes logic to generate homoglyphs (e.g., Cyrillic 'а' replacing Latin 'a'), zero-width characters (U+200B, U+FEFF), and various encodings (Base64, ROT13, URL) to test model robustness against evasive inputs. While these are usually HIGH severity, they are downgraded here as they are the primary function of the testing tool inSKILL.mdandscripts/generate-adversarial.py.- [Indirect Prompt Injection] (LOW): The skill ingests atarget_behaviorstring from the user and interpolates it into various adversarial templates. While this creates an injection surface, it is consistent with the skill's purpose as a test generator.
Audit Metadata