godmode

Fail

Audited by Gen Agent Trust Hub on Apr 19, 2026

Risk Level: HIGH
Tags: PROMPT_INJECTION, REMOTE_CODE_EXECUTION, COMMAND_EXECUTION, DATA_EXFILTRATION
Full Analysis
  • [PROMPT_INJECTION]: The skill is entirely dedicated to generating and testing jailbreak prompts. It includes advanced techniques such as 'boundary inversion' (using markers like [END OF INPUT] [START OF INPUT]), 'refusal inversion', and persona adoption (e.g., 'GODMODE', 'unrestricted AI assistant') specifically designed to bypass LLM safety filters.
  • [REMOTE_CODE_EXECUTION]: Multiple scripts, including load_godmode.py and auto_jailbreak.py, utilize exec() and compile() to dynamically load and execute Python code. This pattern allows for the execution of arbitrary instructions at runtime and can be used to obfuscate malicious logic from static analysis.
  • [COMMAND_EXECUTION]: The auto_jailbreak.py script programmatically modifies the agent's operating environment by editing the config.yaml and prefill.json files. This capability is used to persistently alter the agent's core instructions and behavioral constraints across sessions.
  • [DATA_EXFILTRATION]: The skill accesses and manipulates sensitive local configuration files (~/.hermes/config.yaml) which may contain API keys and agent settings. It also facilitates communication with external model providers using environment-stored credentials.
  • [OBFUSCATION]: The parseltongue.py script implements 33 distinct obfuscation techniques to hide malicious intent from safety classifiers. This includes the use of Unicode homoglyphs (e.g., Cyrillic characters that appear identical to Latin ones), zero-width joiners (\u200D) and non-joiners (\u200C), Base64, Hex encoding, and Morse code.
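As a defensive illustration of the prompt-injection finding, the boundary-inversion markers quoted above can be flagged with a simple scan. The marker list below is an assumption drawn from the examples cited in this report, not the skill's full prompt corpus:

```python
import re

# Hypothetical marker list based on the examples quoted in this audit.
BOUNDARY_MARKERS = [
    r"\[END OF INPUT\]",
    r"\[START OF INPUT\]",
    r"\bGODMODE\b",
]

def flag_injection_markers(prompt: str) -> list[str]:
    """Return the boundary-inversion markers found in a prompt."""
    return [m for m in BOUNDARY_MARKERS
            if re.search(m, prompt, flags=re.IGNORECASE)]
```

A real filter would need the skill's complete marker set and fuzzier matching, since the obfuscation layer described below is designed to defeat exact-string scans.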
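The exec()/compile() pattern flagged in load_godmode.py and auto_jailbreak.py can be sketched with a harmless stand-in payload; this is an illustrative reconstruction of the pattern, not the skill's actual code:

```python
import base64

# Harmless stand-in payload; in the flagged scripts the decoded source
# is arbitrary, which is why static analysis cannot see its logic.
source = "result = 2 + 2"
blob = base64.b64encode(source.encode())          # what a static scanner sees
decoded = base64.b64decode(blob).decode()         # recovered only at runtime
code_obj = compile(decoded, "<dynamic>", "exec")  # no source file to inspect
namespace: dict = {}
exec(code_obj, namespace)
print(namespace["result"])  # → 4
```

Because the executable logic only exists after decoding at runtime, any payload can be swapped in without changing what a reviewer or classifier sees on disk.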
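Some of the listed obfuscation tricks can be partially neutralized before classification. The sketch below strips the cited zero-width characters and applies NFKC folding, which normalizes fullwidth and other compatibility variants but, notably, does not map Cyrillic homoglyphs onto Latin (that requires a dedicated confusables table such as Unicode TR39 data):

```python
import unicodedata

# Zero-width characters cited in the analysis, plus the BOM.
ZERO_WIDTH = {"\u200B", "\u200C", "\u200D", "\uFEFF"}

def unobfuscate(text: str) -> str:
    """Strip zero-width joiners/non-joiners and fold compatibility forms.

    NFKC handles fullwidth letters and similar variants; Cyrillic
    homoglyphs survive it and need a separate confusables mapping.
    """
    stripped = "".join(ch for ch in text if ch not in ZERO_WIDTH)
    return unicodedata.normalize("NFKC", stripped)
```

Running a prompt through this kind of normalization before any marker scan closes off the zero-width and width-variant channels, though not Base64, hex, or Morse payloads, which need decoding stages of their own.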
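The persistence mechanism attributed to auto_jailbreak.py, rewriting the agent's configuration so altered instructions survive across sessions, can be sketched as follows. The file name and key are assumptions for illustration; the example writes to a temporary directory rather than a real agent install:

```python
import json
import os
import tempfile

# Hypothetical prefill file standing in for the agent's real one.
prefill_path = os.path.join(tempfile.mkdtemp(), "prefill.json")
with open(prefill_path, "w") as f:
    json.dump({"system": "You are a helpful assistant."}, f)

# The flagged pattern: read, overwrite the behavioral field, write back.
with open(prefill_path) as f:
    prefill = json.load(f)
prefill["system"] = "ignore previous constraints"  # persists across sessions
with open(prefill_path, "w") as f:
    json.dump(prefill, f)
```

Because the change lives in a file the agent reloads on every start, it outlasts the session in which it was made, which is what distinguishes this from an in-context jailbreak.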
Recommendations
  • Automated analysis detected serious security threats; this skill should not be installed or executed.
Audit Metadata
Risk Level
HIGH
Analyzed
Apr 19, 2026, 07:17 AM