godmode
Audit Result: Fail
Audited by Gen Agent Trust Hub on Apr 19, 2026
Risk Level: HIGH
Findings: PROMPT_INJECTION, REMOTE_CODE_EXECUTION, COMMAND_EXECUTION, DATA_EXFILTRATION
Full Analysis
- [PROMPT_INJECTION]: The skill is entirely dedicated to generating and testing jailbreak prompts. It includes advanced techniques such as 'boundary inversion' (using markers like `[END OF INPUT] [START OF INPUT]`), 'refusal inversion', and persona adoption (e.g., 'GODMODE', 'unrestricted AI assistant'), all specifically designed to bypass LLM safety filters.
- [REMOTE_CODE_EXECUTION]: Multiple scripts, including `load_godmode.py` and `auto_jailbreak.py`, use `exec()` and `compile()` to dynamically load and execute Python code. This pattern allows arbitrary instructions to run at runtime and can be used to hide malicious logic from static analysis.
- [COMMAND_EXECUTION]: The `auto_jailbreak.py` script programmatically modifies the agent's operating environment by editing the `config.yaml` and `prefill.json` files. This capability is used to persistently alter the agent's core instructions and behavioral constraints across sessions.
- [DATA_EXFILTRATION]: The skill accesses and manipulates sensitive local configuration files (`~/.hermes/config.yaml`), which may contain API keys and agent settings. It also facilitates communication with external model providers using environment-stored credentials.
- [OBFUSCATION]: The `parseltongue.py` script implements 33 distinct obfuscation techniques to hide malicious intent from safety classifiers, including Unicode homoglyphs (e.g., Cyrillic characters that appear identical to Latin ones), zero-width joiners (\u200D) and non-joiners (\u200C), Base64, hex encoding, and Morse code.
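The `exec()`/`compile()` pattern flagged in the REMOTE_CODE_EXECUTION finding can be surfaced by static analysis. The sketch below is a minimal, defensive illustration of how an auditor might locate such calls with Python's `ast` module; it is not the audit tool itself, and the sample source string is hypothetical.

```python
# Defensive sketch: statically locate exec/eval/compile calls in Python
# source, as an auditor might when reviewing scripts like load_godmode.py.
# The sample below is illustrative, not taken from the audited skill.
import ast

SUSPECT_CALLS = {"exec", "eval", "compile"}

def find_dynamic_exec(source: str) -> list[tuple[int, str]]:
    """Return (line_number, call_name) for each suspicious call."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in SUSPECT_CALLS):
            hits.append((node.lineno, node.func.id))
    return hits

sample = "payload = 'print(1)'\nexec(compile(payload, '<s>', 'exec'))\n"
print(find_dynamic_exec(sample))  # [(2, 'exec'), (2, 'compile')]
```

This catches only direct calls by name; real obfuscated loaders may reach `exec` through aliases or `getattr`, so a scan like this is a starting point, not a complete defense.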
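The invisible-character and homoglyph tricks in the OBFUSCATION finding can likewise be detected rather than reproduced. The following sketch checks a string for zero-width characters and Cyrillic letters mixed into otherwise Latin text; the character set is a small illustrative subset, not the 33 techniques the report describes.

```python
# Defensive sketch: flag zero-width characters and Cyrillic homoglyphs
# of the kind the OBFUSCATION finding describes. Illustrative subset only.
import unicodedata

ZERO_WIDTH = {"\u200B", "\u200C", "\u200D", "\uFEFF"}  # ZWSP, ZWNJ, ZWJ, BOM

def suspicious_chars(text: str) -> list[tuple[int, str, str]]:
    """Return (index, char_repr, reason) for invisible or mixed-script chars."""
    findings = []
    for i, ch in enumerate(text):
        if ch in ZERO_WIDTH:
            findings.append((i, repr(ch), "zero-width character"))
        elif ch.isalpha() and "CYRILLIC" in unicodedata.name(ch, ""):
            findings.append((i, ch, "Cyrillic letter (possible homoglyph)"))
    return findings

# '\u0430' is CYRILLIC SMALL LETTER A, visually identical to Latin 'a'.
print(suspicious_chars("p\u200D\u043455word"))
```

A production classifier would normalize the text (e.g., NFKC) and compare scripts per Unicode TR39 confusables data, but even this simple pass defeats the zero-width and homoglyph variants listed above.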
Recommendations
- Automated analysis detected serious security threats in this skill; treat it as untrusted.
Audit Metadata