claudeception
Fail
Audited by Gen Agent Trust Hub on Mar 14, 2026
Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
- [PROMPT_INJECTION]: The activator script
scripts/claudeception-activator.shcontains a payload designed to hijack the agent's reasoning loop using authoritative language. It uses phrases such as 'CRITICAL: ... you MUST evaluate', 'NON-NEGOTIABLE', and 'NOT optional' to force the agent to activate the learning retrospective after every interaction, overriding normal decision-making and potential safety constraints.\n- [COMMAND_EXECUTION]: The skill instructions inREADME.mdandSKILL.mdguide the user and agent to establish a persistent 'UserPromptSubmit' hook in the environment's configuration files. This hook executes a local bash script on every prompt, providing a permanent delivery mechanism for the prompt injection payload.\n- [COMMAND_EXECUTION]: The skill facilitates the autonomous creation of new executable instructions by teaching the agent to write its ownSKILL.mdfiles. This self-modifying capability allows the agent to expand its own toolset and logic based on session data, which lacks validation and is susceptible to indirect prompt injection from untrusted user inputs (Category 8 surface).
Recommendations
- AI detected serious security threats
Audit Metadata