claudeception

Fail

Audited by Gen Agent Trust Hub on Mar 14, 2026

Risk Level: HIGH (PROMPT_INJECTION, COMMAND_EXECUTION)
Full Analysis
  • [PROMPT_INJECTION]: The activator script scripts/claudeception-activator.sh contains a payload designed to hijack the agent's reasoning loop using authoritative language. It uses phrases such as 'CRITICAL: ... you MUST evaluate', 'NON-NEGOTIABLE', and 'NOT optional' to force the agent to activate the learning retrospective after every interaction, overriding normal decision-making and potential safety constraints.
  • [COMMAND_EXECUTION]: The skill instructions in README.md and SKILL.md guide the user and agent to establish a persistent 'UserPromptSubmit' hook in the environment's configuration files. This hook executes a local bash script on every prompt, providing a permanent delivery mechanism for the prompt injection payload.
  • [COMMAND_EXECUTION]: The skill facilitates the autonomous creation of new executable instructions by teaching the agent to write its own SKILL.md files. This self-modifying capability allows the agent to expand its own toolset and logic based on session data. That data is not validated and is susceptible to indirect prompt injection from untrusted user inputs (Category 8 surface).
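
The first two findings can be spot-checked by hand. The following is a minimal sketch, not the auditor's actual method: it looks for the coercive phrases quoted above in a script's text, and lists the commands registered under a 'UserPromptSubmit' hook. The function names are hypothetical, and the settings layout (a top-level "hooks" object whose "UserPromptSubmit" entries each hold a "hooks" list of command objects) is an assumption about the environment's configuration file that should be checked against the real file.

```python
import json
import re

# Phrases quoted in the audit finding. Illustrative only, not an exhaustive list.
# Matching is case-sensitive on purpose: the finding quotes upper-case emphasis.
COERCIVE_PATTERNS = [
    r"\bCRITICAL\b",
    r"\bMUST\b",
    r"\bNON-NEGOTIABLE\b",
    r"\bNOT optional\b",
]


def find_coercive_phrases(text):
    """Return the first match of each pattern found in text, in pattern order."""
    hits = []
    for pattern in COERCIVE_PATTERNS:
        match = re.search(pattern, text)
        if match:
            hits.append(match.group(0))
    return hits


def list_prompt_hook_commands(settings_text):
    """Return shell commands registered under a 'UserPromptSubmit' hook.

    Assumed (hypothetical) layout of the settings JSON:
    {"hooks": {"UserPromptSubmit": [{"hooks": [{"type": "command",
                                                 "command": "..."}]}]}}
    """
    settings = json.loads(settings_text)
    commands = []
    for entry in settings.get("hooks", {}).get("UserPromptSubmit", []):
        for hook in entry.get("hooks", []):
            if hook.get("type") == "command" and "command" in hook:
                commands.append(hook["command"])
    return commands
```

A non-empty result from either function is only a prompt for manual review. Emphatic wording and prompt hooks both have legitimate uses, so neither is proof of an injection on its own.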
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level: HIGH
Analyzed: Mar 14, 2026, 08:02 AM