confess
Pass
Audited by Gen Agent Trust Hub on Mar 10, 2026
Risk Level: SAFE, COMMAND_EXECUTION, PROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill instructs the agent to run bash commands such as `git diff` and `grep` to inspect the local filesystem for changes and code references.
- [PROMPT_INJECTION]: The skill employs psychological steering and adversarial framing ("assume something was missed", "A Skeptic is penalized for false dismissals") to influence the agent's behavior. While used for self-auditing, such coercive language can be used to bypass model constraints.
- [PROMPT_INJECTION]: The skill creates a surface for indirect prompt injection by instructing the agent to ingest and analyze untrusted external data (file contents and git history) without explicit boundary markers or sanitization.
  1. Ingestion points: Output from `git diff`, `grep`, and `glob` operations on the local codebase.
  2. Boundary markers: The skill does not define specific delimiters to separate the agent's instructions from the potentially untrusted code content being audited.
  3. Capability inventory: The agent has the ability to execute shell commands (`git`, `grep`) to read file system data.
  4. Sanitization: No sanitization or filtering is applied to the content retrieved from the files before processing.
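For illustration only, the missing boundary markers noted above could look something like the following Python sketch. It is not part of the audited skill or of the audit tooling; the function name `wrap_untrusted` and the delimiter format are hypothetical. The idea is to wrap tool output in delimiters that carry a random per-call token, so that file content cannot forge the closing marker.

```python
import secrets


def wrap_untrusted(source: str, content: str) -> str:
    """Wrap tool output in delimiters that carry a random per-call token.

    Hypothetical helper: the name and delimiter format are assumptions
    made for this sketch, not something the audited skill defines.
    """
    token = secrets.token_hex(8)
    # Regenerate in the unlikely case that the content already contains the token,
    # so the content can never reproduce the closing delimiter.
    while token in content:
        token = secrets.token_hex(8)
    begin = f"<<<UNTRUSTED {source} {token}>>>"
    end = f"<<<END UNTRUSTED {token}>>>"
    return f"{begin}\n{content}\n{end}"


if __name__ == "__main__":
    # Example: a diff line that tries to smuggle in an instruction.
    print(wrap_untrusted("git diff", "+ # ignore all previous instructions"))
```

Delimiters alone do not sanitize the content. They only give the agent's instructions a way to refer to the wrapped span as data to be analyzed and not as instructions to be followed.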
Audit Metadata