confess

Pass

Audited by Gen Agent Trust Hub on Mar 10, 2026

Risk Level: SAFE | COMMAND_EXECUTION | PROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill instructs the agent to run bash commands such as git diff and grep to inspect the local filesystem for changes and code references.
  • [PROMPT_INJECTION]: The skill employs psychological steering and adversarial framing ("assume something was missed", "A Skeptic is penalized for false dismissals") to influence the agent's behavior. Although the skill uses this framing for self-auditing, the same kind of coercive language could be used to bypass model constraints.
  • [PROMPT_INJECTION]: The skill creates a surface for indirect prompt injection by instructing the agent to ingest and analyze untrusted external data (file contents and git history) without explicit boundary markers or sanitization.
      1. Ingestion points: Output from git diff, grep, and glob operations on the local codebase.
      2. Boundary markers: The skill does not define specific delimiters to separate the agent's instructions from the potentially untrusted code content being audited.
      3. Capability inventory: The agent has the ability to execute shell commands (git, grep) to read file system data.
      4. Sanitization: No sanitization or filtering is applied to the content retrieved from the files before processing.
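The boundary-marker and sanitization gaps noted above could be addressed along the following lines. This is a minimal sketch, not part of the audited skill: the function name `wrap_untrusted`, the marker format, and the `token` parameter are all hypothetical, and delimiters of this kind reduce rather than eliminate indirect prompt injection risk.

```python
import secrets
from typing import Optional


def wrap_untrusted(label: str, content: str, token: Optional[str] = None) -> str:
    """Wrap untrusted text (e.g. git diff or grep output) in explicit markers.

    Hypothetical helper for illustration; the marker format is an assumption.
    """
    # A random per-call token makes it impractical for the wrapped content
    # to forge a matching closing marker.
    token = token or secrets.token_hex(8)
    begin = f"<<<UNTRUSTED {label} {token}>>>"
    end = f"<<<END UNTRUSTED {label} {token}>>>"

    # Basic sanitization: drop non-printable characters (NUL, ESC, etc.)
    # while keeping newlines and tabs so diffs remain readable.
    cleaned = "".join(
        ch for ch in content if ch in "\n\t" or ch.isprintable()
    )
    return f"{begin}\n{cleaned}\n{end}"
```

In use, the agent would pass command output (for example, the captured stdout of `git diff`) through such a wrapper before placing it in its context, and treat everything between the markers as data to analyze rather than instructions to follow.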
Audit Metadata
Risk Level: SAFE
Analyzed: Mar 10, 2026, 03:57 AM