auto-dev
Fail
Audited by Gen Agent Trust Hub on Mar 14, 2026
Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The generated
autodev.shscript (based onreferences/autodev-template.md) executes the Claude CLI using the--dangerously-skip-permissionsflag. This configuration grants the AI agent the ability to execute any shell command on the host machine without human oversight or confirmation prompts. - [REMOTE_CODE_EXECUTION]: The automation pipeline in
autodev.shrepeatedly executes shell-based test commands (identified as{TEST_CMD}) during the TDD closure and 'Bug Hunt' phases. This behavior allows for the automatic execution of potentially malicious code if the test definitions are manipulated or if the project environment is untrusted. - [PROMPT_INJECTION]: The
references/system-prompt-template.mdcontains instructions that explicitly bypass standard AI safety protocols by stating 'This session has no humans present. All scenarios requiring human confirmation are replaced by AI mutual confirmation.' This removes the human-in-the-loop safety barrier. - [INDIRECT_PROMPT_INJECTION]: The skill architecture is highly vulnerable to indirect injection attacks.
- Ingestion points: The skill reads and processes
todolist.md(untrusted input) and project source files to generate task cards and prompts. - Boundary markers: There are no explicit boundary markers or 'ignore embedded instructions' filters applied when interpolating user data into the AI prompts.
- Capability inventory: The pipeline has full shell execution capabilities via the
claudeCLI and the{TEST_CMD}execution loop. - Sanitization: No evidence of sanitization or validation of the input
todolist.mdor test outputs before they are fed back into the AI sessions.
Recommendations
- AI detected serious security threats
Audit Metadata