auto-claude-build

Pass

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [Indirect Prompt Injection] (LOW): The system processes untrusted input in the form of specifications (spec.md) which are analyzed by the Planner and Coder agents. This creates a surface for indirect prompt injection where a malicious spec could attempt to override agent instructions or bypass the command allowlist.
  • Ingestion points: Specification files loaded via python run.py --spec SPEC.
  • Boundary markers: Not documented; no explicit mention of delimiters to separate untrusted spec content from agent instructions.
  • Capability inventory: High capability tier including Bash, Write/Edit file operations, Electron/Puppeteer for browser testing, and the ability to spawn up to 12 subagents.
  • Sanitization: The documentation mentions a "Command Allowlist" and "Sandbox", but does not detail how input data is sanitized before being interpolated into agent prompts.
  • [Command Execution] (LOW): The skill facilitates the execution of bash commands and file system modifications. While this is the intended primary purpose of a build system, the capability for an agent to execute shell commands based on potentially attacker-influenced specs remains a risk. The severity is lowered because this functionality is central to the skill's stated purpose.
  • [Dynamic Execution] (LOW): The architecture involves a Coder agent that implements subtasks and a QA Fixer that applies fixes. This represents dynamic code generation and execution. As this is the core function of the build system, it is noted as a risk surface rather than a direct threat.
Audit Metadata
Risk Level
SAFE
Analyzed
Feb 17, 2026, 06:22 PM