agent-builder

Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTION
Full Analysis
  • [Command Execution] (HIGH): The skill includes agent templates (references/minimal-agent.py and 'Level 0' in scripts/init_agent.py) that provide a bash tool using subprocess.run(shell=True) on raw model input without any filtering. This allows an LLM to execute any command on the user's system.\n- [Remote Code Execution] (HIGH): The scripts/init_agent.py script generates Python files designed to act as agents with broad execution capabilities. If these agents are deployed and interact with untrusted data, they serve as a direct vector for remote code execution.\n- [Privilege Escalation] (MEDIUM): Templates like the 'Level 1' agent use a simple string-matching blacklist (e.g., sudo, rm -rf /) to prevent dangerous commands. This is a weak security control that can be bypassed using standard shell features like variable expansion or shell built-ins.\n- [Indirect Prompt Injection] (LOW): The generated agents are highly susceptible to indirect prompt injection due to the combination of powerful tools and lack of data sanitization.\n
  • Ingestion points: The agents process user input directly into the LLM context via input() and file reads.\n
  • Boundary markers: No delimiters or instructions to ignore embedded commands are included in the generated system prompts.\n
  • Capability inventory: Tools include bash (shell), write_file, and read_file across various scripts.\n
  • Sanitization: Sanitization is either entirely absent or limited to a trivial, bypassable blacklist.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 17, 2026, 06:25 PM