tool-design

Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGHCOMMAND_EXECUTION
Full Analysis
  • [COMMAND_EXECUTION] (HIGH): The documentation in references/architectural_reduction.md explicitly provides a pattern for an execute_command tool that runs arbitrary bash commands using sandbox.exec(command). This gives the agent a direct shell interface, which is a high-risk capability that can be abused through prompt injection.
  • [COMMAND_EXECUTION] (MEDIUM): The skill promotes using powerful Unix primitives like grep, cat, and find to access and parse sensitive data layer definitions, which increases the likelihood of unauthorized data exposure if sandbox boundaries are misconfigured.
  • [PROMPT_INJECTION] (LOW): The scripts/description_generator.py script is susceptible to indirect prompt injection (Category 8) when processing external tool specifications. \n
  • Ingestion points: tool_spec object in generate_tool_description function. \n
  • Boundary markers: Absent; descriptions and parameters are interpolated directly into markdown templates. \n
  • Capability inventory: Arbitrary bash execution (as suggested in documentation) and SQL execution. \n
  • Sanitization: No validation or escaping is performed on tool_spec fields before generation.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 17, 2026, 06:04 PM