approval-tests

Warn

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: MEDIUM
COMMAND_EXECUTION, EXTERNAL_DOWNLOADS, REMOTE_CODE_EXECUTION
Full Analysis
  • COMMAND_EXECUTION (MEDIUM): The Python API documents verify_command_line in references/python/api.md, which executes shell commands. If the agent builds the command from untrusted data, this is a potential command-injection vector.
  • REMOTE_CODE_EXECUTION (MEDIUM): The 'Inline Approvals' feature (documented in references/python/inline.md and references/java/inline.md) automatically modifies and rewrites source code files based on program output. This creates a self-modifying code surface where untrusted output could inject malicious code into the test suite.
  • EXTERNAL_DOWNLOADS (MEDIUM): The skill instructs users to install packages from public registries and GitHub repositories under the approvals organization. As this organization is not on the pre-approved trusted list, these dependencies are considered unverifiable in high-security contexts.
  • COMMAND_EXECUTION (MEDIUM): Various reporters (e.g., GenericDiffReporter in references/python/reporters.md) can be configured to execute arbitrary binary paths on the host system to perform file diffing.
  • PROMPT_INJECTION (LOW): The skill possesses a surface for indirect prompt injection.
      1. Ingestion points: Data enters the agent context through verify() and related functions in api.md files for Python, Java, and Node.js.
      2. Boundary markers: None identified; verification data is handled as raw strings or objects.
      3. Capability inventory: File system writes (approval files), source code rewriting (inline approvals), and shell command execution (verify_command_line).
      4. Sanitization: Includes 'scrubbers' (e.g., references/python/scrubbers.md) which provide output normalization for dates and GUIDs but do not prevent malicious instruction injection.
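
The command-injection concern raised in the COMMAND_EXECUTION findings can be illustrated with a minimal sketch that uses only the Python standard library. It does not call the approvaltests API; the helper name run_untrusted_arg and the sample payload are hypothetical and exist only for illustration. The idea is that untrusted text passed as a single argv entry is never parsed by a shell, and that shlex.quote is available when a command string cannot be avoided.

```python
import shlex
import subprocess
import sys


def run_untrusted_arg(untrusted: str) -> str:
    """Echo `untrusted` from a child process without involving a shell.

    Hypothetical helper: the argument list goes straight to the child
    process, so characters such as ';' or '&&' stay literal text.
    """
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Sample payload (an assumption for this sketch) that would run a second
# command if it were interpolated into a shell string unquoted.
payload = "report.txt; echo INJECTED"

# Passed as one argv entry, the payload comes back unchanged as data.
echoed = run_untrusted_arg(payload)

# If a shell string is unavoidable, shlex.quote wraps the value so a POSIX
# shell treats it as a single word.
quoted = shlex.quote(payload)
```

Passing an argument list keeps the boundary between command and data explicit, which is why it is generally preferred over quoting when untrusted input is involved.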
Audit Metadata
Risk Level: MEDIUM
Analyzed: Feb 17, 2026, 06:47 PM