test-fixing
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGH
COMMAND_EXECUTION, PROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION] (HIGH): The skill is designed to execute arbitrary shell commands including 'make test' and 'uv run pytest'. Since these commands run the project's own code, any malicious code within the repository's test suite will be executed with the agent's privileges.
- [PROMPT_INJECTION] (HIGH): This skill is highly susceptible to Indirect Prompt Injection (Category 8).
  - Ingestion points: The agent is instructed to 'Analyze output' from 'make test' and 'Read relevant code' to identify root causes.
  - Boundary markers: None. There are no instructions to ignore natural language commands embedded within the test output or source code.
  - Capability inventory: The skill has powerful capabilities including file modification (Edit tool) and shell command execution ('make', 'pytest').
  - Sanitization: None. The agent processes the untrusted output of the test runner directly to make decisions about code edits.
- [DYNAMIC_EXECUTION] (MEDIUM): The skill facilitates the execution of local code via test runners ('pytest'). While standard for development, in an automated agent context this allows for the execution of dynamically generated or modified scripts.
Recommendations
- AI detected serious security threats
Audit Metadata