test
Fail
Audited by Gen Agent Trust Hub on Feb 15, 2026
Risk Level: HIGH (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, PROMPT_INJECTION)
Full Analysis
- [COMMAND_EXECUTION] (HIGH): The skill detects the repository's test framework (for example pytest, jest, or testthat) and invokes it. Running a test framework means executing whatever the repository's test files and test configuration tell it to run, so the skill inherently executes arbitrary commands on the host as part of the test cycle.
- [REMOTE_CODE_EXECUTION] (HIGH): Because the skill executes code discovered in the local repository, any malicious code in those tests runs with the agent's full privileges. If the repository contents are untrusted (a cloned third-party project, a pull request from an unknown contributor), this is a direct path to host compromise.
- [PROMPT_INJECTION] (HIGH): The skill is highly exposed to indirect prompt injection (Category 8):
  1. Ingestion points: in Phases 2 and 3 the agent reads existing code and test files to discover patterns and decide on fixes, so anything in those files reaches the model as context.
  2. Boundary markers: none are defined to separate untrusted file content from the agent's own instructions.
  3. Capability inventory: the skill can execute commands, write to the filesystem, and make git commits.
  4. Sanitization: the code content is neither validated nor filtered. Instructions embedded in code comments or string literals could redirect the agent to modify files, run commands, or exfiltrate data, and the write and commit capabilities mean the result can be persisted into the repository.
Recommendations
- AI detected serious security threats
Audit Metadata