implement
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGH
Findings: REMOTE_CODE_EXECUTION, COMMAND_EXECUTION, PROMPT_INJECTION
Full Analysis
- [PROMPT_INJECTION] (HIGH): The skill is vulnerable to Indirect Prompt Injection because it reads external test files (Rule 1: 'Read the failing tests first') to define its implementation logic. There are no boundary markers or sanitization steps to prevent malicious instructions embedded in those tests from overriding the agent's behavior.
- [COMMAND_EXECUTION] (HIGH): The skill performs automated command execution via shell tools to verify the implementation, specifically running 'uv run pytest' as described in the Verification section. This gives an attacker a mechanism to trigger code execution on the host.
- [REMOTE_CODE_EXECUTION] (HIGH): The combination of ingesting untrusted requirements and executing the resulting code establishes a high-risk RCE vector. An adversary who can influence the repository's tests can coerce the agent into executing arbitrary payloads during the 'Run tests' phase.
Recommendations
- AI detected serious security threats
Audit Metadata