test-driven-development
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGH
Findings: COMMAND_EXECUTION, REMOTE_CODE_EXECUTION
Full Analysis
- COMMAND_EXECUTION (HIGH): The skill mandates the execution of shell commands such as 'npm test' in the 'Verify RED' and 'Verify GREEN' steps. This triggers the execution of scripts defined in the project's 'package.json' or within the test files themselves, allowing for arbitrary local command execution.
- REMOTE_CODE_EXECUTION (HIGH): In the context of an AI agent processing an external repository (e.g., from a 'git clone'), this skill functions as an RCE vector. A malicious repository could embed instructions or scripts in its test suite that execute during the mandatory verification steps.
- INDIRECT_PROMPT_INJECTION (HIGH): The skill lacks boundary markers or sanitization when processing external data (the code and tests). It has high-privilege capabilities (execution and file writing). Evidence:
  1. Ingestion: File paths provided to 'npm test'.
  2. Capability: Subprocess execution via 'npm test' and file modification for the 'GREEN' phase.
  3. Boundary markers: Absent.
  4. Sanitization: Absent.
Recommendations
- AI detected serious security threats
Audit Metadata