designing-tests
Fail
Audited by Gen Agent Trust Hub on Feb 15, 2026
Risk Level: HIGH (COMMAND_EXECUTION, PROMPT_INJECTION)
Full Analysis
- [Indirect Prompt Injection] (HIGH): The skill operates on external codebases, which act as untrusted data sources. Because the workflow includes running tests (e.g., npm test, pytest), an attacker could embed malicious code within the tests or setup files of a repository.
- Ingestion points: Codebase files and test requirements provided by the user.
- Boundary markers: Absent. The skill does not implement delimiters or safety instructions to prevent the agent from obeying instructions embedded in the code it is testing.
- Capability inventory: The skill uses subprocess calls to run test frameworks (npm, pytest, vitest).
- Sanitization: Absent. There is no validation or sandboxing of the code before execution.
- [COMMAND_EXECUTION] (MEDIUM): The skill explicitly instructs the agent to execute shell commands to verify test passing and coverage. While these are standard development practices, they provide the execution vector for the indirect injection surface described above.
Recommendations
- AI detected serious security threats