test-best-practices

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION] (MEDIUM): The skill workflow explicitly requires running grep and dart test. Executing a test suite involves running arbitrary Dart code on the local machine, which is a security risk if the test files are compromised or contain malicious logic.
  • [PROMPT_INJECTION] (HIGH): The skill is susceptible to Indirect Prompt Injection (Category 8) because it processes untrusted local content and has the power to modify files and execute commands. 1. Ingestion points: Content from the test/ directory is read and analyzed by the agent. 2. Boundary markers: Absent. No delimiters or instructions are used to distinguish between code to be refactored and instructions for the agent. 3. Capability inventory: The skill can perform shell command execution (grep, dart test) and file system modifications (refactoring code). 4. Sanitization: Absent. The agent performs no validation or filtering of the code content before analysis. An attacker could embed instructions in a test file (e.g., in a comment) to trick the agent into performing unauthorized actions when the 'Analyze' or 'Apply' steps are triggered.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 04:41 AM