NYC

test-cases

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [Indirect Prompt Injection] (HIGH): The skill processes untrusted data (test specifications) and has the capability to execute commands via Bash and modify files. An adversary could place instructions in a spec file to compromise the system. Ingestion points: Test specification files. Boundary markers: None detected. Capability inventory: Bash, Write, Edit, Read, Grep, Glob. Sanitization: None mentioned.
  • [Command Execution] (MEDIUM): Access to the Bash tool allows for arbitrary command execution. Combined with the ingestion of external data, this significantly expands the attack surface.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 04:02 AM