tdd
Fail
Audited by Gen Agent Trust Hub on Feb 15, 2026
Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
- PROMPT_INJECTION (HIGH): The skill is susceptible to Indirect Prompt Injection (Category 8) because it fails to properly handle untrusted user data.\n
- Ingestion points: The [FEATURE_NAME] placeholder in SKILL.md is used to interpolate user input into prompts for the plan-agent, arbiter, and kraken sub-agents.\n
- Boundary markers: The workflow prompts lack any delimiters (e.g., XML tags) or security instructions to prevent malicious content within the feature description from overriding the sub-agents' primary instructions.\n
- Capability inventory: The skill grants significant privileges to these sub-agents: the 'kraken' agent writes production code to the filesystem, and the 'arbiter' agent is instructed to run shell commands like 'npm test' and 'pytest'.\n
- Sanitization: The skill contains no logic or instructions to sanitize or validate the user-provided [FEATURE_NAME] before it is used in a prompt.\n- COMMAND_EXECUTION (MEDIUM): The skill's workflow is designed to execute system-level commands through the agent.\n
- Evidence: The arbiter phase in SKILL.md explicitly directs the agent to execute 'npm test [path]' and 'pytest [path]'. While standard for TDD, this execution surface is inherently dangerous when the input directing the agent (the feature name) is not secured against injection.
Recommendations
- AI detected serious security threats
Audit Metadata