grey-haven-tdd-orchestration

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHCOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
  • COMMAND_EXECUTION (HIGH): The skill requests the Bash tool, which allows the agent to execute arbitrary shell commands on the host system. This provides a direct path for system compromise if the agent is manipulated by malicious input.
  • EXTERNAL_DOWNLOADS (MEDIUM): The skill references multiple external sub-skills (grey-haven-tdd-typescript, grey-haven-tdd-python, grey-haven-test-generation, and grey-haven-code-quality-analysis). These dependencies are not from trusted sources and their behavior cannot be verified within this file.
  • PROMPT_INJECTION (HIGH): High risk of Indirect Prompt Injection (Category 8). The skill is designed to ingest and process untrusted external data (source code and test files) via Read, Grep, and Glob tools. Since the skill also possesses Bash, Write, and MultiEdit capabilities, an attacker could embed malicious instructions in a code repository that the agent then executes or writes to disk.
  • Ingestion points: Files read during TDD orchestration using Read, Grep, and Glob tools.
  • Boundary markers: None present; the skill does not define delimiters or instructions to ignore embedded commands in the code it processes.
  • Capability inventory: Bash (command execution), Write/MultiEdit (file modification).
  • Sanitization: None; the skill lacks validation or filtering logic for ingested content.
  • METADATA_POISONING (MEDIUM): The skill metadata contains inconsistent versioning (comments mention v2.0.43/v2.0.74 while the version field states 1.0) and uses a generic 'orchestration' description to justify broad tool permissions (Bash, Write, etc.), which could mislead users regarding the actual security risks of the skill.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 11:29 AM