judge-with-debate

Pass

Audited by Gen Agent Trust Hub on Apr 23, 2026

Risk Level: SAFEPROMPT_INJECTION
Full Analysis
  • [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection (Category 8) due to the way it processes untrusted content.
  • Ingestion points: The contents of external solution files and user-provided task descriptions are interpolated directly into prompts for the 'Meta-Judge' and 'Judge' agents in SKILL.md.
  • Boundary markers: The prompt templates use Markdown headers (e.g., ## Solution, ## Task Description) to separate untrusted data from instructions, but they lack explicit directives to ignore or escape any instructions embedded within that data.
  • Capability inventory: The agents involved have the ability to read and write to the filesystem (specifically .specs/reports/) and to recursively dispatch additional sub-agents via the Task tool.
  • Sanitization: The skill does not perform any validation, escaping, or filtering of the solution file content before passing it to the sub-agents.
Audit Metadata
Risk Level
SAFE
Analyzed
Apr 23, 2026, 03:49 AM