judge

Pass

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: SAFEPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
  • [PROMPT_INJECTION] (LOW): The skill is vulnerable to indirect prompt injection (Category 8) as it processes untrusted user data within its evaluation prompts.
  • Ingestion points: The {{task_output}} variable in references/prompt-deep.md and references/prompt-quick.md is populated with external, untrusted content.
  • Boundary markers: Absent. The templates use simple text headers (e.g., 'OUTPUT:') which do not provide strong isolation against adversarial instructions hidden within the task output.
  • Capability inventory: The runner script (run-judge.sh) executes shell commands, writes to a persistent log file (verdicts.jsonl), and facilitates model interactions.
  • Sanitization: Not present. Documentation indicates that while model outputs are cleaned for JSON parsing, the untrusted inputs are not sanitized or escaped before interpolation.
  • [COMMAND_EXECUTION] (LOW): The skill relies on the execution of local shell scripts and external CLI tools to perform its logic.
  • Evidence: References to automation/judge/run-judge.sh, automation/judge/setup-judge-kg.sh, and automation/judge/pre-push-judge.sh for orchestration and setup.
  • External Dependencies: Mentions a dependency on terraphim-cli for knowledge graph-based enrichment, which is an external tool not included in the trusted source list.
  • Safety Note: The skill documentation notes a 'Fail-Open' design and the use of temporary files to mitigate shell escaping issues, showing awareness of basic command injection risks.
Audit Metadata
Risk Level
SAFE
Analyzed
Feb 17, 2026, 07:34 PM