agent-eval-harness
Warn
Audited by Snyk on Feb 28, 2026
Risk Level: MEDIUM
Full Analysis
MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).
- Third-party content exposure detected (high risk: 0.90). The SKILL.md shows the harness executes prompts that can invoke external web/tool calls (the capture output example includes a "WebSearch" tool_call) and also documents --simple/--shell modes and LLM-as-judge graders that run external commands or fetch content, so it can ingest untrusted public web/user-generated content that may influence agent decisions.
MEDIUM W012: Unverifiable external dependency detected (runtime URL that controls agent).
- Potentially malicious external URL detected (high risk: 1.00). The Dockerfile and docs contain a direct remote-install command that is executed during setup—RUN curl -fsSL https://claude.ai/install.sh | bash—which fetches and runs remote code (the Claude CLI installer) that the harness recommends/relies on for the Claude headless adapter.
Audit Metadata