agent-evaluation

Fail

Audited by Snyk on Mar 3, 2026

Risk Level: HIGH
Full Analysis

HIGH W007: Insecure credential handling detected in skill instructions.

  • Insecure credential handling detected (high risk: 0.90). The rubric repeatedly requires "direct quote" or file:line evidence for every dimension and mandates quoting specific text from evaluated outputs, which would force the agent to reproduce verbatim file contents (including any API keys/secrets) if those appear in the evidence, creating a high exfiltration risk.
Audit Metadata
Risk Level
HIGH
Analyzed
Mar 3, 2026, 09:02 AM