eval-harness
Pass
Audited by Gen Agent Trust Hub on Mar 13, 2026
Risk Level: SAFEPROMPT_INJECTION
Full Analysis
- [PROMPT_INJECTION]: The skill defines a 'Model-Based Grader' workflow where the agent is instructed to evaluate code changes or task outputs. This creates a surface for indirect prompt injection if the content being evaluated contains malicious instructions designed to manipulate the grader's judgment or trigger unintended tool use.
- Ingestion points: Task outputs, code files, and test results processed during the '/eval check' and '/eval report' workflows.
- Boundary markers: Uses specific tags such as '[MODEL GRADER PROMPT]' to delineate instructions, but does not implement strict isolation or 'ignore' directives for the data being graded.
- Capability inventory: The skill requires 'Read', 'Write', 'Edit', 'Bash', 'Grep', and 'Glob' tools, providing significant system access that could be exploited if an injection succeeds.
- Sanitization: No explicit sanitization, filtering, or instruction-ignoring protocols are defined in the framework to protect the grading prompt from embedded data.
Audit Metadata