The Agent Skills Directory

[PROMPT_INJECTION]: The skill introduces an indirect prompt injection surface through the 'Model-Based Grader' and 'Capability Eval' features. These mechanisms ingest external code and task outputs to be processed by the agent, which could contain malicious instructions designed to manipulate the evaluation results.
Ingestion points: Processing of external data within [MODEL GRADER PROMPT] and [CAPABILITY EVAL] blocks as described in SKILL.md.
Boundary markers: The framework uses Markdown headers and code block delimiters to separate evaluation logic from data.
Capability inventory: The skill utilizes Bash, Read, Write, Edit, Grep, and Glob tools to perform its functions.
Sanitization: There is no mention of input sanitization or explicit 'ignore previous instructions' markers for the data being evaluated.
[COMMAND_EXECUTION]: The skill leverages the Bash tool to perform deterministic checks, such as running npm test, npm run build, and grep. While these are powerful capabilities, they are standard for development evaluation workflows and are used here to automate verification against success criteria.

eval-harness