The Agent Skills Directory

[PROMPT_INJECTION]: The skill is designed to process untrusted external data (outputs from other LLMs) to perform evaluation. This creates a surface for indirect prompt injection where the data being judged could attempt to influence the evaluator agent's behavior.
Ingestion points: Untrusted LLM responses are passed into evaluation prompts in references/implementation-patterns.md and scripts/evaluation_example.py.
Boundary markers: The implementation patterns in references/full-guide.md and scripts/evaluation_example.py use clear structural delimiters (e.g., markdown headers like '## Response to Evaluate') to isolate untrusted content from the instructions.
Capability inventory: Across all scripts and guides, the skill focuses on data processing, scoring logic, and comparison. There are no subprocess calls, file-write operations, or network exfiltration capabilities applied to the untrusted data.
Sanitization: The skill relies on structural delimiters and prompt instructions (like 'Do NOT prefer responses because they are longer') rather than programmatic sanitization or escaping of the input data.

advanced-evaluation