llm-evaluation
Pass
Audited by Gen Agent Trust Hub on May 7, 2026
Risk Level: SAFE
Full Analysis
- [PROMPT_INJECTION]: The skill contains 'LLM-as-Judge' implementation examples where model outputs are interpolated into evaluation prompts. This creates an inherent surface for indirect prompt injection if the evaluated content contains adversarial instructions. However, these are standard benchmarking patterns and do not target the agent's own safety guidelines.
  - Ingestion points: The `response` and `question` variables in the `llm_judge_quality` and `compare_responses` functions in `SKILL.md`.
  - Boundary markers: Absent; responses are embedded directly in the f-string prompt.
  - Capability inventory: The snippets demonstrate calls to the OpenAI ChatCompletion API.
  - Sanitization: None provided in the simplified code examples.
- [EXTERNAL_DOWNLOADS]: The code snippets reference downloading and using the `microsoft/deberta-large-mnli` and `microsoft/deberta-xlarge-mnli` models via the Hugging Face `transformers` library. These are official models from a well-known organization and are standard for Natural Language Inference tasks.
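The NLI usage described above can be sketched with the `transformers` text-classification pipeline. This is an assumption about how the skill's snippets load the model (the helper name and call pattern are illustrative); only the model identifiers come from the audit.

```python
def check_entailment(premise: str, hypothesis: str,
                     model_name: str = "microsoft/deberta-large-mnli"):
    """Return the top MNLI label and score for a (premise, hypothesis) pair.

    Illustrative helper; downloads the model from the Hugging Face Hub
    on first use. Import is deferred so defining the helper does not
    require transformers to be installed.
    """
    from transformers import pipeline

    nli = pipeline("text-classification", model=model_name)
    out = nli({"text": premise, "text_pair": hypothesis})
    top = out[0] if isinstance(out, list) else out
    return top["label"], top["score"]
```

An entailment label here would indicate that the hypothesis (e.g. a generated answer) is supported by the premise (e.g. a reference passage).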
Audit Metadata