llm-evaluation

Pass

Audited by Gen Agent Trust Hub on Feb 28, 2026

Risk Level: SAFEPROMPT_INJECTIONEXTERNAL_DOWNLOADS
Full Analysis
  • [PROMPT_INJECTION]: The skill's implementation of LLM-as-Judge patterns creates an attack surface for indirect prompt injection.
  • Ingestion points: The functions llm_judge_quality and compare_responses in SKILL.md ingest external data via parameters representing user questions and model responses.
  • Boundary markers: The code examples lack clear delimiters or specific instructions to the evaluator model to ignore potential commands within the data being analyzed.
  • Capability inventory: These functions utilize the OpenAI API to process the constructed prompts, extending the execution flow to an external model.
  • Sanitization: The implementation does not include any sanitization or filtering logic to prevent malicious instructions embedded in the evaluated responses from influencing the judge model's behavior.
  • [EXTERNAL_DOWNLOADS]: The skill fetches pre-trained models from Microsoft's official Hugging Face repository for use in evaluation metrics such as BERTScore and groundedness checks.
Audit Metadata
Risk Level
SAFE
Analyzed
Feb 28, 2026, 11:52 AM