evaluation-metrics

Warn

Audited by Snyk on Mar 1, 2026

Risk Level: MEDIUM
Full Analysis

MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).

  • Third-party content exposure detected (high risk: 0.90). The skill explicitly loads public datasets (e.g., SKILL.md and Benchmark Suites: load_dataset("cais/mmlu") and assets/evaluation_config.yaml listing a Hugging Face "squad" source) and the runtime code (scripts/llm_evaluator.py, e.g. FaithfulnessMetric and HallucinationDetector) injects those untrusted context texts directly into prompts passed to an LLM to drive scoring/decisions, so third-party content can materially influence agent behavior.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Mar 1, 2026, 07:47 PM