dspy-evaluation-suite

Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: CRITICALPROMPT_INJECTION
Full Analysis
  • Indirect Prompt Injection (LOW): The skill possesses a surface for indirect prompt injection through the processing of untrusted evaluation data.
  • Ingestion points: The devset input (list of dspy.Example objects) is processed by the agent in EvaluationSuite.evaluate and EvaluationSuite.compare.
  • Boundary markers: No explicit boundary markers or instructions to ignore embedded commands are present in the code or prompts.
  • Capability inventory: The skill utilizes dspy.Module which makes network calls to LLMs and has the Write tool enabled in the YAML manifest to export JSON reports via export_report.
  • Sanitization: No sanitization or validation of the text content within devset examples is performed before they are passed to the evaluator.
  • Security Scanners (SAFE): The automated scanner alert for logger.info (URL:Blacklist) is a false positive. The scanner misidentified standard Python logging calls logger.info(f"Evaluating: {name}") as a malicious domain or URL string.
Recommendations
  • Contains 1 malicious URL(s) - DO NOT USE
Audit Metadata
Risk Level
CRITICAL
Analyzed
Feb 17, 2026, 06:18 PM