dspy-best-of-n

Fail

Audited by Gen Agent Trust Hub on Mar 17, 2026

Risk Level: HIGH (REMOTE_CODE_EXECUTION, PROMPT_INJECTION)
Full Analysis
  • [REMOTE_CODE_EXECUTION]: The skill documentation (SKILL.md) and examples (examples.md) provide code snippets that use the Python exec() function to execute logic generated by an AI model. This creates a critical vulnerability where an attacker could use prompt injection to trick the model into generating and executing malicious code on the host system.
  • [PROMPT_INJECTION]: The skill processes untrusted output from an LLM and uses it in highly privileged operations, creating a significant indirect prompt injection surface.
    • Ingestion points: The skill ingests AI-generated code and text via pred.code and pred.answer in both SKILL.md and examples.md.
    • Boundary markers: No delimiters or instructions are provided to distinguish intended data from potentially malicious instructions embedded in the AI output.
    • Capability inventory: The examples in SKILL.md and examples.md call exec() to run code produced by the model.
    • Sanitization: The AI-generated code is executed without sanitization, syntax validation, or effective sandboxing.
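
The sanitization gap noted above could be narrowed by statically checking generated code before it is run. The following is a minimal sketch, not the skill's actual code: the function name validate_generated_code and the ALLOWED_NODES allowlist are hypothetical, and the allowlist assumes the generated snippets only need simple arithmetic and assignments. It uses the standard-library ast module to reject syntax errors and any construct outside the allowlist (calls, imports, attribute access, and so on).

```python
import ast
from typing import List

# Hypothetical allowlist (an assumption for illustration): only plain
# assignments and arithmetic expressions are accepted.
ALLOWED_NODES = (
    ast.Module, ast.Expr, ast.Assign, ast.Name, ast.Load, ast.Store,
    ast.Constant, ast.BinOp, ast.UnaryOp, ast.Add, ast.Sub, ast.Mult,
    ast.Div, ast.FloorDiv, ast.Mod, ast.Pow, ast.USub, ast.UAdd,
)


def validate_generated_code(source: str) -> List[str]:
    """Return a list of problems; an empty list means the snippet passed."""
    try:
        tree = ast.parse(source, mode="exec")
    except SyntaxError as err:
        return [f"syntax error: {err.msg}"]

    problems = []
    # Walk every node in the tree and flag anything outside the allowlist.
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            problems.append(f"disallowed construct: {type(node).__name__}")
    return problems


if __name__ == "__main__":
    # Stand-in strings for what pred.code might contain.
    print(validate_generated_code("x = 1 + 2"))
    print(validate_generated_code("__import__('os').system('id')"))
```

An allowlist is used here rather than a denylist because anything not explicitly permitted is rejected by default. A check like this reduces the attack surface but is not a substitute for sandboxing; code that passes should still not be run with the host's full privileges.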
Recommendations
  • AI detected serious security threats
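
One possible mitigation, sketched here under stated assumptions rather than taken from the audit, is to stop calling exec() in the host process and run model-generated code in a separate interpreter process with a timeout. The helper name run_isolated and the 5-second default are hypothetical. Note that process separation alone is not a security sandbox: the child process can still reach the file system and network, so a real deployment would also need OS-level or container isolation.

```python
import subprocess
import sys


def run_isolated(source: str, timeout_s: float = 5.0) -> subprocess.CompletedProcess:
    """Run source in a child interpreter instead of exec() in-process.

    Raises subprocess.TimeoutExpired if the child runs longer than timeout_s.
    """
    return subprocess.run(
        # -I starts Python in isolated mode: PYTHON* environment variables
        # and the user site-packages directory are ignored.
        [sys.executable, "-I", "-c", source],
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode output as str
        timeout=timeout_s,
    )


if __name__ == "__main__":
    # Stand-in string for what pred.code might contain.
    result = run_isolated("print(2 + 3)")
    print(result.returncode, result.stdout.strip())
```

Keeping the generated code out of the host process means a crash or runaway loop in that code cannot take down or block the caller, and the caller only ever sees captured text output.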
Audit Metadata
Risk Level: HIGH
Analyzed: Mar 17, 2026, 06:59 PM