dspy-best-of-n
Fail
Audited by Gen Agent Trust Hub on Mar 17, 2026
Risk Level: HIGH (REMOTE_CODE_EXECUTION, PROMPT_INJECTION)
Full Analysis
- [REMOTE_CODE_EXECUTION]: The skill documentation (`SKILL.md`) and examples (`examples.md`) provide code snippets that use the Python `exec()` function to execute logic generated by an AI model. This creates a critical vulnerability where an attacker could use prompt injection to trick the model into generating and executing malicious code on the host system.
- [PROMPT_INJECTION]: The skill processes untrusted output from an LLM and uses it in highly privileged operations, creating a significant indirect prompt injection surface.
- Ingestion points: The skill ingests AI-generated code and text via `pred.code` and `pred.answer` in both `SKILL.md` and `examples.md`.
- Boundary markers: There are no delimiters or instructions provided to differentiate between intended data and potentially malicious instructions embedded in the AI output.
- Capability inventory: The skill examples utilize the `exec()` function in `SKILL.md` and `examples.md` to run code produced by the model.
- Sanitization: No sanitization, syntax validation, or effective sandboxing is implemented for the AI-generated code before execution.
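To illustrate the missing syntax validation noted above, here is a minimal sketch of a reward function that scores a candidate's `pred.code` without ever calling `exec()`. This is not the skill's code: `ALLOWED_NODES`, `validate_generated_code`, and `safe_reward` are hypothetical names, and the `(args, pred)` signature is an assumption about how a best-of-N reward function receives a prediction. A static allowlist like this only rejects obviously dangerous constructs; it is not a substitute for running untrusted code in an isolated environment.

```python
import ast
from types import SimpleNamespace

# Hypothetical allowlist: only simple arithmetic and assignment are permitted.
# Calls, imports, attribute access, and definitions are rejected by omission.
ALLOWED_NODES = (
    ast.Module, ast.Expr, ast.Assign, ast.Name, ast.Load, ast.Store,
    ast.Constant, ast.BinOp, ast.UnaryOp, ast.Add, ast.Sub, ast.Mult,
    ast.Div, ast.USub,
)


def validate_generated_code(code: str) -> bool:
    """Return True only if `code` parses and contains allowlisted nodes only."""
    try:
        tree = ast.parse(code)
    except (SyntaxError, ValueError):
        # Unparseable input (including embedded null bytes) is rejected.
        return False
    return all(isinstance(node, ALLOWED_NODES) for node in ast.walk(tree))


def safe_reward(args, pred) -> float:
    """Hypothetical reward function: inspects pred.code statically, never runs it."""
    code = getattr(pred, "code", "")
    if not isinstance(code, str) or not code.strip():
        return 0.0
    return 1.0 if validate_generated_code(code) else 0.0


# Stand-in predictions; a real pipeline would pass the model's output object.
benign = SimpleNamespace(code="x = 1 + 2")
hostile = SimpleNamespace(code="__import__('os').system('id')")
print(safe_reward({}, benign), safe_reward({}, hostile))
```

The allowlist approach fails closed: any construct not explicitly listed, such as a function call or an `import`, causes the candidate to score zero instead of being executed.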
Recommendations
- AI detected serious security threats
Audit Metadata