guidance
Fail
Audited by Gen Agent Trust Hub on Mar 14, 2026
Risk Level: HIGHREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [REMOTE_CODE_EXECUTION]: The skill's examples for building 'ReAct Agents' (found in
SKILL.mdandreferences/examples.md) demonstrate a 'calculator' tool implemented using the Pythoneval()function. Specifically, the tool is defined aslambda expr: eval(expr). Since the input to this function is generated directly by the LLM (gen("action_input")), a malicious actor could use prompt injection to force the LLM to output executable code (e.g.,__import__('os').system('...')), leading to full system compromise. - [COMMAND_EXECUTION]: The inclusion of
eval()in the tool-execution workflow facilitates arbitrary command execution on the host environment. - [PROMPT_INJECTION]: Analysis of prompt construction across all files reveals a significant vulnerability surface for indirect prompt injection.
- Ingestion points: Untrusted data enters the context through variables like
question,text,message, andqueryin functions and templates. - Boundary markers: Most examples fail to use clear delimiters or instructions to the model to ignore instructions within the processed data (e.g.,
lm += f"Question: {question}\n\n"). - Capability inventory: The presence of the
eval()-based tool execution loop provides a high-impact capability for any successful injection to exploit. - Sanitization: There is no evidence of input validation, escaping, or filtering for external content before it is interpolated into the prompt sequence.
Recommendations
- AI detected serious security threats
Audit Metadata