langgraph

Fail

Audited by Gen Agent Trust Hub on Mar 13, 2026

Risk Level: HIGH
Tags: REMOTE_CODE_EXECUTION, COMMAND_EXECUTION, PROMPT_INJECTION
Full Analysis
  • [REMOTE_CODE_EXECUTION]: The code example in Step 3 of SKILL.md defines a calculate tool that uses the Python eval() function to process a string input named expression. This function executes any string passed to it as Python code. If an AI agent using this tool is provided with a malicious expression by a user, it could lead to arbitrary code execution on the system where the agent is running.
  • [COMMAND_EXECUTION]: The use of eval() in the provided tool definition allows for the execution of arbitrary Python commands, which can be used to bypass security restrictions or interact with the underlying operating system.
  • [PROMPT_INJECTION]: The skill exposes an indirect prompt-injection surface through the calculate tool: untrusted content processed by the agent (e.g. a document or web page) can embed an expression that the agent then forwards to the tool for execution.
  • Ingestion points: Untrusted data enters the tool via the expression parameter in the calculate function (SKILL.md).
  • Boundary markers: There are no markers or instructions provided to delimit user input or prevent the execution of embedded instructions.
  • Capability inventory: The tool has the capability to execute code via eval() (SKILL.md).
  • Sanitization: No sanitization, validation, or escaping is performed on the input string before it is passed to the execution function.
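The vulnerable pattern described above can be sketched as follows. This is a hypothetical reconstruction of the calculate tool from SKILL.md, not the audited code itself; the function name and parameter match the audit's description, the body is an assumption.

```python
# Hypothetical reconstruction of the flagged pattern: a "calculate" tool
# that passes its untrusted `expression` input straight to eval().
def calculate(expression: str) -> str:
    """Evaluate a math expression. UNSAFE: eval() runs arbitrary Python."""
    # No sanitization, validation, or escaping happens before this call,
    # so any Python code in `expression` executes with the agent's privileges.
    return str(eval(expression))

# A benign call behaves as intended...
print(calculate("2 + 3 * 4"))  # → 14
# ...but a payload such as "__import__('os').system('id')" would execute
# just as readily, which is the remote-code-execution risk flagged above.
```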
Recommendations
  • AI detected serious security threats in this skill.
  • Replace the eval() call in the calculate tool with a restricted evaluator (for example, an AST walker that permits only numeric literals and arithmetic operators).
  • Validate and sanitize the expression input before evaluation, and treat all tool inputs as untrusted.
  • Add boundary markers or instructions that delimit user-supplied data so embedded instructions are not executed.
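One standard mitigation for the eval() finding is to evaluate the expression by walking its AST and allowing only arithmetic nodes. The sketch below is illustrative, not taken from the audited skill; the function name safe_calculate is an assumption.

```python
import ast
import operator

# Whitelist of permitted arithmetic operators; everything else is rejected.
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.USub: operator.neg,
}

def safe_calculate(expression: str) -> float:
    """Evaluate arithmetic only, by walking the AST instead of calling eval()."""
    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        # Function calls, attribute access, names, etc. all land here.
        raise ValueError("disallowed expression element")
    return _eval(ast.parse(expression, mode="eval"))

print(safe_calculate("2 + 3 * 4"))   # → 14
# safe_calculate("__import__('os')")  raises ValueError instead of executing
```

Because the walker only recognizes numeric constants and the listed operators, payloads like `__import__('os').system('id')` fail at the `Call` node rather than reaching the operating system.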
Audit Metadata
Risk Level
HIGH
Analyzed
Mar 13, 2026, 09:15 PM