langgraph

Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGHCOMMAND_EXECUTION
Full Analysis
  • [Dynamic Execution] (HIGH): The 'Basic Agent Graph' section in SKILL.md contains a 'calculator' tool defined as '@tool def calculator(expression: str) -> str: return str(eval(expression))'. This implementation uses the eval() function on the expression argument, which is generated by the LLM based on user input. This facilitates arbitrary Python code execution if an attacker crafts a prompt that causes the LLM to pass malicious Python code into the tool.\n- [Indirect Prompt Injection] (LOW): The skill describes an architecture vulnerable to indirect prompt injection where untrusted data is processed by the agent.\n
  • Ingestion points: The 'messages' key in 'AgentState' (defined in SKILL.md) receives user input via 'app.invoke()'.\n
  • Boundary markers: No delimiters or safety instructions are present in the code examples to distinguish between system instructions and untrusted user data.\n
  • Capability inventory: The agent has access to the 'calculator' tool (using eval) and a 'search' tool (potential network access).\n
  • Sanitization: No input validation, escaping, or sanitization is performed on the 'expression' passed to 'eval()' or the 'query' passed to 'search()'.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 17, 2026, 06:34 PM