langgraph
Fail
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: HIGH (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION)
Full Analysis
- [COMMAND_EXECUTION] (HIGH): The 'calculator' tool in SKILL.md passes its 'expression' argument to 'eval(expression)' to process mathematical queries. Because 'expression' is a string generated by the LLM, which is in turn influenced by untrusted user input, an attacker can inject code by tricking the agent into passing malicious Python to the tool.
- [REMOTE_CODE_EXECUTION] (HIGH): The 'eval()' sink is a direct Remote Code Execution (RCE) vulnerability. In an agentic workflow it acts as an 'Indirect Prompt Injection' surface: malicious instructions in a user prompt are translated by the LLM into executable code that is passed to the tool.
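One possible mitigation for both findings, sketched below, is to replace 'eval(expression)' with an evaluator that parses the string using Python's standard 'ast' module and walks only an allowlist of arithmetic nodes. This is a minimal illustration, not the skill's actual code: the names 'safe_calculate', '_evaluate' and 'MAX_EXPRESSION_LENGTH' are hypothetical, the 200-character limit is an arbitrary assumption, and exponentiation is deliberately left out to avoid very expensive computations.

```python
import ast
import operator

# Hypothetical limit (an assumption, not taken from the audited skill).
MAX_EXPRESSION_LENGTH = 200

# Allowlisted operators; any node type outside these tables is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node):
    """Recursively evaluate an AST node, accepting only plain arithmetic."""
    # Numeric literals only; type() is used so that bools are rejected too.
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        left = _evaluate(node.left)
        right = _evaluate(node.right)
        return _BINARY_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    # Names, calls, attribute access, subscripts, etc. all end up here.
    raise ValueError(f"unsupported expression element: {type(node).__name__}")


def safe_calculate(expression: str):
    """Evaluate a basic arithmetic expression without calling eval()."""
    if len(expression) > MAX_EXPRESSION_LENGTH:
        raise ValueError("expression too long")
    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError as exc:
        raise ValueError("not a valid expression") from exc
    return _evaluate(tree.body)
```

With this approach, an input such as "2 + 3 * 4" is evaluated normally, while a payload such as "__import__('os').system('id')" raises ValueError because function calls and names are not in the allowlist. Division by zero still raises ZeroDivisionError, so the calling tool would need to handle that case.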
Recommendations
- AI detected serious security threats
Audit Metadata