langgraph
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: CRITICAL (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION)
Full Analysis
- Command Execution (CRITICAL): The calculator tool implementation in SKILL.md uses the Python eval() function on the expression argument. This is a critical vulnerability because an attacker can use prompt injection to execute arbitrary code (e.g., __import__('os').system('ls')) instead of performing mathematical calculations.
- Dynamic Execution (CRITICAL): The use of eval() on unsanitized strings is an unsafe dynamic execution pattern that allows arbitrary code execution within the agent's runtime environment.
- Indirect Prompt Injection (HIGH): The skill exposes a dangerous execution capability to untrusted input without any sanitization or validation.
  - Ingestion points: The expression argument of the calculator tool in SKILL.md.
  - Boundary markers: None.
  - Capability inventory: Arbitrary Python execution via eval().
  - Sanitization: None.
Recommendations
- AI detected serious security threats
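One possible way to address the eval() finding, offered as a sketch rather than as the audit's own prescription, is to parse the expression with Python's standard ast module and evaluate only a whitelist of arithmetic nodes. The names safe_calculate and MAX_EXPRESSION_LENGTH are hypothetical and do not come from SKILL.md; the length cap and the choice to leave out exponentiation are assumptions made to limit resource exhaustion.

```python
import ast
import operator
from typing import Union

Number = Union[int, float]

# Whitelist of arithmetic operators; anything not listed is rejected.
# Exponentiation is deliberately omitted (assumption) to avoid huge results.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

MAX_EXPRESSION_LENGTH = 200  # hypothetical cap, chosen arbitrarily


def _evaluate(node: ast.AST) -> Number:
    """Recursively evaluate a node, allowing only numbers and whitelisted operators."""
    # Exact type check so that booleans and strings are not accepted as numbers.
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    # Calls, attribute access, names, subscripts, etc. all end up here.
    raise ValueError(f"unsupported syntax: {type(node).__name__}")


def safe_calculate(expression: str) -> Number:
    """Evaluate a basic arithmetic expression without calling eval()."""
    expression = expression.strip()
    if len(expression) > MAX_EXPRESSION_LENGTH:
        raise ValueError("expression too long")
    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError as exc:
        raise ValueError("not a valid arithmetic expression") from exc
    return _evaluate(tree.body)
```

With this approach, an input such as __import__('os').system('ls') parses to a function-call node, which is not on the whitelist, so it raises ValueError instead of running. Division by zero still raises ZeroDivisionError, which the calling tool would need to handle.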
Audit Metadata