python-sdk
Fail
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: HIGH (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, DATA_EXFILTRATION, PROMPT_INJECTION)
Full Analysis
- [COMMAND_EXECUTION] (HIGH): The documentation in `references/tool-builder.md` explicitly demonstrates using Python's `eval()` function to handle tool arguments (e.g., `eval(call.args['expression'])`). Since these arguments are generated by the LLM based on user prompts, this allows for arbitrary command execution.
- [REMOTE_CODE_EXECUTION] (HIGH): The `internal_tools().code_execution(True)` pattern shown in `references/agent-patterns.md` allows agents to write and run code. In a production environment, this capability requires strict sandboxing, which is not discussed in the reference.
- [DATA_EXFILTRATION] (MEDIUM): The `webhook_tool` capability in `references/tool-builder.md` enables agents to transmit data to arbitrary external URLs. This can be abused to leak sensitive information if an agent is tricked into sending secrets as webhook parameters.
- [PROMPT_INJECTION] (LOW): The RAG and Webhook patterns create a surface for indirect prompt injection where untrusted external data influences agent behavior.
  - Ingestion points: Web search results (`agent-patterns.md`) and Webhook responses (`tool-builder.md`).
  - Boundary markers: Absent in provided system prompts.
  - Capability inventory: Filesystem deletion (`delete_file`), arbitrary code execution (`eval`), and network egress (`webhook_tool`).
  - Sanitization: None demonstrated; external data is processed directly by the LLM.
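To illustrate the COMMAND_EXECUTION finding, here is a minimal sketch of one way an expression-style tool argument could be evaluated without `eval()`. It is not taken from the audited SDK or its references: the function name `safe_arithmetic` and the choice of permitted operators are assumptions made for illustration, and it only covers plain arithmetic.

```python
import ast
import operator

# Operators the evaluator accepts; every other syntax element is rejected.
# (Exponentiation is deliberately left out to avoid huge intermediate values.)
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node):
    """Recursively evaluate an AST node made only of whitelisted pieces."""
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    raise ValueError(f"unsupported expression element: {type(node).__name__}")


def safe_arithmetic(expression: str):
    """Hypothetical replacement for eval() on LLM-supplied arithmetic strings."""
    if len(expression) > 200:  # arbitrary illustrative limit
        raise ValueError("expression too long")
    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError as exc:
        raise ValueError("not a valid expression") from exc
    return _evaluate(tree.body)
```

With this approach, an argument such as `__import__('os').system('id')` parses to a call node and is rejected with `ValueError` instead of being executed. Names, attribute access, and function calls are never evaluated.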
Recommendations
- AI detected serious security threats
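The report does not list concrete mitigations. As one possible direction for the DATA_EXFILTRATION finding, the sketch below checks a webhook destination against a host allowlist before any request is sent. The names `ALLOWED_WEBHOOK_HOSTS` and `is_allowed_webhook` and the host `hooks.example.com` are hypothetical and do not come from the audited SDK.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would list its own trusted hosts.
ALLOWED_WEBHOOK_HOSTS = {"hooks.example.com"}


def is_allowed_webhook(url: str) -> bool:
    """Return True only for HTTPS URLs whose host is explicitly allowlisted."""
    try:
        parts = urlsplit(url)
        host = parts.hostname  # lowercased host without port or credentials
    except ValueError:
        # urlsplit raises ValueError for some malformed URLs.
        return False
    return parts.scheme == "https" and host in ALLOWED_WEBHOOK_HOSTS
```

Comparing the parsed hostname exactly, instead of using a substring or prefix match, is what keeps look-alike hosts such as `hooks.example.com.evil.test` from passing. The check would need to run on the final URL the agent supplies, before any data is attached to the request.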
Audit Metadata