python-sdk
Fail
Audited by Gen Agent Trust Hub on Mar 12, 2026
Risk Level: HIGH
Tags: COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, EXTERNAL_DOWNLOADS, PROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The SDK documentation (specifically references/tool-builder.md and references/agent-patterns.md) provides examples where Python's eval() function is used to process mathematical expressions or logic provided by the AI agent through tool calls. Since the agent's output is essentially untrusted input, using eval() allows the agent to execute arbitrary Python code on the host machine.
- [REMOTE_CODE_EXECUTION]: The SDK includes a built-in code_execution internal tool (internal_tools().code_execution(True)) that explicitly allows an AI agent to write and run code within its execution environment. While this is a documented feature, the capability effectively allows arbitrary code execution.
- [EXTERNAL_DOWNLOADS]: Documentation examples in references/files.md demonstrate the use of the requests library to download files from remote URLs, and SKILL.md shows external skill content being referenced via URLs, which involves fetching and potentially processing data from the public internet.
- [PROMPT_INJECTION]: The skill exhibits an attack surface for indirect prompt injection due to its extensive capabilities and ingestion of external data.
  1. Ingestion points: ingests untrusted data through agent.send_message and tool results.
  2. Boundary markers: none are present in the provided code examples to delimit user/agent instructions from data.
  3. Capability inventory: includes arbitrary code execution, filesystem read/write (e.g., the delete_file and save_chat patterns), and network operations via webhooks.
  4. Sanitization: the examples show no evidence of sanitization or validation of LLM-generated arguments before they are passed to sensitive functions like eval() or used in filesystem operations.
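To illustrate the eval() finding above, the following sketch (function name hypothetical, not from the SDK) shows one common mitigation: replacing eval() with ast.literal_eval, which accepts only Python literals and raises an error for any expression that could execute code.

```python
import ast

def evaluate_tool_argument(expr: str):
    """Safely parse an agent-supplied value.

    ast.literal_eval accepts only Python literals (numbers, strings,
    tuples, lists, dicts, sets, booleans, None) and raises ValueError
    or SyntaxError for anything else, so agent output cannot trigger
    code execution the way eval() would.
    """
    return ast.literal_eval(expr)

# A plain literal is parsed as data:
evaluate_tool_argument("[1, 2, 3]")  # returns [1, 2, 3]

# With eval() this string would import os and run a shell command;
# ast.literal_eval rejects it instead of executing it:
try:
    evaluate_tool_argument("__import__('os').system('id')")
except (ValueError, SyntaxError):
    pass  # malicious expression rejected, nothing executed
```

This does not substitute for validating tool arguments against a schema, but it removes the arbitrary-code-execution path that the audit flags in the documented eval() examples.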
Recommendations
- The AI analysis detected serious security threats in this skill; review the findings above before installing or enabling it.
Audit Metadata