python-sdk
Fail
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: HIGH (REMOTE_CODE_EXECUTION, COMMAND_EXECUTION)
Full Analysis
- REMOTE_CODE_EXECUTION (HIGH): The documentation in `references/tool-builder.md` and `references/agent-patterns.md` provides examples for handling tool calls using `eval(call.args['expression'])`. This allows an AI agent (which can be manipulated via prompt injection) to execute arbitrary Python code on the host machine.
  - Evidence: `references/tool-builder.md` line 228 and `references/agent-patterns.md` (Handling Tool Calls section).
- COMMAND_EXECUTION (MEDIUM): The skill documents the `internal_tools().code_execution(True)` feature. While a legitimate SDK capability, enabling this without strict sandboxing allows the agent to execute code, which can be exploited by an attacker via indirect prompt injection.
  - Evidence: `references/agent-patterns.md` line 80 and `references/tool-builder.md` line 194.
- DATA_EXFILTRATION (LOW): The `webhook_tool` pattern in `references/tool-builder.md` demonstrates sending data to external URLs (e.g., Slack, GitHub). Without validation of the destination URL or the data being sent, this pattern can be used to exfiltrate sensitive information if the agent is compromised.
  - Evidence: `references/tool-builder.md` lines 156-174.
- INDIRECT PROMPT INJECTION (LOW): The skill lacks documentation on sanitizing data received from external sources (e.g., `app_tool` results or web search) before passing it back into the agent context, creating a vulnerability surface for indirect prompt injection.
  - Evidence: Found across all pattern files involving external data ingestion (RAG patterns, file processing).
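
For the `eval(call.args['expression'])` finding, one possible mitigation is to parse the expression and evaluate only a small whitelist of arithmetic nodes. The sketch below is an illustration, not part of the audited skill or the SDK: the function name `safe_arithmetic`, the operator whitelist, and the 200-character cap are all assumptions. It uses only the standard-library `ast` module.

```python
import ast
import operator
from typing import Union

# Whitelisted operators (an assumption for this sketch); anything else is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_arithmetic(expression: str) -> Union[int, float]:
    """Evaluate a plain arithmetic expression without calling eval()."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Accept only int/float literals (bool, str, etc. are refused).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        # Names, calls, attribute access, etc. all end up here.
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    if len(expression) > 200:  # arbitrary cap, chosen for illustration
        raise ValueError("expression too long")
    return _eval(ast.parse(expression, mode="eval"))
```

A tool-call handler could then call `safe_arithmetic(call.args['expression'])` in place of `eval(...)`. Input such as `__import__('os').system('id')` raises `ValueError` because function calls are not in the whitelist. Exponentiation is deliberately left out so that a prompt-injected expression cannot request an extremely large power.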
Recommendations
- AI detected serious security threats
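
For the `webhook_tool` finding, destination validation could take the form of a host allowlist that is checked before any request is sent. This is a minimal sketch under stated assumptions: `ALLOWED_WEBHOOK_HOSTS` and `is_allowed_webhook` are hypothetical names, the listed hosts are placeholders, and nothing here reflects how the audited skill or the SDK handles webhooks.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; the real set of hosts depends on the deployment.
ALLOWED_WEBHOOK_HOSTS = {"hooks.slack.com", "api.github.com"}


def is_allowed_webhook(url: str) -> bool:
    """Return True only for HTTPS URLs whose host is explicitly allowlisted."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError for malformed or out-of-range ports
    except ValueError:
        return False
    # hostname excludes userinfo, so "https://allowed@evil.example" resolves to evil.example.
    host = (parts.hostname or "").lower()
    return (
        parts.scheme == "https"
        and host in ALLOWED_WEBHOOK_HOSTS
        and port in (None, 443)
    )
```

Matching on the exact parsed hostname, not on a substring or prefix of the URL, avoids lookalike destinations such as `hooks.slack.com.evil.example`. This check covers only the destination; filtering the data being sent is a separate concern.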
Audit Metadata