NYC

claude-agent-sdk

Fail

Audited by Gen Agent Trust Hub on Feb 15, 2026

Risk Level: CRITICALREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [REMOTE_CODE_EXECUTION] (CRITICAL): The file templates/custom-mcp-server.ts defines a calculate tool that executes arbitrary JavaScript using eval(args.expression). Since the expression is provided by the AI agent (which can be influenced by user prompts or indirect injection), this allows for complete host compromise.
  • [COMMAND_EXECUTION] (HIGH): Multiple templates (templates/multi-agent-workflow.ts, templates/permission-control.ts) demonstrate the use of the Bash tool. While they implement a canUseTool check, it uses a naive substring-based blacklist (e.g., checking for 'rm -rf'). This is easily bypassed by attackers using command obfuscation (e.g., rm "-rf" / or variable interpolation) or non-blacklisted destructive commands.
  • [PROMPT_INJECTION] (HIGH): The skill is designed to build agents that ingest untrusted data from a codebase (via Read, Grep, Glob) and project instructions (via CLAUDE.md).
  • Ingestion points: File system reads in templates/filesystem-settings.ts and templates/query-with-tools.ts.
  • Boundary markers: None. Templates interpolate instructions directly into system prompts without delimiters or isolation.
  • Capability inventory: Full filesystem access (Write, Edit), system command execution (Bash), and network access via custom MCP tools.
  • Sanitization: None provided for file content or tool outputs.
  • [PRIVILEGE_ESCALATION] (HIGH): The templates/permission-control.ts and references/permissions-guide.md explicitly promote the use of permissionMode: "bypassPermissions". While labeled with a warning, providing a template that skips all safety checks for autonomous agents creates a high risk of accidental or malicious privilege escalation in deployment environments.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
CRITICAL
Analyzed
Feb 15, 2026, 11:31 PM