claude-api

Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGHREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
  • REMOTE_CODE_EXECUTION (HIGH): The calculatorTool in templates/tool-use-advanced.ts uses the JavaScript eval() function to evaluate mathematical expressions provided by the AI.
  • Evidence: const result = eval(input.expression); within the run method of the calculatorTool definition.
  • Risk: Although the code includes a comment warning about the dangers of eval(), providing it in a functional template makes it highly likely to be adopted in production. Since the AI generates the expression input based on user prompts, an attacker could use indirect prompt injection to execute arbitrary code on the agent's host system.
  • PROMPT_INJECTION (LOW): The skill demonstrates a vulnerability to indirect prompt injection by exposing high-privilege capabilities to untrusted data without sanitization.
  • Ingestion points: The expression property in the calculatorTool input schema (templates/tool-use-advanced.ts).
  • Boundary markers: Absent; the templates do not implement delimiters or 'ignore' instructions for the tool inputs.
  • Capability inventory: Host-level JavaScript execution via the eval() call in templates/tool-use-advanced.ts.
  • Sanitization: Absent; the template relies on a code comment rather than implementing a safe mathematical parser or input validation.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 17, 2026, 04:42 PM