ai-agent-basics
Fail
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: HIGH
Tags: COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, EXTERNAL_DOWNLOADS
Full Analysis
- [COMMAND_EXECUTION] (HIGH): Insecure code execution via `eval()` in `scripts/react_agent.py`.
  - Evidence: `scripts/react_agent.py`, line 84: `result = eval(tool_input["expression"], {"__builtins__": {}}, {})`.
  - Risk: Python's `eval()` is fundamentally difficult to sandbox. Attackers can use introspection techniques (e.g., accessing `__class__` attributes of basic types) to retrieve references to the `os` module or other dangerous built-ins, bypassing the provided restrictions and executing arbitrary system commands.
- [REMOTE_CODE_EXECUTION] (HIGH): Indirect RCE through LLM tool calling.
  - Ingestion Point: The user-provided `query` in `scripts/react_agent.py` enters the LLM context.
  - Capability Inventory: The `calculator` tool allows the LLM to execute Python code via `eval()`.
  - Risk: An attacker can use prompt injection to trick the LLM into generating a malicious Python payload. Because the LLM's output is evaluated directly, this creates a path to remote code execution.
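The introspection bypass described above takes only a few lines to demonstrate. This is a standalone sketch, independent of the audited repository: even with `__builtins__` emptied, attribute access still works, so an attacker can enumerate every loaded class and hunt for one that reaches `os` or `subprocess`.

```python
# Sketch: eval() with stripped builtins is not a sandbox.
# The payload uses no builtin names at all, only attribute access,
# so the empty {"__builtins__": {}} mapping does not block it.
payload = "().__class__.__base__.__subclasses__()"

# Same restricted environment shape as the audited call on line 84:
classes = eval(payload, {"__builtins__": {}}, {})

# The expression succeeds and enumerates every class loaded in the
# interpreter, giving an attacker a foothold to locate dangerous types.
print(len(classes) > 0)  # True
```

Because the expression never names a builtin, restricting `__builtins__` is irrelevant; the only reliable fix is to stop handing attacker-influenced strings to `eval()` at all.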
- [EXTERNAL_DOWNLOADS] (LOW): Dependency on external libraries.
  - Evidence: `SKILL.md` and `scripts/react_agent.py` reference `langgraph` and `langchain-anthropic`.
  - Status: These are trusted libraries from known organizations, though not on the explicit whitelist. Their usage is standard for the skill's purpose.
Recommendations
- Serious security threats were detected; remediation is required before this skill can pass.
- Replace the `eval()`-based `calculator` tool with a restricted arithmetic evaluator so that LLM-generated input can never reach arbitrary Python execution.
- Treat the user-supplied `query` as untrusted input: the prompt-injection path into `eval()` must be closed, not merely filtered.
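One remediation sketch (not the audited project's code): instead of `eval()`, parse the expression with the standard-library `ast` module and walk only numeric-arithmetic nodes, rejecting names, attributes, and calls outright. The `safe_eval` helper below is hypothetical and illustrative.

```python
import ast
import operator

# Whitelist of arithmetic operators; anything outside this table is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def safe_eval(expression: str) -> float:
    """Evaluate a pure-arithmetic expression; raise ValueError on anything else."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        # Names, attributes, calls, subscripts, etc. all land here.
        raise ValueError(f"disallowed expression element: {type(node).__name__}")
    return walk(ast.parse(expression, mode="eval"))

print(safe_eval("2 * (3 + 4)"))  # 14
# safe_eval("().__class__") raises ValueError instead of executing.
```

Because the walker recognizes only constants and whitelisted operators, the introspection payloads from the findings above fail at parse-walk time rather than being executed.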