agentic-rag
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: CRITICALREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- REMOTE_CODE_EXECUTION (CRITICAL): The 'calculator' tool implementation in the 'Tool-Using RAG Agent' section of
SKILL.mduses the Python 'eval()' function on input provided by the agent. This allows an attacker who can influence the agent's input (via prompt injection) to execute arbitrary Python code on the host system. - PROMPT_INJECTION (HIGH): The skill is highly vulnerable to Indirect Prompt Injection. 1. Ingestion points:
state['question']in theanalyze_querynode andcontext(retrieved docs) in thesynthesize_answernode. 2. Boundary markers: Absent; untrusted data is directly interpolated into system prompts. 3. Capability inventory: Includes document retrieval (search_docs, search_tickets) and the criticaleval()capability. 4. Sanitization: Absent; no escaping or validation of external content. - COMMAND_EXECUTION (HIGH): The
eval()vulnerability facilitates command execution on the underlying operating system (e.g., via the 'os' or 'subprocess' modules), enabling full system compromise.
Recommendations
- AI detected serious security threats
Audit Metadata