agentic-rag

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: CRITICALREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • REMOTE_CODE_EXECUTION (CRITICAL): The 'calculator' tool implementation in the 'Tool-Using RAG Agent' section of SKILL.md uses the Python 'eval()' function on input provided by the agent. This allows an attacker who can influence the agent's input (via prompt injection) to execute arbitrary Python code on the host system.
  • PROMPT_INJECTION (HIGH): The skill is highly vulnerable to Indirect Prompt Injection. 1. Ingestion points: state['question'] in the analyze_query node and context (retrieved docs) in the synthesize_answer node. 2. Boundary markers: Absent; untrusted data is directly interpolated into system prompts. 3. Capability inventory: Includes document retrieval (search_docs, search_tickets) and the critical eval() capability. 4. Sanitization: Absent; no escaping or validation of external content.
  • COMMAND_EXECUTION (HIGH): The eval() vulnerability facilitates command execution on the underlying operating system (e.g., via the 'os' or 'subprocess' modules), enabling full system compromise.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
CRITICAL
Analyzed
Feb 16, 2026, 05:35 AM