recall

Fail

Audited by Gen Agent Trust Hub on Feb 15, 2026

Risk Level: CRITICALCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION] (CRITICAL): The execution logic uses a shell command template python ... --query "<ARGS>" where <ARGS> is replaced by raw user input. This allows for classic shell breakout (e.g., using "; [malicious command] #") to execute arbitrary commands on the underlying system.
  • [REMOTE_CODE_EXECUTION] (HIGH): The command injection vulnerability can be leveraged to download and execute remote payloads (e.g., via curl | bash), providing full remote control over the agent's execution environment.
  • [PROMPT_INJECTION] (HIGH): The skill is susceptible to Indirect Prompt Injection (Category 8). It retrieves historical data from a database and injects it directly into the agent's current context.
  • Ingestion points: Data is pulled from a PostgreSQL database via scripts/core/recall_learnings.py.
  • Boundary markers: None. The output is presented as plain markdown, allowing instructions embedded in historical data to blend with system instructions.
  • Capability inventory: The system uses uv run to execute Python scripts and has access to environment variables like $CLAUDE_OPC_DIR.
  • Sanitization: None detected. The skill returns "full content" of matched learnings without filtering for malicious directives.
  • [DATA_EXFILTRATION] (MEDIUM): While the retrieval of past learnings is a core feature, the lack of input sanitization combined with command execution capabilities allows an attacker to query and exfiltrate the entire semantic memory database or local sensitive files.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
CRITICAL
Analyzed
Feb 15, 2026, 11:41 PM