agentation

Fail

Audited by Gen Agent Trust Hub on Mar 22, 2026

Risk Level: HIGHREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
  • [REMOTE_CODE_EXECUTION]: The skill provides instructions for configuring automated hooks in agent-specific settings files (e.g., .claude/settings.json, ~/.gemini/settings.json). These hooks execute shell commands during every turn, using curl to fetch data from a local port (http://localhost:4747/pending) and piping it to python3. Automated scans detected instances of these commands being piped to a bare python3 interpreter, which allows for arbitrary code execution if the local server returns malicious scripts.
  • [COMMAND_EXECUTION]: The skill instructs the agent to modify sensitive system-level configuration files for various AI tools (Claude Code, Gemini CLI, Codex) to inject persistent shell execution patterns. This behavior automates the execution of third-party logic across all user interaction sessions.
  • [EXTERNAL_DOWNLOADS]: The skill uses npx -y agentation-mcp server, which downloads and executes packages from the npm registry without verifying the package integrity or requiring user confirmation for the specific version.
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection (Category 8). It establishes a 'watch loop' where the agent processes human-provided UI annotations and is explicitly instructed to 'Make the minimal change described in the comment.'
  • Ingestion points: Data enters the agent context via the agentation_get_pending or agentation_watch_annotations tools, which fetch comments from a local HTTP server.
  • Boundary markers: None. The provided instructions do not include any delimiters or safety warnings to ignore embedded instructions within the user comments.
  • Capability inventory: The skill uses high-privilege tools (Bash, Write, Grep) that grant the agent full filesystem access to execute the untrusted instructions.
  • Sanitization: No validation or sanitization of the user comments is performed, allowing an attacker to embed malicious commands that the agent will execute as part of its fix routine.
Recommendations
  • HIGH: Downloads and executes remote code from: http://localhost:4747/pending - DO NOT USE without thorough review
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Mar 22, 2026, 08:39 AM