multi-model-planning

Fail

Audited by Gen Agent Trust Hub on May 2, 2026

Risk Level: HIGHREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
  • [REMOTE_CODE_EXECUTION]: The skill provides instructions to install the Cursor Agent CLI by fetching an installation script from https://cursor.com/install and piping it directly to the bash shell.
  • [COMMAND_EXECUTION]: The skill makes extensive use of shell tools to manage temporary workspace directories in /tmp/, manipulate plan files using cat and tee, and execute multiple external AI CLI utilities (agent, claude, codex) with arguments constructed from local file content.
  • [DATA_EXFILTRATION]: Technical plans and codebase snippets are transmitted to external third-party AI providers (Anthropic, OpenAI, Cursor) via their respective command-line interfaces for the purpose of conducting multi-model consensus reviews.
  • [PROMPT_INJECTION]:
  • The skill defines a sub-agent workflow using templates (assets/reviewer-prompt-template.md) that employ role-play instructions, directing the AI to act as a senior engineer.
  • Indirect Prompt Injection Surface:
  • Ingestion points: The skill reads and processes plan content from generated files stored in the temporary planning directory.
  • Boundary markers: Employs a specific markdown header (# Proposal to review) to delineate instructions from the data being reviewed in the prompt template.
  • Capability inventory: The skill possesses capabilities to execute shell commands, write to the file system, and invoke external network-connected CLIs.
  • Sanitization: There is no evidence of explicit sanitization or escaping of the plan content before it is interpolated into prompts for the reviewer models.
Recommendations
  • HIGH: Downloads and executes remote code from: https://cursor.com/install - DO NOT USE without thorough review
Audit Metadata
Risk Level
HIGH
Analyzed
May 2, 2026, 01:54 PM