gepetto

Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGHCOMMAND_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
  • COMMAND_EXECUTION (HIGH): The skill instructs the agent to use external CLI tools with flags that intentionally disable security guardrails and human-in-the-loop approvals.
  • Evidence: In references/external-review.md, the Gemini CLI is invoked with --approval-mode yolo and the Codex CLI is used with codex exec --full-auto. These flags allow the external subagents to execute actions or commands without user confirmation.
  • DATA_EXFILTRATION (MEDIUM): The skill reads the entire contents of a local file (claude-plan.md) and sends it to external LLM providers (Google and OpenAI) via CLI commands.
  • Evidence: references/external-review.md uses $(cat '<planning_dir>/claude-plan.md') to pipe full plan contents into external network calls.
  • PROMPT_INJECTION (LOW): The skill is highly vulnerable to Indirect Prompt Injection because it interpolates untrusted file content directly into prompts for other LLMs without any sanitization or boundary markers.
  • Evidence (Mandatory Chain):
  • Ingestion points: references/external-review.md (reads claude-plan.md), references/research-protocol.md (ingests web search results).
  • Boundary markers: Absent. The content is directly injected into the prompt string.
  • Capability inventory: Command execution via Bash, file system writes, and network access (WebSearch/WebFetch).
  • Sanitization: None. The skill assumes the plan and research results are safe.
  • EXTERNAL_DOWNLOADS (SAFE): While the skill mentions the gemini and codex CLIs, it assumes they are already installed by the user rather than attempting to download/install them from untrusted sources at runtime.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 17, 2026, 04:36 PM