gepetto
Fail
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: HIGHCOMMAND_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
- COMMAND_EXECUTION (HIGH): The skill instructs the agent to use external CLI tools with flags that intentionally disable security guardrails and human-in-the-loop approvals.
- Evidence: In
references/external-review.md, the Gemini CLI is invoked with--approval-mode yoloand the Codex CLI is used withcodex exec --full-auto. These flags allow the external subagents to execute actions or commands without user confirmation. - DATA_EXFILTRATION (MEDIUM): The skill reads the entire contents of a local file (
claude-plan.md) and sends it to external LLM providers (Google and OpenAI) via CLI commands. - Evidence:
references/external-review.mduses$(cat '<planning_dir>/claude-plan.md')to pipe full plan contents into external network calls. - PROMPT_INJECTION (LOW): The skill is highly vulnerable to Indirect Prompt Injection because it interpolates untrusted file content directly into prompts for other LLMs without any sanitization or boundary markers.
- Evidence (Mandatory Chain):
- Ingestion points:
references/external-review.md(readsclaude-plan.md),references/research-protocol.md(ingests web search results). - Boundary markers: Absent. The content is directly injected into the prompt string.
- Capability inventory: Command execution via Bash, file system writes, and network access (WebSearch/WebFetch).
- Sanitization: None. The skill assumes the plan and research results are safe.
- EXTERNAL_DOWNLOADS (SAFE): While the skill mentions the
geminiandcodexCLIs, it assumes they are already installed by the user rather than attempting to download/install them from untrusted sources at runtime.
Recommendations
- AI detected serious security threats
Audit Metadata