paid-agent

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: CRITICAL (PROMPT_INJECTION, DATA_EXFILTRATION, REMOTE_CODE_EXECUTION)
Full Analysis
  • [PROMPT_INJECTION] (CRITICAL): User-provided variables like {repo-name} and {DESCRIPTION} are interpolated directly into shell commands without escaping or validation. A malicious user could provide a name like 'my-repo; rm -rf /' to execute arbitrary commands on the host system.
  • [DATA_EXFILTRATION] (HIGH): In Step 5, the skill executes git add -A followed by gh repo create --public and a push. This workflow automatically publishes every file in the project directory that is not explicitly ignored to a public repository, potentially including environment files, secrets, or SSH keys.
  • [REMOTE_CODE_EXECUTION] (HIGH): The skill orchestrates an LLM development loop (ralph-wiggum) to generate code and then proceeds to deploy that code to Railway and GitHub. This allows unvalidated code generated from untrusted natural language prompts to be executed and deployed to production infrastructure without a mandatory human-in-the-loop review.
  • [INDIRECT_PROMPT_INJECTION] (HIGH): Vulnerability Surface: (1) Ingestion Points: {DESCRIPTION} variable in the Skill argument; (2) Boundary Markers: None; (3) Capability Inventory: Bash, Skill, Read, and Write; (4) Sanitization: None. The skill treats external user input as a high-trust instruction set for an agent with full filesystem and execution access.
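
The shell-interpolation finding above can be illustrated with a minimal Python sketch of one possible mitigation: validate the repository name against an allow-list and build an argument vector instead of a shell string. The function name build_repo_create_argv, the allow-list pattern, and the choice of --private are assumptions made for illustration; none of them come from the audited skill.

```python
import re

# Hypothetical allow-list (an assumption, not taken from the audited skill):
# letters, digits, dot, underscore, and hyphen, at most 100 characters.
_REPO_NAME_RE = re.compile(r"[A-Za-z0-9._-]{1,100}")


def build_repo_create_argv(repo_name: str, description: str) -> list[str]:
    """Return an argv list for `gh repo create`, rejecting unsafe names."""
    # Reject shell metacharacters, whitespace, and names that look like flags.
    if not _REPO_NAME_RE.fullmatch(repo_name) or repo_name.startswith("-"):
        raise ValueError(f"unsafe repository name: {repo_name!r}")
    # Each element is passed to the program verbatim when the list is run
    # without a shell, so the description cannot introduce extra commands.
    return [
        "gh", "repo", "create", repo_name,
        "--private",  # assumption: default to private rather than --public
        f"--description={description}",
    ]
```

A list built this way would be handed to subprocess.run(argv) with the default shell=False, so a value such as 'my-repo; rm -rf /' is rejected as a name and is inert as a description.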
Recommendations
  • AI detected serious security threats
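
For the DATA_EXFILTRATION finding, one possible safeguard is to screen the list of files about to be published before any push. The sketch below is illustrative only: the pattern list and the function name find_sensitive_paths are assumptions, and a real deployment would likely need a broader rule set or a dedicated secret scanner.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical, deliberately short pattern list (an assumption for illustration).
SENSITIVE_PATTERNS = (".env", ".env.*", "*.pem", "*.key", "id_rsa", "id_ed25519")


def find_sensitive_paths(paths):
    """Return the paths whose file name matches a sensitive pattern."""
    flagged = []
    for path in paths:
        # Match on the final path component only, case-sensitively.
        name = PurePosixPath(path).name
        if any(fnmatchcase(name, pattern) for pattern in SENSITIVE_PATTERNS):
            flagged.append(path)
    return flagged
```

The input could be the output of git diff --cached --name-only, split into lines; a non-empty result would block the push until a human reviews the flagged files.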
Audit Metadata
Risk Level: CRITICAL
Analyzed: Feb 16, 2026, 10:59 AM