paid-agent
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: CRITICAL (PROMPT_INJECTION, DATA_EXFILTRATION, REMOTE_CODE_EXECUTION)
Full Analysis
- [PROMPT_INJECTION] (CRITICAL): User-provided variables like {repo-name} and {DESCRIPTION} are interpolated directly into shell commands without escaping or validation. A malicious user could provide a name like 'my-repo; rm -rf /' to execute arbitrary commands on the host system.
- [DATA_EXFILTRATION] (HIGH): In Step 5, the skill executes git add -A followed by gh repo create --public and a push. This workflow automatically exposes all local files in the project directory to a public repository, including potentially sensitive environment variables, secrets, or SSH keys not explicitly ignored.
- [REMOTE_CODE_EXECUTION] (HIGH): The skill orchestrates an LLM development loop (ralph-wiggum) to generate code and then proceeds to deploy that code to Railway and GitHub. This allows unvalidated code generated from untrusted natural language prompts to be executed and deployed to production infrastructure without a mandatory human-in-the-loop review.
- [INDIRECT_PROMPT_INJECTION] (HIGH): Vulnerability Surface: (1) Ingestion Points: {DESCRIPTION} variable in the Skill argument; (2) Boundary Markers: None; (3) Capability Inventory: Bash, Skill, Read, and Write; (4) Sanitization: None. The skill treats external user input as a high-trust instruction set for an agent with full filesystem and execution access.
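The command-injection finding above can be illustrated with a minimal sketch of one possible mitigation. It is not taken from the audited skill: the function names, the allow-list pattern, and the choice of --private are assumptions made for illustration. The idea is to validate the repository name against a strict allow-list and to build the command as an argument list, so that no shell ever parses the user-supplied values.

```python
import re

# Hypothetical allow-list (an assumption, not the skill's rule):
# letters, digits, '.', '_' and '-', at most 100 characters.
_REPO_NAME = re.compile(r"[A-Za-z0-9._-]{1,100}")


def validate_repo_name(name: str) -> str:
    """Return the name unchanged if it is safe, otherwise raise ValueError."""
    # A leading '-' is rejected so the name cannot be mistaken for a CLI flag.
    if not _REPO_NAME.fullmatch(name) or name.startswith("-"):
        raise ValueError(f"unsafe repository name: {name!r}")
    return name


def build_create_command(repo_name: str, description: str) -> list:
    """Build an argv list for `gh repo create`; nothing is run here.

    The description is attached with '=' so it stays a single argument,
    whatever characters it contains.
    """
    return [
        "gh", "repo", "create", validate_repo_name(repo_name),
        "--private",
        f"--description={description}",
    ]
```

A list built this way could be passed to subprocess.run without shell=True, in which case a description such as 'demo; rm -rf /' reaches gh as literal text. The example name from the finding, 'my-repo; rm -rf /', is rejected by validate_repo_name before any command is assembled.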
Recommendations
- AI detected serious security threats
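For the DATA_EXFILTRATION finding, one conceivable safeguard is to screen the file list before anything is staged or pushed. The sketch below is an assumption-laden illustration, not part of the audited skill: the SENSITIVE_PATTERNS deny-list and the find_sensitive helper are hypothetical, and a real deny-list would need to be far more complete.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical deny-list of file-name patterns (illustrative, not exhaustive).
SENSITIVE_PATTERNS = (
    ".env", ".env.*",
    "*.pem", "*.key",
    "id_rsa", "id_ed25519",
)


def find_sensitive(paths):
    """Return the paths whose file name matches a sensitive pattern."""
    flagged = []
    for path in paths:
        # Only the final path component is compared against the patterns.
        name = PurePosixPath(path).name
        if any(fnmatchcase(name, pattern) for pattern in SENSITIVE_PATTERNS):
            flagged.append(path)
    return flagged
```

The input could be the list of staged paths, for example from git diff --cached --name-only, and the workflow could refuse to push whenever find_sensitive returns a non-empty list. A name-based check like this does not detect secrets embedded inside ordinary source files.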
Audit Metadata