setup-ralph

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGH
COMMAND_EXECUTION, PROMPT_INJECTION, DATA_EXFILTRATION, CREDENTIALS_UNSAFE
Full Analysis
  • [PROMPT_INJECTION] (HIGH): The skill is susceptible to Indirect Prompt Injection because it is designed to ingest and act upon instructions found in project specifications (specs/) and source code (src/).
      • Ingestion points: The agent is explicitly instructed to study all files in the specs/ and src/ directories.
      • Boundary markers: Absent. There are no delimiters or instructions to treat data in these directories as untrusted or separate from system instructions.
      • Capability inventory: The agent has full write/execute capabilities through the claude-code toolset, further amplified by the --dangerously-skip-permissions flag.
      • Sanitization: Absent. The agent directly executes logic derived from the contents of the processed files.
  • [COMMAND_EXECUTION] (HIGH): The orchestration scripts (loop.sh and loop-docker.sh) execute the Claude CLI with the --dangerously-skip-permissions flag. This removes the 'human-in-the-loop' safety barrier, allowing an autonomous agent to execute any shell command or tool without user approval.
  • [DATA_EXFILTRATION] (MEDIUM): The skill includes functionality to automatically create a private GitHub repository and push code after every iteration using the gh CLI. While intended for backup, this creates a pre-configured channel for exfiltrating sensitive data if the agent is subverted via prompt injection.
  • [CREDENTIALS_UNSAFE] (MEDIUM): The loop-docker.sh script reads the user's Claude OAuth token from ~/.claude-oauth-token. Although it performs a basic permission check (600), it passes this sensitive credential as an environment variable into the Docker container.
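The permission check mentioned in the CREDENTIALS_UNSAFE finding can be illustrated with a minimal sketch. This is not the code from loop-docker.sh, which is a shell script; the function name has_owner_only_mode is hypothetical, and the sketch only assumes that "600" means the file's permission bits must be exactly owner read/write.

```python
import os
import stat


def has_owner_only_mode(path: str) -> bool:
    """Return True when the file's permission bits are exactly 0o600.

    Hypothetical helper for illustration; it mirrors the idea of the
    "600" check described in the audit, not the script's actual code.
    """
    # S_IMODE strips the file-type bits and keeps only the permission bits.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode == 0o600
```

A caller might run has_owner_only_mode(os.path.expanduser("~/.claude-oauth-token")) before reading the token. A check like this only limits who can read the file on disk; it does not address the concern raised above, since the token is still exposed once it is passed into the container as an environment variable.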
Recommendations
  • AI detected serious security threats
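For the missing boundary markers noted under PROMPT_INJECTION, one possible mitigation is to wrap every ingested file in explicit delimiters before it reaches the agent. The sketch below is an assumption about how that could look, not part of the skill or of the audit; the marker strings and the function name wrap_untrusted are hypothetical. Delimiters of this kind reduce, but do not eliminate, the risk of injected instructions being followed.

```python
def wrap_untrusted(path: str, content: str) -> str:
    """Wrap file content in delimiters that mark it as untrusted data.

    The marker format is hypothetical and chosen only for illustration.
    """
    begin = f"<<<UNTRUSTED_FILE path={path!r}>>>"
    end = "<<<END_UNTRUSTED_FILE>>>"
    # Rewrite any copy of the end marker inside the content so the file
    # cannot close the delimited block early.
    safe = content.replace(end, "<<<END_UNTRUSTED_FILE (escaped)>>>")
    return f"{begin}\n{safe}\n{end}"
```

The wrapped text would be paired with a system-level instruction telling the agent to treat anything between the markers as data to analyze, never as instructions to follow.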
Audit Metadata
Risk Level: HIGH
Analyzed: Feb 16, 2026, 01:03 AM