work-on-issue

Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
  • COMMAND_EXECUTION (HIGH): The skill dynamically constructs and executes shell commands in Steps 3 and 4 using variables (<NUMBER> and <short-description>) derived from $ARGUMENTS and external GitHub issue content. If an attacker-controlled GitHub issue contains shell metacharacters (e.g., backticks or semicolons), it can lead to arbitrary command execution.
  • Evidence (File: SKILL.md): .claude/scripts/worktree-create.sh issue-<NUMBER>-<short-description>
  • Evidence (File: SKILL.md): bash .claude/hooks/tdd-state.sh activate <NUMBER>
  • REMOTE_CODE_EXECUTION (HIGH): By exploiting the command injection vulnerability via a maliciously crafted GitHub issue, a remote attacker can execute code on the host machine where the agent is running.
  • PROMPT_INJECTION (LOW): The instructions use strong imperatives to override standard agent safety behavior and user consent loops.
  • Evidence (File: SKILL.md): "When this skill is invoked, you MUST execute these steps immediately. Do NOT just describe what will happen
  • actually do it."
  • Evidence (File: SKILL.md): "DO NOT stop and wait for user input. Start the TDD cycle now."
  • DATA_EXPOSURE & EXFILTRATION (SAFE): No evidence of credential exposure or unauthorized data transmission to external domains was found.
  • INDIRECT PROMPT INJECTION (LOW): The skill ingests untrusted data from GitHub issues and uses it to drive agent actions.
  • Ingestion points: issue-worker sub-agent reading GitHub issue content.
  • Boundary markers: Absent; no delimiters or "ignore instructions" warnings are used when processing issue text.
  • Capability inventory: Shell script execution (worktree-create.sh, tdd-state.sh), file creation (TodoWrite), and subsequent skill invocation (/write-tests).
  • Sanitization: Absent; the content is interpolated directly into commands and todo lists without escaping or validation.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 17, 2026, 06:46 PM