plan-implementation

Warn

Audited by Gen Agent Trust Hub on Feb 28, 2026

Risk Level: MEDIUM
COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, DATA_EXFILTRATION, PROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill explicitly utilizes shell commands for various development tasks, including syntax validation (e.g., 'php -l', 'node --check', 'kotlinc'), running test suites (e.g., 'php artisan test', './gradlew', 'npm test'), and Git management ('git add', 'git commit', 'git push').
  • [PROMPT_INJECTION]: The instructions establish a highly autonomous persona with 'full executive authority' and direct the agent to 'not stop' and not to 'ask for permission'. This behavioral framing may lead the agent to prioritize task completion over safety guardrails or user intervention during execution.
  • [REMOTE_CODE_EXECUTION]: The core development loop involves the agent generating both application logic and test files, followed by immediate execution through system-level test runners. This represents the automated execution of self-generated code.
  • [DATA_EXFILTRATION]: Automated 'git push' operations are triggered at the end of each development phase. This transmits the codebase to a remote repository, which could inadvertently include sensitive information if the agent's internal security checks fail to identify it.
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection as its entire workflow is driven by parsing external Markdown files located in 'docs/plans/'. Maliciously crafted plans could influence the agent to perform unauthorized actions under the guise of legitimate tasks.
      • Ingestion points: Plan files located in 'docs/plans/' (e.g., 'docs/plans/YYYY-MM-DD-[feature-name].md').
      • Boundary markers: None identified; the agent is instructed to parse the plan completely for task extraction.
      • Capability inventory: File system access, shell execution (compilers, test runners, git), and network access via 'git push'.
      • Sanitization: No explicit sanitization or external validation of the plan file content is described beyond basic parsing.
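
To illustrate the two gaps noted above (no boundary markers, no sanitization), here is a minimal Python sketch of what a pre-processing step for plan files could look like. It is not part of the audited skill. The marker strings, the pattern list, and the function names are hypothetical choices for illustration, and the patterns are far from exhaustive.

```python
import re

# Hypothetical deny-list; a real deployment would need a reviewed, broader set.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"\bcurl\b[^\n]*\|\s*(ba)?sh\b", re.IGNORECASE),
    re.compile(r"\bgit\s+push\b[^\n]*--force", re.IGNORECASE),
]

# Hypothetical boundary markers that tell the agent "this is data, not instructions".
BEGIN_MARKER = "<<<UNTRUSTED_PLAN_BEGIN>>>"
END_MARKER = "<<<UNTRUSTED_PLAN_END>>>"


def flag_suspicious_lines(plan_text):
    """Return (line_number, stripped_line) pairs that match any suspicious pattern."""
    hits = []
    for number, line in enumerate(plan_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in SUSPICIOUS_PATTERNS):
            hits.append((number, line.strip()))
    return hits


def wrap_untrusted(plan_text):
    """Wrap plan text in boundary markers, removing any markers embedded in it."""
    cleaned = plan_text
    # Repeat until stable so that removing one marker cannot assemble another.
    while BEGIN_MARKER in cleaned or END_MARKER in cleaned:
        cleaned = cleaned.replace(BEGIN_MARKER, "").replace(END_MARKER, "")
    return f"{BEGIN_MARKER}\n{cleaned}\n{END_MARKER}"
```

A caller might run `flag_suspicious_lines` on each file under 'docs/plans/' and pause for human review on any hit, then pass only the output of `wrap_untrusted` to the agent. Pattern matching of this kind reduces risk but does not eliminate it, since injected instructions can be phrased in ways no fixed list anticipates.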
Audit Metadata
Risk Level: MEDIUM
Analyzed: Feb 28, 2026, 12:09 PM