execute-task

Pass

Audited by Gen Agent Trust Hub on Feb 28, 2026

Risk Level: SAFEPROMPT_INJECTIONCOMMAND_EXECUTIONEXTERNAL_DOWNLOADS
Full Analysis
  • [PROMPT_INJECTION]: The skill exhibits a significant surface for Indirect Prompt Injection because it processes and executes tasks based on untrusted user input or external files.
  • Ingestion points: In SKILL.md (Phase 0), the $ARGUMENTS variable is resolved from a user-provided string or a file path (e.g., TASK_N.md), which is then treated as the primary instruction set for the workflow.
  • Boundary markers: The resolved task content is placed within a <task> XML-style tag, but there are no explicit system instructions or delimiters to prevent the agent from obeying instructions embedded within that data.
  • Capability inventory: The orchestrator has access to powerful tools including file reading (read_file), codebase searching (grep, glob), shell execution (tsc, lint, test, build), and browser automation via MCPs.
  • Sanitization: The skill lacks any mechanism to sanitize or validate the content of the task document before it is used to drive the multi-phase implementation process.
  • [COMMAND_EXECUTION]: The skill is designed to programmatically discover and execute shell commands from the project environment.
  • Execution logic: Phase 5 of SKILL.md instructs the agent to find and run scripts defined in package.json (e.g., npm run build, turbo run test).
  • Testing capabilities: The verification protocol in references/verification-protocol.md uses curl to perform network requests and record responses, providing a path for arbitrary network interaction if the target URL is manipulated.
  • [EXTERNAL_DOWNLOADS]: The skill incorporates tools capable of fetching remote content at runtime.
  • Network tools: It explicitly utilizes the firecrawl MCP to fetch external documentation and references from the web during the role assignment phase.
  • Documentation lookup: It uses the context7 MCP to resolve and query library documentation, which involves fetching data from remote registries.
Audit Metadata
Risk Level
SAFE
Analyzed
Feb 28, 2026, 06:10 PM