qa-test

Pass

Audited by Gen Agent Trust Hub on Mar 13, 2026

Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTIONREMOTE_CODE_EXECUTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill executes local developer tools including git diff and the GitHub CLI (gh pr view, gh issue view) to automatically identify changes and gather acceptance criteria.
  • [COMMAND_EXECUTION]: It may invoke project-specific scripts found in package.json (such as npm run dev) to ensure the application is running before beginning tests.
  • [COMMAND_EXECUTION]: If database tools are available, the skill utilizes them (e.g., mcp__postgres__execute_sql) to perform state verification after browser-based interactions.
  • [PROMPT_INJECTION]: The skill is subject to indirect prompt injection risks due to its data ingestion patterns:
  • Ingestion points: Fetches acceptance criteria from potentially untrusted sources like Pull Request descriptions and GitHub issues.
  • Boundary markers: The sub-agent prompt template for testing does not use explicit delimiters to isolate the injected criteria from the agent's core instructions.
  • Capability inventory: The agent has extensive capabilities including browser interaction, script execution, and database access.
  • Sanitization: There is no evidence of filtering or validation of the criteria retrieved from project metadata.
  • [REMOTE_CODE_EXECUTION]: The skill utilizes the evaluate_script tool within Chrome DevTools to execute arbitrary JavaScript in the context of the browser. This is legitimately used to trigger UI events that are difficult to automate (e.g., React's onMouseEnter), but represents a significant execution capability.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 13, 2026, 11:13 PM