testing-specialist

Warn

Audited by Gen Agent Trust Hub on Mar 16, 2026

Risk Level: MEDIUMCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The helper script references/domains/webapp-testing/scripts/with_server.py utilizes subprocess.Popen(..., shell=True) to execute commands provided through the --server command-line argument. This pattern is inherently risky as it allows for arbitrary shell command execution. If an agent is tasked with using this script and populates the command string with data derived from an untrusted source, it creates a direct path for shell command injection.
  • [PROMPT_INJECTION]: Multiple sub-agent prompt definitions, including references/domains/testing/integration-test-fixer.md, references/domains/testing/test-results-analyzer.md, and references/domains/testing/test-writer-fixer.md, present a surface for indirect prompt injection. These agents are instructed to ingest and analyze data from external sources such as API endpoints, database states, and execution logs. Because they also have high-privilege capabilities like file modification (MultiEdit, Write) and shell execution (Bash), a lack of input sanitization or explicit boundary markers in their instructions could allow malicious payloads hidden in the processed data to influence the agent's actions.
  • Ingestion points: api-tester.md (API responses and payloads), test-results-analyzer.md (raw test execution logs), integration-test-fixer.md (integrated frontend/backend outputs).
  • Boundary markers: Not present; the prompts generally lack delimiters or instructions to ignore embedded commands in the data being processed.
  • Capability inventory: Bash, MultiEdit, Write, Read, Grep, WebFetch, TodoWrite (distributed across various sub-agent definitions).
  • Sanitization: There is no evidence of input validation, escaping, or filtering before external content is incorporated into the agent's reasoning or action loops.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Mar 16, 2026, 03:06 AM