stress-test

Warn

Audited by Gen Agent Trust Hub on Feb 20, 2026

Risk Level: MEDIUM (EXTERNAL_DOWNLOADS, REMOTE_CODE_EXECUTION, COMMAND_EXECUTION)
Full Analysis
  • [EXTERNAL_DOWNLOADS] (MEDIUM): The skill identifies dependencies from a technical plan and uses the Bash tool to install them (e.g., 'npm install dep' in Phase 5). These packages are determined at runtime and are unverifiable, posing a supply-chain risk.
  • [REMOTE_CODE_EXECUTION] (MEDIUM): Phase 5 writes and executes code ('node test.js') generated from POC specs. User approval is requested via AskUserQuestion, but executing dynamically generated code derived from external documentation is still a high-risk pattern.
  • [COMMAND_EXECUTION] (LOW): The skill uses the Bash tool to create directories and run tests. It explicitly instructs the agent to batch shell operations (e.g., 'mkdir -p dir && cd dir && npm init -y'), so a single command block does more damage if malicious instructions are injected into it.
  • [INDIRECT_PROMPT_INJECTION] (LOW):
      • Ingestion points: Phase 2 uses WebSearch and WebFetch to read external documentation, which could contain adversarial instructions.
      • Boundary markers: Absent. The instructions do not specify how to distinguish documentation content from agent instructions.
      • Capability inventory: Subprocess calls (Bash), file-write (Write), and task orchestration (Task).
      • Sanitization: Absent. Web content is processed directly to verify claims.
  • [PRIVILEGE_ESCALATION] (LOW): The skill uses 'rm -rf' for cleanup. Although the command targets a specific directory ('.poc-stress-test/'), it can be dangerous if the path is manipulated.
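
Two of the gaps noted above, the missing boundary markers around fetched documentation and the reliance on an unvalidated path for 'rm -rf', could be narrowed with small guards. The following is a minimal sketch and not part of the audited skill: the function names, the marker format, and the idea of resolving the path against a project root are all assumptions made for illustration. Only the directory name '.poc-stress-test' comes from the findings.

```python
from pathlib import Path

# Directory name taken from the audit finding; everything else is hypothetical.
POC_DIR_NAME = ".poc-stress-test"

# Hypothetical marker strings; the audited skill defines no boundary markers.
END_MARKER = "<<<END_UNTRUSTED_CONTENT>>>"


def is_safe_cleanup_target(target: str, project_root: str) -> bool:
    """Return True only if target resolves to <project_root>/.poc-stress-test."""
    root = Path(project_root).resolve()
    expected = root / POC_DIR_NAME
    # resolve() collapses '..' segments and follows symlinks, so a
    # manipulated path that escapes the expected directory will not match.
    return Path(target).resolve() == expected


def wrap_untrusted(content: str, source: str) -> str:
    """Fence fetched documentation so it can be treated as data, not instructions."""
    begin = f"<<<UNTRUSTED_CONTENT source={source!r}>>>"
    # Strip any embedded end marker so the content cannot close the fence early.
    sanitized = content.replace(END_MARKER, "[removed marker]")
    return f"{begin}\n{sanitized}\n{END_MARKER}"
```

A caller would run the cleanup command only when `is_safe_cleanup_target(...)` returns True, and would pass WebFetch output through `wrap_untrusted(...)` before adding it to the agent's context. Markers alone do not prevent prompt injection; they only give the instructions something explicit to refer to when telling the agent what to treat as untrusted.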
Audit Metadata
Risk Level: MEDIUM
Analyzed: Feb 20, 2026, 11:11 PM