manual-testing

Pass

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: SAFECOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
  • COMMAND_EXECUTION (LOW): The skill instructs the agent to run arbitrary shell commands such as npm run build, cargo test, and go build. While these are essential for the skill's purpose, they are vectors for code execution if a malicious actor controls the scripts defined in the project's configuration files.
  • EXTERNAL_DOWNLOADS (LOW): The skill suggests using curl and httpie for testing API responses. While the examples focus on localhost, these tools could be used to interact with external untrusted domains.
  • PROMPT_INJECTION (LOW): The skill is vulnerable to Indirect Prompt Injection (Category 8). It explicitly asks the agent to ingest and analyze untrusted data from "recent file changes and conversation history" without boundary markers or sanitization. This could allow an attacker to embed instructions in a codebase that the agent then executes during the automated testing phase.
  • Ingestion points: SKILL.md instructions for Step 1 ("Review recent file changes").
  • Boundary markers: Absent; no instructions to ignore instructions inside analyzed files.
  • Capability inventory: Full shell execution capability (npm, pytest, curl, bash blocks).
  • Sanitization: Absent; the agent executes commands directly derived from the environment's context.
Audit Metadata
Risk Level
SAFE
Analyzed
Feb 17, 2026, 05:14 PM