multi-agent-e2e-validation

Pass

Audited by Gen Agent Trust Hub on Feb 28, 2026

Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTIONEXTERNAL_DOWNLOADS
Full Analysis
  • [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection through the Architecture Decision Record (ADR) documents it ingests to create validation plans.
  • Ingestion points: External project documentation such as ADR files (e.g., 'ADR-0002 QuestDB Refactor') processed in Step 1.
  • Boundary markers: Absent; there are no delimiters or instructions provided to the agent to distinguish between the ADR content and system instructions.
  • Capability inventory: The skill utilizes 'Bash' and 'Write' tools to generate and run code based on the interpreted plans.
  • Sanitization: None; the skill does not include steps to sanitize or validate input data before using it to define test logic.
  • [COMMAND_EXECUTION]: The core methodology involves the dynamic generation and execution of Python scripts ('test_*.py') to perform system validation.
  • The 'Agent Orchestration Pattern' and Step 6 describe using Bash heredocs to iterate through directories and execute all discovered scripts using 'uv run'.
  • This automated execution of dynamically generated content is a primary feature of the skill but increases the risk if the script generation process is influenced by untrusted input.
  • [EXTERNAL_DOWNLOADS]: The skill utilizes 'uv' and 'Docker' for environment setup and script execution.
  • These tools may download external dependencies, packages from PyPI, or container images from Docker Hub during the 'Agent 1' setup or when running tests with 'uv run'.
  • While these downloads target well-known services, the specific packages or images are not pinned or verified within the skill's instructions.
Audit Metadata
Risk Level
SAFE
Analyzed
Feb 28, 2026, 03:58 AM