agent-test

Pass

Audited by Gen Agent Trust Hub on Mar 30, 2026

Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill uses the tdx agent test command to execute agent conversations and evaluate them against defined criteria.
  • [PROMPT_INJECTION]: Potential for indirect prompt injection as the tool processes user_input and criteria from test.yml files. Malicious content in these files could attempt to influence the agent under test or the judge agent performing the evaluation.
  • Ingestion points: user_input and criteria in test.yml.
  • Boundary markers: None explicitly mentioned in the documentation.
  • Capability inventory: Executes shell commands (tdx), interacts with judge agents, and reads local agent configuration files (agent.yml, prompt.md).
  • Sanitization: None explicitly documented for the input fields.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 30, 2026, 11:38 AM