agent-eval-harness
Pass
Audited by Gen Agent Trust Hub on Apr 14, 2026
Risk Level: SAFE
Full Analysis
- [COMMAND_EXECUTION]: The harness facilitates the execution of agents and grader scripts via shell commands. The documentation proactively warns users about command injection risks when processing untrusted input in specific modes and recommends using schema-driven adapters for better security.
- [EXTERNAL_DOWNLOADS]: The documentation includes references to official resources from trusted providers like Anthropic and well-known services like Google for environment setup. These are standard integration steps and do not represent a security risk.
- [REMOTE_CODE_EXECUTION]: The skill allows the execution of remote packages via bunx and provides instructions for installing official CLI tools from trusted domains. These operations are part of the intended installation and execution workflow.