eval-designer
Pass
Audited by Gen Agent Trust Hub on Apr 24, 2026
Risk Level: SAFE
Full Analysis
- [SAFE]: The skill provides instructions and a framework for building LLM evaluation suites. All identified patterns are contextualized within safety testing guidelines.
- [PROMPT_INJECTION]: The skill includes examples of prompt injection attacks, such as "Ignore your instructions" and "Ignore previous instructions", as adversarial test cases for safety evaluation. These patterns are documented as inputs to be used for testing the target system and do not attempt to override the agent's own behavior.
- [COMMAND_EXECUTION]: Provides Python code snippets demonstrating the use of the
subprocessmodule to execute security tools likebanditfor static analysis of generated code. This is presented as a best practice for automated safety evaluation. - [EXTERNAL_DOWNLOADS]: Recommends the use of well-known third-party tools such as
promptfooandlangsmithfor building evaluation pipelines.
Audit Metadata