build-evaluator
Pass
Audited by Gen Agent Trust Hub on Apr 10, 2026
Risk Level: SAFEPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
- [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection as it processes untrusted data (inputs and outputs being evaluated) within LLM prompts and Python code. Ingestion points include variables like {{input}} and {{output}} in resources/judge-prompt-template.md and the log parameter in SKILL.md. While markdown is used for separation, there are no specific instructions for the model to disregard embedded commands. The skill possesses capabilities such as Bash, WebFetch, and file writing. No sanitization of external content is documented.
- [COMMAND_EXECUTION]: The skill uses Bash and curl to interact with the api.orq.ai infrastructure. These commands are used to manage evaluators and are consistent with the vendor's provided tools and platform functionality.
Audit Metadata