phoenix-evals-new-metric
Pass
Audited by Gen Agent Trust Hub on Mar 21, 2026
Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [SAFE]: The skill describes a legitimate development workflow for contributing to the Phoenix repository. All file paths and operations are consistent with the project's architecture.
- [COMMAND_EXECUTION]: The skill involves running standard build and development commands such as
make codegen-prompts,pnpm build, andpnpm tsx. These commands are used to compile generated code and execute local benchmarks within the project directory. - [PROMPT_INJECTION]: The skill includes explicit security and accuracy guidance for writing LLM prompts. It recommends using XML-style tags (e.g.,
<context>,<output>) to wrap user-provided data, which is a recognized mitigation against indirect prompt injection by ensuring the model can distinguish between instructions and data.
Audit Metadata