eval-recipes-runner
Fail
Audited by Gen Agent Trust Hub on Mar 14, 2026
Risk Level: HIGH
EXTERNAL_DOWNLOADS, REMOTE_CODE_EXECUTION, COMMAND_EXECUTION, PROMPT_INJECTION
Full Analysis
- [EXTERNAL_DOWNLOADS]: Fetches the eval-recipes benchmark suite from Microsoft's official GitHub repository.
- [REMOTE_CODE_EXECUTION]: Downloads and executes the uv package manager installation script from astral.sh, a well-known service.
- [COMMAND_EXECUTION]: Executes shell commands to manage environment configurations, copy files, and run benchmark scripts via uv run.
- [PROMPT_INJECTION]: Exposes a surface for indirect prompt injection: user-supplied branch names or PR identifiers are interpolated into Dockerfiles and shell commands.
  - Ingestion points: User-provided branch names or PR numbers used in Dockerfile configuration and shell execution steps (SKILL.md).
  - Boundary markers: None present to delimit user-provided input from the execution context.
  - Capability inventory: The skill performs repository cloning, file system operations, and command execution via uv run (SKILL.md).
  - Sanitization: No explicit validation or sanitization of input strings is defined before their use in terminal commands.
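The missing sanitization step noted above could look roughly like the sketch below. It is not taken from SKILL.md. The allowlist patterns, the function names (validate_branch, validate_pr_number, clone_command), and the repo_url parameter are all hypothetical and chosen for illustration. The idea is to reject anything outside a conservative character set before it reaches a Dockerfile or a command line, and to pass arguments as a list so no shell parses them.

```python
import re

# Assumed policy (not from SKILL.md): a conservative subset of valid git ref names.
# The first character must be alphanumeric, so a value can never look like an option.
_BRANCH_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._/-]{0,254}")
# Assumed policy: PR identifiers are plain positive integers.
_PR_RE = re.compile(r"[1-9][0-9]{0,9}")


def validate_branch(name: str) -> str:
    """Return the branch name unchanged if it passes the allowlist, else raise ValueError."""
    bad_shape = ".." in name or "//" in name or name.endswith(("/", ".", ".lock"))
    if bad_shape or not _BRANCH_RE.fullmatch(name):
        raise ValueError(f"unsafe branch name: {name!r}")
    return name


def validate_pr_number(value: str) -> int:
    """Return the PR number as an int if it is purely numeric, else raise ValueError."""
    if not _PR_RE.fullmatch(value):
        raise ValueError(f"unsafe PR identifier: {value!r}")
    return int(value)


def clone_command(repo_url: str, branch: str) -> list[str]:
    """Build an argument list for cloning one branch; meant for subprocess.run without shell=True."""
    return ["git", "clone", "--depth", "1", "--branch", validate_branch(branch), "--", repo_url]
```

Using fullmatch rather than match with anchors avoids accepting a value that ends in a newline. The same validated values would be the only ones written into a generated Dockerfile.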
Recommendations
- HIGH: Downloads and executes remote code from https://astral.sh/uv/install.sh. DO NOT USE without thorough review.
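One way to act on this recommendation is to download the script to a file, review it, record its SHA-256 digest, and refuse to run any copy whose digest differs. The sketch below shows only the comparison step. It assumes the reviewer supplies the pinned digest themselves; the function names (sha256_hex, is_pinned) are hypothetical, and nothing here is part of the audited skill or of uv.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of data."""
    return hashlib.sha256(data).hexdigest()


def is_pinned(script: bytes, expected_sha256: str) -> bool:
    """True only if the script bytes hash to the digest recorded at review time."""
    return sha256_hex(script) == expected_sha256.strip().lower()
```

A caller would read the downloaded file in binary mode, call is_pinned with the digest recorded during review, and execute the file only when the result is True. Piping the download straight into a shell leaves no point at which this check can happen.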
Audit Metadata