eval-recipes-runner

Fail

Audited by Gen Agent Trust Hub on Mar 14, 2026

Risk Level: HIGH
Findings: EXTERNAL_DOWNLOADS, REMOTE_CODE_EXECUTION, COMMAND_EXECUTION, PROMPT_INJECTION
Full Analysis
  • [EXTERNAL_DOWNLOADS]: Fetches the eval-recipes benchmark suite from Microsoft's official GitHub repository.
  • [REMOTE_CODE_EXECUTION]: Downloads and executes the uv package manager installation script from astral.sh, a well-known service.
  • [COMMAND_EXECUTION]: Executes shell commands to manage environment configurations, copy files, and run benchmark scripts via uv run.
  • [PROMPT_INJECTION]: Exposes a surface for indirect prompt injection: user-supplied branch names or PR identifiers are interpolated into Dockerfiles and shell commands.
      • Ingestion points: User-provided branch names or PR numbers used in Dockerfile configuration and shell execution steps (SKILL.md).
      • Boundary markers: None present to delimit user-provided input from the execution context.
      • Capability inventory: The skill performs repository cloning, file system operations, and command execution via uv run (SKILL.md).
      • Sanitization: No validation or sanitization of input strings is defined before they are used in terminal commands.
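
The missing sanitization step could look roughly like the sketch below. It is not part of the audited skill: the function names, the allowlist patterns, and the `repo_url` parameter are all hypothetical, chosen only to illustrate validating a branch name or PR number before it reaches a command. The patterns are deliberately stricter than what git itself accepts.

```python
import re

# Hypothetical allowlists; the audited skill defines no such checks.
_BRANCH_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._/-]{0,199}")
_PR_RE = re.compile(r"[1-9][0-9]{0,8}")


def validate_branch(name: str) -> str:
    """Return the name unchanged if it is a plain branch name, else raise ValueError."""
    if (
        not _BRANCH_RE.fullmatch(name)
        or ".." in name
        or "//" in name
        or name.endswith(("/", ".lock"))
    ):
        raise ValueError(f"rejected branch name: {name!r}")
    return name


def validate_pr_number(value: str) -> int:
    """Return the PR number as an int if the string is purely numeric, else raise ValueError."""
    if not _PR_RE.fullmatch(value):
        raise ValueError(f"rejected PR number: {value!r}")
    return int(value)


def clone_command(repo_url: str, branch: str) -> list[str]:
    """Build an argument list for git clone.

    Passing a list to subprocess.run (without shell=True) keeps the branch
    name a single argv entry, so shell metacharacters are never interpreted.
    """
    return ["git", "clone", "--depth", "1", "--branch", validate_branch(branch), repo_url]
```

Because the first character must be alphanumeric, a value such as `--upload-pack=...` cannot be mistaken for a git option, and anything containing spaces, quotes, `;`, `$`, or newlines is rejected before it can be written into a Dockerfile or command line.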
Recommendations
  • HIGH: Downloads and executes remote code from https://astral.sh/uv/install.sh. DO NOT USE without thorough review.
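
One way to act on this recommendation is to save the installer to disk, review it, record its SHA-256 digest, and refuse to run any later download whose digest differs. The sketch below shows only the comparison step; the helper names are hypothetical, and the pinned digest is assumed to come from your own review rather than from any value published in this audit.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_pinned_digest(script: bytes, pinned_hex: str) -> bool:
    """True only if the script bytes hash to the digest recorded at review time."""
    # compare_digest performs a constant-time comparison of the two hex strings.
    return hmac.compare_digest(sha256_hex(script), pinned_hex.strip().lower())
```

A caller would read the downloaded file as bytes, call `matches_pinned_digest`, and execute the script only when it returns `True`. Any upstream change to the installer then forces a fresh review instead of running silently.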
Audit Metadata
  • Risk Level: HIGH
  • Analyzed: Mar 14, 2026, 03:24 PM