pipeline-check

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION] (HIGH): The skill contains a directive to execute uv run pytest tests/benchmarks/test_benchmark_micro.py. This executes Python code from the local filesystem. In an agentic context where the repository source may be untrusted or tampered with, this provides a direct path for executing malicious code.
  • [PROMPT_INJECTION] (HIGH): The skill is highly vulnerable to Indirect Prompt Injection (Category 8) because it processes untrusted repository data with execution-capable tools. Evidence Chain: 1. Ingestion points: Files in src/pipelines/ read via Read, Grep, and Glob. 2. Boundary markers: Absent. The agent is not instructed to use delimiters or ignore instructions within the analyzed code. 3. Capability inventory: Bash access, including the ability to run system commands and benchmarks. 4. Sanitization: Absent. There is no validation or filtering of the content ingested from the files.
  • [EXTERNAL_DOWNLOADS] (LOW): The use of uv run may trigger automatic package downloads and environment setup if the repository contains specific configuration files (e.g., pyproject.toml), representing a dependency risk.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 12:49 PM