promptfoo-evaluation
Warn
Audited by Gen Agent Trust Hub on Feb 23, 2026
Risk Level: MEDIUMCOMMAND_EXECUTIONEXTERNAL_DOWNLOADSREMOTE_CODE_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
- Dynamic Execution (MEDIUM): The skill demonstrates how to configure Promptfoo to load and execute custom Python scripts from the local filesystem for assertions (e.g.,
file://scripts/metrics.py:custom_assert). This allows for arbitrary code execution in the evaluation environment.\n- External Downloads & Remote Code Execution (LOW): The instructions suggest usingnpx promptfoo@latest, which downloads and executes the package directly from the npm registry.\n- Data Exposure (LOW): Documentation inSKILL.mdreveals a specific local path (/Users/tiansheng/Workspace/prompts/tiaogaoren/), exposing a system username.\n- Indirect Prompt Injection (LOW): The skill identifies a pattern where data is loaded from files and interpolated into LLM prompts without explicit sanitization or boundary markers.\n - Ingestion points:
promptfooconfig.yamlandtests/cases.yamluse thefile://scheme to import external content.\n - Boundary markers: Absent in the provided prompt templates.\n
- Capability inventory: Promptfoo executes shell commands (npx), runs Python scripts, and makes network requests to external providers.\n
- Sanitization: Absent.
Audit Metadata