NYC

promptfoo-evaluation

Warn

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: MEDIUM
COMMAND_EXECUTION, EXTERNAL_DOWNLOADS, DATA_EXFILTRATION, PROMPT_INJECTION
Full Analysis
  • COMMAND_EXECUTION (MEDIUM): The skill instructs the agent to run shell commands via npx promptfoo@latest, which downloads and executes external code on the host machine. It also strongly encourages running local Python scripts as custom assertions, referenced through the file:// protocol (e.g., scripts/metrics.py).
  • DATA_EXFILTRATION (LOW): Information Exposure. The SKILL.md file contains a hardcoded absolute path (/Users/tiansheng/Workspace/prompts/tiaogaoren/) which exposes the author's local username and internal directory structure to the agent.
  • EXTERNAL_DOWNLOADS (LOW): The skill uses npx to fetch the promptfoo package from the npm registry at runtime, which makes every run depend on an external registry.
  • PROMPT_INJECTION (LOW): Indirect Prompt Injection Surface. The skill is designed to ingest and process untrusted external data within LLM prompts using template variables like {{user_input}}. Evidence:
      • Ingestion points: SKILL.md (Few-Shot Pattern) and chat.json.
      • Boundary markers: None; variables are interpolated directly into prompts.
      • Capability inventory: Shell command execution, local file system read/write via configuration, and Python script execution.
      • Sanitization: No evidence of input validation or escaping for the template variables.
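The findings above all correspond to textual patterns that can be searched for in a skill file. The following is a minimal sketch of such a check, not the auditor's actual method: the `PATTERNS` table, the `scan` function, and the `sample` text are hypothetical, and the regular expressions are illustrative assumptions that would miss many real variants.

```python
import re

# Hypothetical patterns, one per finding category in the report.
# These are illustrative assumptions, not the rules the audit used.
PATTERNS = {
    # npx invocation of a package pinned to the floating "latest" tag
    "EXTERNAL_DOWNLOADS": re.compile(r"\bnpx\s+(?:-\S+\s+)*[\w@/.-]+@latest\b"),
    # absolute path under a user's home directory (macOS or Linux layout)
    "DATA_EXFILTRATION": re.compile(r"/(?:Users|home)/[^/\s]+/\S*"),
    # template variable interpolated into a prompt, e.g. {{user_input}}
    "PROMPT_INJECTION": re.compile(r"\{\{\s*\w+\s*\}\}"),
    # Python script referenced through a file:// URL
    "COMMAND_EXECUTION": re.compile(r"file://\S+\.py\b"),
}


def scan(text: str) -> dict:
    """Return all matches per category, in order of appearance."""
    return {name: pattern.findall(text) for name, pattern in PATTERNS.items()}


# Hypothetical sample assembled from the strings quoted in the report;
# the surrounding lines are made up for illustration.
sample = """\
npx promptfoo@latest eval
cd /Users/tiansheng/Workspace/prompts/tiaogaoren/
value: file://scripts/metrics.py
content: "{{user_input}}"
"""

findings = scan(sample)
for category, hits in findings.items():
    print(category, hits)
```

A match only flags a line for human review. Whether a given `{{variable}}` actually carries untrusted data, for example, cannot be decided from the pattern alone.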
Audit Metadata
Risk Level: MEDIUM
Analyzed: Feb 17, 2026, 05:10 PM