The Agent Skills Directory

[REMOTE_CODE_EXECUTION]: The skill recommends using npx promptfoo@latest to execute the evaluation framework directly from the package registry.
[COMMAND_EXECUTION]: Support for custom javascript and python assertion types allows running arbitrary code on the host system to validate model outputs.
[DATA_EXFILTRATION]: Documents the --share flag which uploads evaluation results to the promptfoo.dev cloud service for sharing, which may include sensitive prompt or output data.
[PROMPT_INJECTION]: The skill identifies an indirect prompt injection attack surface where untrusted data from LLM outputs is processed by user-defined code. Ingestion points: model output in javascript or python assertions (e.g., examples/model-graded.md). Boundary markers: None. Capability inventory: Execution of local scripts via assertions. Sanitization: No built-in sanitization for output data before processing by custom logic.
[EXTERNAL_DOWNLOADS]: Fetches the promptfoo package from the npm registry during execution and uses GitHub Actions for CI/CD integration.

ai-observability-promptfoo