eval-driven-dev
Pass
Audited by Gen Agent Trust Hub on Mar 16, 2026
Risk Level: SAFECOMMAND_EXECUTIONEXTERNAL_DOWNLOADSDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: Executes shell commands for package installation (
pip install pixie-qa) and running tests (pixie test). - [EXTERNAL_DOWNLOADS]: Fetches the 'pixie-qa' package from the Python Package Index (PyPI) during the setup phase.
- [DATA_EXFILTRATION]: Accesses environment variables for sensitive API keys (
OPENAI_API_KEY,ANTHROPIC_API_KEY) to verify the environment configuration. - [PROMPT_INJECTION]: Vulnerable to indirect prompt injection through the ingestion of untrusted application outputs for evaluation. Ingestion points:
pixie_qa/datasets/containingeval_output. Boundary markers: None. Capability inventory:pixie testexecutes Python test files. Sanitization: None. - [COMMAND_EXECUTION]: Dynamically generates and executes Python scripts (
build_dataset.py,test_*.py) for the evaluation workflow.
Audit Metadata