eval-driven-dev

Warn

Audited by Gen Agent Trust Hub on Apr 28, 2026

Risk Level: MEDIUM (EXTERNAL_DOWNLOADS, COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, DATA_EXFILTRATION)
Full Analysis
  • [EXTERNAL_DOWNLOADS]: The resources/setup.sh script automatically installs the pixie-qa Python package from PyPI using whichever package manager is available (pip, poetry, or uv). It also runs npx skills update to fetch updates for the skill environment.
  • [COMMAND_EXECUTION]: The skill executes various system-level commands to initialize the project, run tests, and manage lifecycle events. This includes launching a background web server process via pixie start and stopping it via pixie stop.
  • [REMOTE_CODE_EXECUTION]: The primary workflow requires the agent to implement an AppRunnable class that dynamically imports and executes the user's application code. Although this is intended for testing, the pattern executes arbitrary code at runtime within the agent's context.
  • [DATA_EXFILTRATION]: The skill uses instrumentation (pixie.wrap) to capture data from the application's boundaries, including user inputs, database records, and API responses. This data is logged to local trace files and exposed via a web dashboard, creating a mechanism for harvesting sensitive information from the application environment.
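
To make the last two findings concrete, here is a minimal Python sketch of the general pattern they describe: code is imported by name at runtime, then wrapped so that its inputs and outputs are recorded. This is not pixie-qa's implementation. The helpers load_callable and traced, and the in-memory records list, are hypothetical names used only for illustration, and the standard library's json.dumps stands in for the user's application code.

```python
import functools
import importlib
import json


def load_callable(dotted_path):
    """Import "module:attr" at runtime and return the attribute.

    Hypothetical helper: whatever the path names is imported and later
    executed, which is why dynamic loading is flagged in audits.
    """
    module_name, _, attr = dotted_path.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


def traced(fn, sink):
    """Wrap fn so every call appends its arguments and result to sink."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        # Everything crossing this boundary is captured verbatim,
        # including any sensitive values present in args or result.
        sink.append(
            json.dumps(
                {
                    "fn": fn.__name__,
                    "args": repr(args),
                    "kwargs": repr(kwargs),
                    "result": repr(result),
                }
            )
        )
        return result

    return wrapper


# Stand-in for application code: json.dumps loaded by name at runtime.
records = []
dumps = traced(load_callable("json:dumps"), records)
output = dumps({"user": "alice"})
```

After the call, records holds one JSON line containing the full argument and return value. A wrapper like this, placed at a database or API boundary, would persist whatever passes through it, so trace files and any dashboard that serves them should be treated as sensitive.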
Audit Metadata
Risk Level: MEDIUM
Analyzed: Apr 28, 2026, 04:26 AM