behavior-preservation-checker

Warn

Audited by Gen Agent Trust Hub on Mar 6, 2026

Risk Level: MEDIUM
COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, PROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The script scripts/behavior_checker.py uses subprocess.run to execute pytest and unittest commands. Because the working directory is set to a user-provided path (repo_path), any code in that repository's test files or configuration scripts (e.g., conftest.py) is executed when the tests run.
  • [REMOTE_CODE_EXECUTION]: The script scripts/trace_execution.py uses importlib.util.spec_from_file_location and spec.loader.exec_module to dynamically load and run Python code from user-specified file paths. It then calls functions defined in those modules with parameters read from JSON files, which allows arbitrary local code execution.
  • [PROMPT_INJECTION]: This skill is vulnerable to indirect prompt injection via the data it processes. Ingestion points: scripts/behavior_checker.py (parsing test results from JSON) and scripts/trace_execution.py (parsing trace input JSON). Boundary markers: None. Capability inventory: Execution of shell commands and dynamic Python module loading. Sanitization: None.
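To illustrate the dynamic-loading finding above, here is a minimal sketch of the general pattern, not the skill's actual code. The helper name load_and_call, the file target.py, the marker file, and the JSON parameters are all hypothetical. It shows that spec.loader.exec_module runs every top-level statement in the loaded file before any function is called, so a file supplied by path can cause side effects simply by being loaded.

```python
import importlib.util
import json
import tempfile
from pathlib import Path


def load_and_call(module_path, func_name, params):
    """Hypothetical helper: load a Python file by path and call one function."""
    spec = importlib.util.spec_from_file_location("traced_module", str(module_path))
    module = importlib.util.module_from_spec(spec)
    # exec_module executes all top-level statements in the file,
    # not only the function that is called afterwards.
    spec.loader.exec_module(module)
    return getattr(module, func_name)(**params)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "target.py"          # stands in for a user-supplied file
    marker = Path(tmp) / "side_effect.txt"    # written by the file's top-level code
    target.write_text(
        "from pathlib import Path\n"
        f"Path({str(marker)!r}).write_text('ran at import')\n"
        "def add(a, b):\n"
        "    return a + b\n"
    )
    # Parameters arrive as JSON, as in the finding above (values are made up).
    params = json.loads('{"a": 2, "b": 3}')
    result = load_and_call(target, "add", params)
    side_effect_ran = marker.exists()
```

In this sketch the marker file is created as a side effect of loading, independent of the add call. The same reasoning applies to test runners started with a repository as the working directory, since files such as conftest.py are imported as part of test collection.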
Audit Metadata
Risk Level: MEDIUM
Analyzed: Mar 6, 2026, 10:20 PM