improvement-learner

Pass

Audited by Gen Agent Trust Hub on Apr 8, 2026

Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The script scripts/self_improve.py executes several external commands using subprocess.run, including the claude CLI for evaluation, pytest for reliability testing, and git for version control. While these calls use argument lists (avoiding shell=True), they operate on external file paths and content provided at runtime, which is a common pattern for development tools but requires trust in the input directory.
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it reads the content of external SKILL.md files and interpolates them directly into the _ACCURACY_JUDGE_PROMPT used by the LLM judge. A malicious file could include instructions designed to deceive the LLM into providing a high score or bypassing evaluation criteria.
  • Ingestion points: Files are read from the path provided to the --skill-path argument in scripts/self_improve.py.
  • Boundary markers: The prompt uses --- delimiters around the skill content, but lacks specific instructions to the model to ignore any instructions found within that content.
  • Capability inventory: The tool possesses the capability to overwrite local files, delete directories (backups), and execute shell commands.
  • Sanitization: The content is truncated to 8000 characters before being processed, but no sanitization or escaping of the markdown text is performed.
Audit Metadata
Risk Level
SAFE
Analyzed
Apr 8, 2026, 03:24 AM