improvement-discriminator

Warn

Audited by Gen Agent Trust Hub on Apr 8, 2026

Risk Level: MEDIUMCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [DYNAMIC_EXECUTION]: The RealSkillEvaluator class in interfaces/critic_engine.py dynamically loads and executes Python modules from a provided file path using importlib.util.spec_from_file_location and spec.loader.exec_module. This allows for the execution of arbitrary code found within the target directory being evaluated. While intended for loading custom evaluation logic, this pattern presents a risk if the agent is directed to evaluate untrusted or malicious repositories.
  • [INDIRECT_PROMPT_INJECTION]: The LLMJudge class in interfaces/llm_judge.py interpolates untrusted content from improvement candidates (specifically the proposed_content and description fields) directly into the LLM prompt template without sanitization. An attacker could craft a candidate that contains hidden instructions to manipulate the judge's scoring or decision-making process.
  • Ingestion points: proposed_content and description derived from the candidates.json input file in interfaces/llm_judge.py.
  • Boundary markers: Uses markdown headers (## Proposed Change), but lacks explicit instructions to the LLM to ignore or disregard embedded instructions within the user-provided content.
  • Capability inventory: The output of the LLMJudge (verdict and score) determines the recommendation status (e.g., accept_for_execution), which is used by downstream executor skills to apply changes to the file system.
  • Sanitization: No escaping or validation is performed on the candidate content before it is interpolated into the prompt string.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Apr 8, 2026, 03:24 AM