symbolic-equation

Verdict: Warn

Audited by Gen Agent Trust Hub on Feb 22, 2026

Risk Level: MEDIUM
Findings: COMMAND_EXECUTION, REMOTE_CODE_EXECUTION
Full Analysis
  • [COMMAND_EXECUTION] (MEDIUM): The skill dynamically generates and executes Python code based on LLM outputs as part of its core search process. Although the documentation mentions a sandbox, executing arbitrary Python strings in evaluator.py via self._sandbox.run still poses a risk of unauthorized command execution.
  • [REMOTE_CODE_EXECUTION] (MEDIUM): The evolutionary loop involves the LLM proposing new code variants which are immediately compiled and evaluated. This creates a direct path for the LLM to execute code on the host environment. Evidence is found in SKILL.md Step 3 and the analyse function in references/llmsr-patterns.md.
  • [INDIRECT_PROMPT_INJECTION] (LOW): The skill is susceptible to indirect prompt injection through its data input field.
      1. Ingestion points: SKILL.md Input $0 (Dataset description and physical context).
      2. Boundary markers: Absent; the physical context is interpolated directly into the LLM prompt.
      3. Capability inventory: Significant capability to execute Python code via evaluator.py.
      4. Sanitization: Timeouts are mentioned, but there is no filtering of dangerous modules such as os or subprocess within the generated equation code.
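One plausible mitigation for the missing sanitization noted above is a static check on the LLM-generated equation code before evaluator.py executes it. The sketch below is illustrative only: the `DANGEROUS` module set, the `is_safe` function name, and the blocked builtins are assumptions of this audit, not part of the audited skill, and an AST blocklist is a defense-in-depth layer rather than a substitute for a real sandbox.

```python
import ast

# Illustrative blocklist; the audited skill defines no such set.
DANGEROUS = {"os", "subprocess", "sys", "shutil", "socket", "importlib"}

def is_safe(source: str) -> bool:
    """Return False if the generated code imports a blocked module
    or references a dynamic-execution builtin."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False  # refuse code that does not even parse
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import os" or "import os.path as p"
            if any(a.name.split(".")[0] in DANGEROUS for a in node.names):
                return False
        elif isinstance(node, ast.ImportFrom):
            # "from subprocess import run"
            if node.module and node.module.split(".")[0] in DANGEROUS:
                return False
        elif isinstance(node, ast.Name) and node.id in {"eval", "exec", "__import__"}:
            # dynamic execution escapes a simple import blocklist
            return False
    return True
```

In this sketch, code that fails `is_safe` would be rejected before reaching `self._sandbox.run`; blocklists are inherently incomplete, so this would complement, not replace, process-level isolation.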
Audit Metadata
  • Risk Level: MEDIUM
  • Analyzed: Feb 22, 2026, 05:00 AM