symbolic-equation

Verdict: Warn

Audited by Gen Agent Trust Hub on Feb 20, 2026

Risk Level: MEDIUM
Findings: COMMAND_EXECUTION, REMOTE_CODE_EXECUTION
Full Analysis
  • [COMMAND_EXECUTION] (MEDIUM): The skill's core workflow involves the execution of Python code generated by an LLM to evaluate mathematical equations. This constitutes dynamic code execution.
  • Evidence: The file references/llmsr-patterns.md contains the evaluator.py logic which calls self._sandbox.run(program, ...) where program is the code string returned by the LLM.
  • [REMOTE_CODE_EXECUTION] (MEDIUM): The evolutionary search loop (sampler.py) continuously pulls 'samples' (code) from an LLM and passes them to an evaluator for execution. This creates a pipeline where untrusted remote data (LLM output) is treated as executable logic.
  • Evidence: sampler.py calls self._llm.draw_samples(prompt.code, self.config) and then immediately passes the resulting samples to evaluator.analyse() for execution.
  • [INDIRECT_PROMPT_INJECTION] (LOW): The skill processes user-provided dataset descriptions and physical context ($0), which are interpolated into prompts for equation generation. This provides a surface for indirect prompt injection.
  • Ingestion points: $0 in SKILL.md is used as dataset description and physical context for the LLM.
  • Boundary markers: The skill uses basic versioning sequences (equation_v0, equation_v1) to delimit previous attempts but lacks strong structural isolation for the user-provided context.
  • Capability inventory: The system can execute arbitrary Python math logic (restricted by a sandbox of unknown strength) and perform parameter optimization using scipy.
  • Sanitization: The skill documentation mentions "Timeout protection" and a restriction against "recursive equations," which are positive but limited mitigations.
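The sampler→evaluator pipeline flagged above can be sketched as follows. Only the names `draw_samples` and `analyse` come from the audit evidence; the stub LLM, the `equation` function convention, and the use of `exec` stand in for the real sandbox and are assumptions for illustration.

```python
# Hypothetical sketch of the flagged sampler -> evaluator pipeline.
# draw_samples / analyse are named in the evidence; everything else
# is an illustrative assumption.

class FakeLLM:
    def draw_samples(self, prompt, config):
        # A real LLM would return candidate equation programs here;
        # this stub returns one fixed candidate.
        return ["def equation(x):\n    return 2 * x + 1"]

def analyse(sample):
    # The real evaluator hands the sample to a sandbox; plain exec()
    # here makes the dynamic-code-execution risk explicit.
    namespace = {}
    exec(sample, namespace)          # <-- untrusted LLM output is executed
    return namespace["equation"](3)  # evaluate the candidate at x = 3

llm = FakeLLM()
for sample in llm.draw_samples("fit y = f(x)", config=None):
    print(analyse(sample))  # -> 7
```

The point of the sketch is that LLM output flows straight from sampling to execution with no intermediate validation step, which is why the finding is rated as remote code execution even though the "remote" party is the model.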
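The "timeout protection" the skill documentation mentions could look roughly like the process-based sketch below. The actual sandbox is of unknown strength, so this is purely illustrative: a subprocess timeout kills runaway candidates but does not restrict what the code can touch while it runs.

```python
# Illustrative timeout wrapper, assumed implementation (the real
# sandbox in evaluator.py is not visible in the audit evidence).
import multiprocessing

def _run(program, queue):
    namespace = {}
    exec(program, namespace)             # run the candidate program
    queue.put(namespace["equation"](2.0))  # evaluate at a fixed point

def run_with_timeout(program, timeout=2.0):
    queue = multiprocessing.Queue()
    proc = multiprocessing.Process(target=_run, args=(program, queue))
    proc.start()
    proc.join(timeout)
    if proc.is_alive():
        proc.terminate()  # kill runaway (e.g. infinite-loop) candidates
        proc.join()
        return None
    return queue.get()

if __name__ == "__main__":
    fast = "def equation(x):\n    return x * x"
    print(run_with_timeout(fast))  # -> 4.0
```

Note that a timeout bounds only wall-clock time; file, network, and environment access inside the child process are untouched, which is why the audit treats the timeout as a "positive but limited" mitigation.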
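The injection surface from the ingestion points above can be illustrated with a hypothetical prompt builder. The `equation_v0`/`equation_v1` versioning comes from the audit; the template text and the `build_prompt` helper are assumptions, and `$0` stands in for whatever user-supplied context the skill interpolates.

```python
# Hypothetical prompt assembly showing why interpolated user context
# ($0) reaches the LLM with no structural isolation.
def build_prompt(dataset_description, previous_attempts):
    # The description is untrusted user text placed directly in the prompt.
    parts = [f"Physical context: {dataset_description}"]
    # Previous attempts are delimited only by the version suffix.
    for i, body in enumerate(previous_attempts):
        parts.append(f"def equation_v{i}(x):\n{body}")
    parts.append("Improve on the attempts above.")
    return "\n\n".join(parts)

prompt = build_prompt("pendulum swing data", ["    return x"])
print("equation_v0" in prompt)  # -> True
```

Because the user context and the versioned code history share one flat text channel, a malicious dataset description can pose as instructions or as a prior `equation_vN` attempt, which is the indirect-prompt-injection path the finding describes.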
Audit Metadata
  • Risk Level: MEDIUM
  • Analyzed: Feb 20, 2026, 05:22 AM