figure-generation

Fail

Audited by Gen Agent Trust Hub on Feb 20, 2026

Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
  • COMMAND_EXECUTION (HIGH): The skill's Phase 2 workflow creates a temporary file 'figure_script.py' from LLM-generated content and executes it using the 'python' interpreter. This direct execution of generated scripts poses a severe risk of system compromise. Evidence: SKILL.md Phase 2.
  • REMOTE_CODE_EXECUTION (HIGH): The skill synthesizes executable logic from external user-provided strings ({query}). The internal prompts in references/figure-prompts.md explicitly instruct the LLM that it can 'use any python library you want', which removes safeguards and facilitates arbitrary system access.
  • PROMPT_INJECTION (LOW): The skill is susceptible to indirect prompt injection through the {query} parameter.
  • Ingestion points: The user-provided query is directly interpolated into prompts for query expansion and code generation in references/figure-prompts.md.
  • Boundary markers: The prompts use triple quotes for delimitation, which can be easily bypassed by an adversary.
  • Capability inventory: The skill allows for arbitrary Python execution, filesystem reads of various data formats (CSV, PKL, NPY), and subprocess calls.
  • Sanitization: No sanitization or safety checks are performed on the generated code prior to execution.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 20, 2026, 05:23 AM