figure-generation
Fail
Audited by Gen Agent Trust Hub on Feb 20, 2026
Risk Level: HIGH
Findings: COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, PROMPT_INJECTION
Full Analysis
- COMMAND_EXECUTION (HIGH): The skill's Phase 2 workflow creates a temporary file 'figure_script.py' from LLM-generated content and executes it using the 'python' interpreter. This direct execution of generated scripts poses a severe risk of system compromise. Evidence: SKILL.md Phase 2.
- REMOTE_CODE_EXECUTION (HIGH): The skill synthesizes executable logic from external user-provided strings ({query}). The internal prompts in references/figure-prompts.md explicitly instruct the LLM that it can 'use any python library you want', which removes safeguards and facilitates arbitrary system access.
- PROMPT_INJECTION (LOW): The skill is susceptible to indirect prompt injection through the {query} parameter.
  - Ingestion points: The user-provided query is directly interpolated into prompts for query expansion and code generation in references/figure-prompts.md.
  - Boundary markers: The prompts use triple quotes for delimitation, which can be easily bypassed by an adversary.
  - Capability inventory: The skill allows for arbitrary Python execution, filesystem reads of various data formats (CSV, PKL, NPY), and subprocess calls.
  - Sanitization: No sanitization or safety checks are performed on the generated code prior to execution.
Recommendations
- AI detected serious security threats
Audit Metadata