paper-banana
Warn
Audited by Gen Agent Trust Hub on Feb 18, 2026
Risk Level: MEDIUMREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- Prompt Injection (LOW): The scripts
retriever.py,planner.py, andcritic.pyinterpolate user-provided methodology text and captions directly into LLM prompts. While delimiters are used, there is no explicit instruction to ignore embedded commands within the input data. - Indirect Prompt Injection (LOW):
- Ingestion points: The
methodologyandcaptionarguments inscripts/retriever.py,scripts/planner.py, andscripts/critic.pyingest untrusted text. - Boundary markers: The prompts use delimiters like
--- USER METHODOLOGY TEXT ---, which provide some separation but are insufficient against adversarial pressure. - Capability inventory: The skill performs file writes (
scripts/generate_image.py), makes external API calls to Google GenAI, and is designed to execute generated Python code in 'Plot Mode'. - Sanitization: No sanitization or escaping of the input methodology text is performed before it is interpolated into the system prompts.
- Remote Code Execution / Dynamic Execution (MEDIUM): The documentation in
references/PLOT-PROMPTS.mdexplicitly defines a workflow for 'Phase 4: Visualizer' where the agent is instructed to generate complete Python code (using matplotlib, seaborn, etc.) and 'Then execute it.' Generating and running code based on user-influenced data is a high-risk pattern that can lead to arbitrary code execution if the LLM is successfully injected.
Audit Metadata