paper-banana

Warn

Audited by Gen Agent Trust Hub on Feb 18, 2026

Risk Level: MEDIUMREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • Prompt Injection (LOW): The scripts retriever.py, planner.py, and critic.py interpolate user-provided methodology text and captions directly into LLM prompts. While delimiters are used, there is no explicit instruction to ignore embedded commands within the input data.
  • Indirect Prompt Injection (LOW):
  • Ingestion points: The methodology and caption arguments in scripts/retriever.py, scripts/planner.py, and scripts/critic.py ingest untrusted text.
  • Boundary markers: The prompts use delimiters like --- USER METHODOLOGY TEXT ---, which provide some separation but are insufficient against adversarial pressure.
  • Capability inventory: The skill performs file writes (scripts/generate_image.py), makes external API calls to Google GenAI, and is designed to execute generated Python code in 'Plot Mode'.
  • Sanitization: No sanitization or escaping of the input methodology text is performed before it is interpolated into the system prompts.
  • Remote Code Execution / Dynamic Execution (MEDIUM): The documentation in references/PLOT-PROMPTS.md explicitly defines a workflow for 'Phase 4: Visualizer' where the agent is instructed to generate complete Python code (using matplotlib, seaborn, etc.) and 'Then execute it.' Generating and running code based on user-influenced data is a high-risk pattern that can lead to arbitrary code execution if the LLM is successfully injected.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Feb 18, 2026, 06:19 PM