paperbanana

Use this skill for PaperBanana operations in the repository at:

$PAPERBANANA_HOME  # default: ~/dev/paperbanana

Core Tasks

Generate methodology diagrams from context text + caption
Generate statistical plots from CSV/JSON + intent
Evaluate generated diagrams vs reference images
Run MCP server for agent tool integration
Diagnose provider/key/quota/timeout failures

Preflight Checklist

Run these checks before generation:

paperbanana --help
paperbanana-mcp --help

echo "OPENROUTER_API_KEY set?"; [ -n "$OPENROUTER_API_KEY" ] && echo yes || echo no

Run scripts/probe_openrouter_models.py in this skill to verify model availability before long runs.
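
The checks above can be folded into one function that reports every missing piece in a single pass. This is a minimal bash sketch, not part of PaperBanana: the function name preflight is hypothetical, and it only verifies that the two CLIs are on PATH and that the key is non-empty. It does not validate the key or probe models.

```shell
# Preflight sketch (hypothetical helper, not shipped with PaperBanana).
# Reports each missing CLI and a missing API key, then returns non-zero
# if anything was missing.
preflight() {
  local status=0 cmd
  for cmd in paperbanana paperbanana-mcp; do
    if command -v "$cmd" >/dev/null 2>&1; then
      echo "ok: $cmd found"
    else
      echo "missing: $cmd not on PATH"
      status=1
    fi
  done
  if [ -n "${OPENROUTER_API_KEY:-}" ]; then
    echo "ok: OPENROUTER_API_KEY is set"
  else
    echo "missing: OPENROUTER_API_KEY is empty or unset"
    status=1
  fi
  return "$status"
}
```

Call it as preflight || exit 1 at the top of a wrapper script so generation never starts with a broken environment.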

Recommended Workflow

Provider: OpenRouter ($OPENROUTER_API_KEY)

VLM planning: google/gemini-3-flash-preview
Image generation: google/gemini-3.1-flash-image-preview

Run a minimal smoke test first

paperbanana generate \
  --input examples/sample_inputs/transformer_method.txt \
  --caption "Overview of a transformer architecture" \
  --iterations 1

Run the full generation after the smoke test passes
Validate that the output directory (outputs/run_*/) exists and contains final_output.*
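
The validation step can be scripted. The bash sketch below assumes run directories match outputs/run_*/ and that the newest one by modification time is the run of interest. The function name latest_final_output is hypothetical, and the check only confirms that a non-empty final_output.* file exists, not that the image is valid.

```shell
# Print the path of final_output.* in the most recently modified run
# directory, or return non-zero if none is found.
# Usage: latest_final_output [outputs_root]   (default root: outputs)
latest_final_output() {
  local root="${1:-outputs}" run_dir f
  # Newest run directory first; empty if the glob matches nothing.
  run_dir="$(ls -1dt "$root"/run_*/ 2>/dev/null | head -n 1)"
  if [ -z "$run_dir" ]; then
    echo "no run directory under $root" >&2
    return 1
  fi
  for f in "$run_dir"final_output.*; do
    # -s is true only for an existing, non-empty file.
    if [ -s "$f" ]; then
      echo "$f"
      return 0
    fi
  done
  echo "no non-empty final_output.* in $run_dir" >&2
  return 1
}
```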

Language Policy (Default Chinese)

Unless the user explicitly requests another language, all visible chart/diagram text must be Simplified Chinese.
Keep technical identifiers unchanged when needed for accuracy (API paths, table names, error codes, model names).
Put a clear language constraint directly in --input context and --caption, e.g.:
- 语言要求：图中所有可见文本必须为简体中文（必要的代码标识除外）。
If the generated output still contains English labels, force localization with a single-iteration continuation of the same run:
paperbanana generate --continue-run <run_id> --feedback "<中文化改图要求>" --iterations 1
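
One way to make sure the constraint always reaches the --input context is to prepend it mechanically. The bash sketch below is an assumption about workflow, not a PaperBanana feature: the variable LANG_RULE and the function with_language_rule are hypothetical names, and the rule text is the example line from this section.

```shell
# The language constraint from this section, kept verbatim.
LANG_RULE='语言要求：图中所有可见文本必须为简体中文（必要的代码标识除外）。'

# Copy a context file to a new path, prepending the language rule
# unless the file already contains it.
# Usage: with_language_rule <source.txt> <destination.txt>
with_language_rule() {
  local src="$1" dst="$2"
  if grep -qF -- "$LANG_RULE" "$src"; then
    cp -- "$src" "$dst"
  else
    { printf '%s\n\n' "$LANG_RULE"; cat -- "$src"; } > "$dst"
  fi
}
```

Pass the destination file to --input. The caption still needs its own language wording, since this helper only touches the context file.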

Command Templates

Generate Diagram

paperbanana generate \
  --input <method.txt> \
  --caption "<figure caption>" \
  --vlm-provider openrouter \
  --vlm-model google/gemini-3-flash-preview \
  --image-provider openrouter_imagen \
  --image-model google/gemini-3.1-flash-image-preview \
  --iterations <n>

Chinese-first template:

paperbanana generate \
  --input <method_zh.txt> \
  --caption "<中文图题，明确要求图中文字使用简体中文>" \
  --vlm-provider openrouter \
  --vlm-model google/gemini-3-flash-preview \
  --image-provider openrouter_imagen \
  --image-model google/gemini-3.1-flash-image-preview \
  --iterations 1

Continue Existing Run

paperbanana generate --continue --iterations 2
# or
paperbanana generate --continue-run <run_id> --feedback "<change request>" --iterations 2

Generate Plot

paperbanana plot \
  --data <data.csv|data.json> \
  --intent "<plot intent>" \
  --iterations 1

Chinese-first intent example:

paperbanana plot \
  --data <data.csv> \
  --intent "请生成简体中文统计图：标题、坐标轴、图例、注释均使用中文（变量名可保留英文）" \
  --iterations 1

Evaluate Diagram

paperbanana evaluate \
  --generated <generated.png> \
  --reference <reference.png> \
  --context <method.txt> \
  --caption "<figure caption>"

MCP Server

paperbanana-mcp

Troubleshooting Rules

429 or rate-limit errors:
- Check OpenRouter credits/quota at https://openrouter.ai/credits
- Switch to another available model
- Retry after the quota window resets
Long runs with no stdout output are normal during large multimodal steps:
- Planner and Stylist can take >1 minute
- Visualizer image generation can take >2 minutes
- Check outputs/<run_id>/run_input.json to confirm the run started
If one provider path is unstable, switch to another provider/model pair and rerun the smoke test
Prefer a small context and --iterations 1 for the first pass
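
For the "retry after the quota window resets" rule, a generic retry wrapper can help. The bash sketch below is hypothetical (retry_with_backoff is not a PaperBanana command) and retries on any non-zero exit status. It cannot tell a 429 from other failures, because the CLI's exit codes are not documented here, so keep the attempt count small.

```shell
# Retry a command with exponential backoff.
# Usage: retry_with_backoff <max_attempts> <initial_delay_seconds> <command...>
retry_with_backoff() {
  local max="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))   # double the wait before the next attempt
  done
}
```

For example, retry_with_backoff 3 60 followed by the smoke-test command waits 60 s and then 120 s between attempts before giving up.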

For detailed diagnostics and known failure patterns, read:

references/troubleshooting.md

For model availability probes, run:

scripts/probe_openrouter_models.py