Modal Batch Processing

Quick Start

  1. Verify the actual local Modal environment before writing code.
modal --version
python -c "import modal,sys; print(modal.__version__); print(sys.executable)"
modal profile current
  • Do not assume the default Python interpreter matches the environment behind the modal CLI.
  • Switch to the project virtualenv or the interpreter behind the installed modal CLI before writing examples or running scripts.
  • Wrap with app.run(): in with modal.enable_output(): when provisioning logs or remote prints are needed for debugging.
  2. Classify the request before writing code.
  • Caller waits and needs the results back in the same process: use .map or .starmap.
  • Caller should return immediately and poll later: use .spawn.
  • Detached fan-out writes results somewhere durable: use .spawn_map.
  • Many small homogeneous requests should share one execution: use @modal.batched.
  3. Read exactly one primary reference before drafting code.

Choose the Workflow

  • Use .map or .starmap when the caller can wait for the full fan-out to finish and the results must come back to the same local process. Read references/map-and-gather.md.
  • Use .spawn when the caller should return immediately and keep a stable FunctionCall handle or job ID for later polling and collection. Deploy the function first if another service submits the work. Read references/job-queues-and-detached-runs.md.
  • Use .spawn_map only when each detached task writes its own durable output to a Volume, CloudBucketMount, database, or another external sink. Do not choose it when the caller expects programmatic result retrieval later. Read references/job-queues-and-detached-runs.md.
  • Use @modal.batched when many individual requests can be coalesced into fewer container or GPU executions. Keep the function contract list-in and list-out. Read references/dynamic-batching.md.

Default Rules

  • Start with plain @app.function functions for stateless work. Move to @app.cls only when the container must reuse loaded state or expensive initialization.
  • Keep orchestration local with @app.local_entrypoint or a plain Python script plus with app.run(): when the entire workflow can stay within one session.
  • Deploy with modal deploy and use modal.Function.from_name(...) when another service must submit jobs or look up a stable remote function later.
  • Set timeout= intentionally on remote work. Add retries= only when the work is idempotent and safe to re-run.
  • Set max_containers= when upstream systems, GPU quotas, or external APIs need a hard concurrency cap.
  • Persist outputs externally whenever detached work may outlive the caller or when using .spawn_map.
  • Use Volumes or CloudBucketMounts for durable caches, model weights, and shared intermediates; do not rely on ephemeral container disk.
  • Prefer .map or .starmap over .spawn when the caller genuinely needs results immediately and no durable job handle is required.
  • Prefer .spawn over .map when the caller needs a stable job ID or should return before the remote work finishes.
  • Treat .spawn_map() as detached fire-and-forget in Modal 1.3.4. The installed SDK docstring says programmatic result retrieval is not supported, so only use it when each task writes its output elsewhere.
  • If the task is really about OpenAI-compatible vLLM or SGLang serving, stop and use modal-llm-serving.
  • If the task is really about training model weights, stop and use modal-finetuning.
  • If the task is really about isolated interactive execution, tunnels, or sandbox restore flows, stop and use modal-sandbox.

Validate

  • Run npx skills add . --list after editing the package metadata or skill descriptions.
  • Keep evals/evals.json and evals/trigger-evals.json aligned with the actual workflow boundary of the skill.
  • When changing the workflow guidance or the runnable artifact, run scripts/smoke_test.py with a Python interpreter that can import modal.

References

  • references/map-and-gather.md
  • references/job-queues-and-detached-runs.md
  • references/dynamic-batching.md