# Modal Batch Processing

## Quick Start
- Verify the actual local Modal environment before writing code:

  ```bash
  modal --version
  python -c "import modal,sys; print(modal.__version__); print(sys.executable)"
  modal profile current
  ```

- Do not assume the default `python` interpreter matches the environment behind the `modal` CLI.
- Switch to the project virtualenv or the interpreter behind the installed `modal` CLI before writing examples or running scripts.
- Use `with modal.enable_output():` around `with app.run():` when local provisioning logs or remote prints are needed for debugging.
- Classify the request before writing code:
  - Caller waits and needs the results back in the same process: use `.map` or `.starmap`.
  - Caller should return immediately and poll later: use `.spawn`.
  - Detached fan-out writes results somewhere durable: use `.spawn_map`.
  - Many small homogeneous requests should share one execution: use `@modal.batched`.
- Read exactly one primary reference before drafting code:
  - `.map` or `.starmap`: references/map-and-gather.md
  - `.spawn` or `.spawn_map`: references/job-queues-and-detached-runs.md
  - `@modal.batched`: references/dynamic-batching.md
## Choose the Workflow
- Use `.map` or `.starmap` when the caller can wait for the full fan-out to finish and the results must come back to the same local process. Read references/map-and-gather.md.
- Use `.spawn` when the caller should return immediately and keep a stable `FunctionCall` handle or job ID for later polling and collection. Deploy the function first if another service submits the work. Read references/job-queues-and-detached-runs.md.
- Use `.spawn_map` only when each detached task writes its own durable output to a Volume, CloudBucketMount, database, or another external sink. Do not choose it when the caller expects programmatic result retrieval later. Read references/job-queues-and-detached-runs.md.
- Use `@modal.batched` when many individual requests can be coalesced into fewer container or GPU executions. Keep the function contract list-in and list-out. Read references/dynamic-batching.md.
## Default Rules
- Start with plain `@app.function` functions for stateless work. Move to `@app.cls` only when the container must reuse loaded state or expensive initialization.
- Keep orchestration local with `@app.local_entrypoint` or a plain Python script plus `with app.run():` when the entire workflow can stay within one session.
- Deploy with `modal deploy` and use `modal.Function.from_name(...)` when another service must submit jobs or look up a stable remote function later.
- Set `timeout=` intentionally on remote work. Add `retries=` only when the work is idempotent and safe to re-run.
- Set `max_containers=` when upstream systems, GPU quotas, or external APIs need a hard concurrency cap.
- Persist outputs externally whenever detached work may outlive the caller or when using `.spawn_map`.
- Use Volumes or CloudBucketMounts for durable caches, model weights, and shared intermediates; do not rely on ephemeral container disk.
- Prefer `.map` or `.starmap` over `.spawn` when the caller genuinely needs results immediately and no durable job handle is required.
- Prefer `.spawn` over `.map` when the caller needs a stable job ID or should return before the remote work finishes.
- Treat `.spawn_map()` as detached fire-and-forget in Modal 1.3.4. The installed SDK docstring says programmatic result retrieval is not supported, so only use it when each task writes its output elsewhere.
- If the task is really about OpenAI-compatible vLLM or SGLang serving, stop and use modal-llm-serving.
- If the task is really about training model weights, stop and use modal-finetuning.
- If the task is really about isolated interactive execution, tunnels, or sandbox restore flows, stop and use modal-sandbox.
## Validate
- Run `npx skills add . --list` after editing the package metadata or skill descriptions.
- Keep evals/evals.json and evals/trigger-evals.json aligned with the actual workflow boundary of the skill.
- Run scripts/smoke_test.py with a Python interpreter that can import `modal` when changing the workflow guidance or runnable artifact.
## References
- Read references/map-and-gather.md for synchronous fan-out and in-process result collection.
- Read references/job-queues-and-detached-runs.md for deployed job queues, job IDs, detached runs, and external result sinks.
- Read references/dynamic-batching.md for `@modal.batched` contracts and tuning.