apple-foundation-models
# Apple Foundation Models and Image Playground
Implement reliable Apple Intelligence features with Apple's on-device text and image generation APIs across all supported platforms.
## Model Limitations (CRITICAL — Read First)
The on-device model is ~3B parameters with a 4,096-token context window. It excels at summarization, extraction, classification, tagging, composition, and revision. It cannot reliably do:
- World knowledge / factual Q&A — will hallucinate
- Math and arithmetic — unreliable at multi-step calculations
- Code generation — cannot produce correct code
- Long-form writing (>200 words) — context window too small
Before writing any code, check references/model-capabilities-and-limits.md to confirm the task is within the model's capabilities. If it is not, design an escalation path (tool, backend model, or user disclosure).
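An escalation path can be as simple as classifying each request before generation. The sketch below is a hypothetical take on the `GenerationPath` enum pattern mentioned in the routing reference; `TaskKind`, the case names, and the routing rules are ours, not Apple API:

```swift
// Hypothetical routing sketch. The on-device model handles language tasks;
// anything needing world knowledge, math, code, or long-form output goes to
// a backend model (or is declined with a disclosure to the user).
enum TaskKind {
    case summarize, extract, classify, tag, compose, revise
    case factualQA, math, codeGeneration, longForm
}

enum GenerationPath {
    case onDevice
    case backendModel          // your own server-side model
    case declineWithDisclosure // tell the user the task is out of scope
}

func route(_ kind: TaskKind) -> GenerationPath {
    switch kind {
    case .summarize, .extract, .classify, .tag, .compose, .revise:
        return .onDevice
    case .factualQA, .math, .codeGeneration, .longForm:
        return .backendModel
    }
}
```

Deciding the path before building the prompt keeps the limitation check from being an afterthought.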
## Quick Reference

| Capability | Key API | Reference |
|---|---|---|
| Text generation (basic) | `LanguageModelSession.respond(to:)` | references/foundation-models-framework.md |
| Streaming output | `session.streamResponse(to:)` | references/foundation-models-framework.md |
| Structured output | `@Generable`, `@Guide`, `session.respond(to:generating:)` | references/generable-and-guided-generation.md |
| Tool calling | `Tool` protocol, session with `tools:` | references/tool-calling.md |
| Prompt and instruction design | `LanguageModelSession(instructions:)` | references/prompt-design-and-safety.md |
| Safety and error handling | `guardrailViolation`, input sanitization | references/prompt-design-and-safety.md |
| Model capabilities check | Capability/limitation tables | references/model-capabilities-and-limits.md |
| Image generation | ImagePlayground, `ImageCreator` | references/image-playground.md |
| Testing and debugging | `#Playground`, `session.transcript`, Instruments | references/foundation-models-framework.md |
| Local vs cloud routing | `GenerationPath` enum pattern | references/routing-local-vs-bigger-model.md |
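The first two table rows in practice. A minimal sketch assuming iOS 26/macOS 26 with the FoundationModels framework; `noteText`, the prompts, and the `render` callback are illustrative names, not framework API, and the element type of the response stream has varied across SDK seeds (treat `partial` as the cumulative text so far):

```swift
import FoundationModels

func summarize(_ noteText: String) async throws -> String? {
    // Gate the feature on model availability before creating a session.
    let model = SystemLanguageModel.default
    guard model.isAvailable else {
        // Show fallback UI instead; `model.availability` explains why
        // (device not eligible, Apple Intelligence off, model downloading).
        return nil
    }

    // Basic one-shot generation with developer-only instructions.
    let session = LanguageModelSession(
        instructions: "You summarize user-provided text in one sentence."
    )
    let response = try await session.respond(to: "Summarize: \(noteText)")
    return response.content
}

func streamCaption(render: @escaping (String) -> Void) async throws {
    // Streaming variant for user-facing output: each element is a
    // cumulative snapshot, suitable for rendering directly.
    let session = LanguageModelSession()
    for try await partial in session.streamResponse(to: "Draft a short photo caption.") {
        render("\(partial)")
    }
}
```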
## Workflow

- Check model limitations first. Verify the task is within the on-device model's capabilities using the tables in references/model-capabilities-and-limits.md. If it is not, design an escalation path before writing code.
- Classify the request. Use Foundation Models for on-device text generation, ImagePlayground for image generation, and App Intents/Shortcuts for the "Use Model" automation action.
- Check platform support and runtime readiness. Use `isAvailable` and the `availability` states, then design fallback UI for unavailable states.
- Design instructions and prompts. Set behavioral constraints in `instructions:` (developer-only). Keep prompts concise with length qualifiers. See references/prompt-design-and-safety.md.
- Design `@Generable` types for structured output. Use `@Generable` with `@Guide` constraints instead of asking the model to produce JSON. See references/generable-and-guided-generation.md.
- Stream all user-facing generation. Use `streamResponse(to:)` instead of `respond(to:)` for any output the user sees. See the streaming section in references/foundation-models-framework.md.
- Add tools only as needed. Register tools when the model needs data or actions it cannot perform alone. See references/tool-calling.md.
- Decide whether to route to a larger model. For Apple-managed routing, use the App Intents "Use Model" action. For in-app escalation, use an explicit backend model path. See references/routing-local-vs-bigger-model.md.
- Validate behavior on physical devices, and include robust error handling for guardrail violations, context-size limits, unsupported languages, and tool failures.
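The structured-output step above can be sketched as follows. `@Generable`, `@Guide`, and `respond(to:generating:)` are the framework APIs named in this document; the `TagSuggestions` type and the prompt are hypothetical:

```swift
import FoundationModels

// Constrained decoding: the model can only emit values that fit this type,
// so there is no JSON parsing or validation step on your side.
@Generable
struct TagSuggestions {
    @Guide(description: "Three to five short, lowercase topic tags")
    var tags: [String]
}

func tagNote(_ note: String) async throws -> [String] {
    let session = LanguageModelSession()
    let response = try await session.respond(
        to: "Suggest tags for this note: \(note)",
        generating: TagSuggestions.self
    )
    return response.content.tags
}
```

Compare this with prompting for JSON: the `@Generable` version cannot produce a malformed payload, because the decoder never samples tokens outside the type's shape.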
## References

- Framework core (sessions, availability, streaming, performance): references/foundation-models-framework.md
- Model capabilities and limitations: references/model-capabilities-and-limits.md
- Structured output with `@Generable` and `@Guide`: references/generable-and-guided-generation.md
- Tool calling (`Tool` protocol, dynamic tools, tool graphs): references/tool-calling.md
- Prompt design, safety, and error handling: references/prompt-design-and-safety.md
- Image generation (ImagePlayground, `ImageCreator`): references/image-playground.md
- Local vs larger-model routing strategy: references/routing-local-vs-bigger-model.md
## Execution Rules
- Prefer official Apple docs and WWDC sources for API behavior.
- Treat Foundation Models app APIs as on-device unless Apple docs explicitly document a server-routing API for app code.
- Re-verify docs for "latest" requests because Apple Intelligence behavior can change across OS releases.
- Keep prompts concise and structured to reduce token use and latency.
- Check model limitations before implementing. If the task involves world knowledge, math, code generation, or long-form writing, design an escalation path — do not rely on the on-device model.
- Stream all user-facing generation. Use `streamResponse(to:)` for any output displayed to the user. Reserve `respond(to:)` for background processing.
- Use `@Generable` for structured output, not JSON in prompts. The model uses constrained decoding with `@Generable` types, guaranteeing valid output. Prompt-based JSON is unreliable.
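A minimal tool sketch under the same assumptions. The `Tool` protocol, `@Generable` arguments, and the `tools:` session parameter are the APIs named above; the tool name, description, and stubbed weather lookup are ours, and the exact return type of `call(arguments:)` has varied across SDK seeds, so treat this as a shape rather than a definitive signature:

```swift
import FoundationModels

// Hypothetical tool: gives the model access to data it cannot know itself,
// which is exactly the "world knowledge" gap flagged in the limitations.
struct CityWeatherTool: Tool {
    let name = "getCityWeather"
    let description = "Returns current weather conditions for a city."

    @Generable
    struct Arguments {
        @Guide(description: "City name, e.g. Cupertino")
        var city: String
    }

    func call(arguments: Arguments) async throws -> String {
        // Replace with a real lookup; stubbed for the sketch.
        "Sunny, 22°C in \(arguments.city)"
    }
}

// Register the tool; the model decides when to invoke it mid-response.
let session = LanguageModelSession(tools: [CityWeatherTool()])
let answer = try await session.respond(to: "Do I need a jacket in Cupertino today?")
```

Handle tool failures explicitly (the framework surfaces them as session errors) rather than letting the model improvise an answer without the data.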