motherduck-ducklake
Use DuckLake on MotherDuck
Use this skill when the storage decision is genuinely about open-table-format semantics and object-store behavior, not just about where to put another analytical table.
Source Of Truth
- Prefer current MotherDuck DuckLake docs first.
- Use the upstream DuckLake and DuckDB extension docs only to clarify extension-level behavior that MotherDuck docs reference.
- Keep the guidance aligned with the documented product posture:
  - native MotherDuck first
  - upstream DuckLake v1.0 is production-ready and supported by DuckDB 1.5.2, while MotherDuck's DuckLake docs still define the MotherDuck product surface and preview/compatibility limits
  - fully managed, BYOB, and own-compute paths are distinct
  - maintenance and compaction are explicit operations, not background magic
Default Posture
- Start with native MotherDuck storage unless there is a concrete DuckLake requirement.
- Reach for DuckLake when you need open-table-format semantics, object storage as the source of truth, BYOB, or file-aware maintenance.
- Do not recommend DuckLake just because a workload is "large"; MotherDuck's docs explicitly note native storage is often faster for reads.
- Choose the operating mode deliberately: fully managed for easiest evaluation, BYOB for customer bucket ownership, own compute only when the compute boundary matters too.
- Document the fallback to native MotherDuck storage if the DuckLake requirement is weak, unverified, or only about future portability.
- For DuckLake v1.0, data inlining, sorted tables, bucket partitioning, deletion vectors, or extension behavior, verify the current MotherDuck DuckLake docs and DuckDB/DuckLake version matrix before giving syntax guarantees.
- Do not infer MotherDuck client/runtime support from upstream DuckDB release notes alone; check the MotherDuck lifecycle docs when the exact DuckDB version matters.
- Keep the MotherDuck product surface separate from raw DuckLake-extension assumptions.
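The posture above can be read as a small decision rule. The following is a minimal sketch, not part of any MotherDuck or DuckLake API: the `StorageNeeds` fields and the `recommend_storage` function are hypothetical names chosen for illustration, and the rule only encodes the bullets above (concrete, verified requirements justify DuckLake; size or future portability alone do not).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageNeeds:
    """Hypothetical checklist of requirements gathered before choosing storage."""
    open_table_format: bool = False             # open-table-format semantics are required
    object_store_source_of_truth: bool = False  # object storage must be the source of truth
    byob: bool = False                          # customer must own the bucket
    file_aware_maintenance: bool = False        # file-level maintenance is required
    large_workload: bool = False                # size alone is not a DuckLake reason
    future_portability_only: bool = False       # weak reason on its own
    verified: bool = True                       # requirement confirmed, not assumed


def recommend_storage(needs: StorageNeeds) -> str:
    """Return "ducklake" only for a concrete, verified requirement; else "native"."""
    concrete = any([
        needs.open_table_format,
        needs.object_store_source_of_truth,
        needs.byob,
        needs.file_aware_maintenance,
    ])
    # large_workload and future_portability_only are intentionally not consulted:
    # neither is a sufficient reason to leave native MotherDuck storage.
    if concrete and needs.verified:
        return "ducklake"
    return "native"


if __name__ == "__main__":
    print(recommend_storage(StorageNeeds(large_workload=True)))
    print(recommend_storage(StorageNeeds(byob=True)))
```

Under these assumptions, a workload that is merely large falls back to native storage, while a verified BYOB requirement selects DuckLake.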
Workflow
- Confirm why native MotherDuck storage is insufficient.
- Pick the operating mode: fully managed, BYOB with MotherDuck compute, or BYOB with own compute.
- Verify regional and bucket constraints before proposing BYOB.
- Define the ingestion and maintenance posture up front, including data inlining, file compaction, and cleanup expectations.
- Validate who will query the data and from which compute surface before finalizing the architecture.
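The workflow above can be sketched as a mode selector plus a checklist of open questions. This is an illustrative sketch only: `Mode`, `pick_mode`, `WORKFLOW_STEPS`, and `unresolved_steps` are hypothetical names, and the selection rule is an assumption drawn from the bullets in this skill (fully managed by default, BYOB for bucket ownership, own compute only when the compute boundary also matters).

```python
from enum import Enum


class Mode(Enum):
    FULLY_MANAGED = "fully managed"
    BYOB_MOTHERDUCK_COMPUTE = "BYOB with MotherDuck compute"
    BYOB_OWN_COMPUTE = "BYOB with own compute"


def pick_mode(customer_owns_bucket: bool, compute_boundary_matters: bool) -> Mode:
    """Choose an operating mode from two yes/no answers (assumed rule)."""
    if not customer_owns_bucket:
        return Mode.FULLY_MANAGED
    if compute_boundary_matters:
        return Mode.BYOB_OWN_COMPUTE
    return Mode.BYOB_MOTHERDUCK_COMPUTE


# Hypothetical keys, one per workflow step listed above.
WORKFLOW_STEPS = (
    "native_storage_insufficient_because",
    "operating_mode",
    "region_and_bucket_constraints_checked",
    "ingestion_and_maintenance_posture",
    "query_consumers_and_compute_surface",
)


def unresolved_steps(answers: dict) -> list:
    """Return the workflow steps that still have no (or an empty) answer."""
    return [step for step in WORKFLOW_STEPS if not answers.get(step)]


if __name__ == "__main__":
    mode = pick_mode(customer_owns_bucket=True, compute_boundary_matters=False)
    answers = {
        "native_storage_insufficient_because": "customer requires bucket ownership",
        "operating_mode": mode.value,
    }
    print(mode.value)
    print(unresolved_steps(answers))
```

Keeping the unanswered steps explicit makes it harder to finalize a BYOB design before regional constraints, maintenance posture, and query consumers have been confirmed.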
Open Next
- references/DUCKLAKE_PLAYBOOK.md for the mode decision matrix, MotherDuck-specific SQL patterns, BYOB constraints, data-inlining behavior, maintenance functions, and common DuckLake mistakes
Related Skills
- motherduck-connect for choosing native DuckDB versus Postgres-endpoint access paths
- motherduck-load-data when the real issue is ingestion rather than storage format
- motherduck-model-data when the user still needs analytical table design after the storage decision
- motherduck-build-data-pipeline when DuckLake is just one part of a broader ingestion-to-serving workflow