mongodb-schema-design
MongoDB Schema Design
Data modeling patterns and anti-patterns for MongoDB, maintained by MongoDB. Contains 33 rules across 5 categories, prioritized by impact. A poor schema is the root cause of most MongoDB performance and cost issues; query tuning and indexes cannot fix a fundamentally wrong model.
When to Apply
Reference these guidelines when:
- Designing a new MongoDB schema from scratch
- Migrating from SQL/relational databases to MongoDB
- Reviewing existing data models for performance issues
- Troubleshooting slow queries or growing document sizes
- Deciding between embedding and referencing
- Modeling relationships (one-to-one, one-to-many, many-to-many)
- Implementing tree/hierarchical structures
- Seeing Atlas Schema Suggestions or Performance Advisor warnings
- Hitting the 16MB document limit
- Adding schema validation to existing collections
Rule Categories by Priority
| Priority | Category | Impact | Prefix | Rules |
|---|---|---|---|---|
| 1 | Schema Anti-Patterns | CRITICAL | antipattern- | 7 |
| 2 | Schema Fundamentals | HIGH | fundamental- | 5 |
| 3 | Relationship Patterns | HIGH | relationship- | 6 |
| 4 | Design Patterns | MEDIUM | pattern- | 12 |
| 5 | Schema Validation | MEDIUM | validation- | 3 |
Quick Reference
1. Schema Anti-Patterns (CRITICAL) - 7 rules
- antipattern-unbounded-arrays - Never allow arrays to grow without limit
- antipattern-bloated-documents - Keep documents under 16KB for working set
- antipattern-massive-arrays - Arrays over 1000 elements hurt performance
- antipattern-unnecessary-collections - Fewer collections, more embedding
- antipattern-excessive-lookups - Reduce $lookup by denormalizing
- antipattern-schema-drift - Enforce consistent structure across documents
- antipattern-unnecessary-indexes - Audit and remove unused or redundant indexes
2. Schema Fundamentals (HIGH) - 5 rules
- fundamental-embed-vs-reference - Decision framework for relationships
- fundamental-data-together - Data accessed together stored together
- fundamental-document-model - Embrace documents, avoid SQL patterns
- fundamental-schema-validation - Enforce structure with JSON Schema
- fundamental-16mb-awareness - Design around BSON document limit
3. Relationship Patterns (HIGH) - 6 rules
- relationship-one-to-one - Embed for simplicity, reference for independence
- relationship-one-to-few - Embed bounded arrays (addresses, phone numbers)
- relationship-one-to-many - Reference for large/unbounded relationships
- relationship-one-to-squillions - Reference massive child sets, store summaries
- relationship-many-to-many - Two-way referencing for bidirectional access
- relationship-tree-structures - Parent/child/materialized path patterns
4. Design Patterns (MEDIUM) - 12 rules
- pattern-archive - Move historical data to separate storage for performance
- pattern-attribute - Collapse many optional fields into key-value attributes
- pattern-bucket - Group time-series or IoT data into buckets
- pattern-time-series-collections - Use native time series collections when available
- pattern-extended-reference - Cache frequently-accessed related data
- pattern-subset - Store hot data in main doc, cold data elsewhere
- pattern-computed - Pre-calculate expensive aggregations
- pattern-outlier - Handle documents with exceptional array sizes
- pattern-polymorphic - Store heterogeneous documents with a type discriminator
- pattern-schema-versioning - Evolve schemas safely with version fields
5. Schema Validation (MEDIUM) - 3 rules
- validation-json-schema - Validate data types and structure at database level
- validation-action-levels - Choose warn vs error mode for validation
- validation-rollout-strategy - Introduce validation safely in production
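As a rough illustration of the validation rules above, the sketch below builds a `collMod` command document that attaches a `$jsonSchema` validator in warn mode. The helper name `validation_command`, the collection name, and the field names are hypothetical; the individual rule files remain the reference for the recommended rollout.

```python
def validation_command(collection, required, properties,
                       action="warn", level="moderate"):
    """Build a collMod command that adds a $jsonSchema validator.

    Illustrative only: the collection and field names passed in are
    assumptions, not taken from any real schema.
    """
    if action not in ("warn", "error"):
        raise ValueError("action must be 'warn' or 'error'")
    if level not in ("off", "strict", "moderate"):
        raise ValueError("level must be 'off', 'strict', or 'moderate'")
    return {
        # The command name must be the first key in the document.
        "collMod": collection,
        "validator": {
            "$jsonSchema": {
                "bsonType": "object",
                "required": list(required),
                "properties": dict(properties),
            }
        },
        "validationLevel": level,
        "validationAction": action,
    }


# Hypothetical usage: start in warn mode so existing writes keep working.
cmd = validation_command(
    "users",
    required=["email"],
    properties={"email": {"bsonType": "string"}},
)
```

Running such a command changes collection metadata, so it counts as a write operation and needs explicit approval. With PyMongo it could be sent through `db.command(cmd)`.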
Key Principle
"Data that is accessed together should be stored together."
This is MongoDB's core philosophy. Embedding related data eliminates joins, reduces round trips, and enables atomic updates. Reference only when you must.
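A minimal sketch of the principle, using plain Python dictionaries as stand-ins for documents. The data and field names (`addresses`, `customer_id`, and so on) are hypothetical and chosen only to contrast the two shapes.

```python
# Hypothetical documents, modeled as plain dicts (no database required).

# Embedded: the addresses live inside the customer document,
# so a single read returns everything that is used together.
customer_embedded = {
    "_id": 1,
    "name": "Ada",
    "addresses": [
        {"label": "home", "city": "London"},
        {"label": "work", "city": "Cambridge"},
    ],
}

# Referenced: addresses sit in their own collection and point back
# to the customer, so the application needs a second lookup.
customers = {1: {"_id": 1, "name": "Ada"}}
addresses = [
    {"_id": 10, "customer_id": 1, "label": "home", "city": "London"},
    {"_id": 11, "customer_id": 1, "label": "work", "city": "Cambridge"},
]


def cities_embedded(doc):
    """One document read: the data is already together."""
    return [a["city"] for a in doc["addresses"]]


def cities_referenced(customer_id):
    """Two reads: fetch the customer, then find its addresses."""
    customer = customers[customer_id]
    return [a["city"] for a in addresses
            if a["customer_id"] == customer["_id"]]
```

Both functions return the same cities; the difference is how many reads the application issues and whether the customer and its addresses can be updated in one atomic operation.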
Decision Framework
| Relationship | Cardinality | Access Pattern | Recommendation |
|---|---|---|---|
| One-to-One | 1:1 | Always together | Embed |
| One-to-Few | 1:N (N < 100) | Usually together | Embed array |
| One-to-Many | 1:N (N > 100) | Often separate | Reference |
| Many-to-Many | M:N | Varies | Two-way reference |
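The table can be read as a small decision function. The sketch below is one possible encoding, not an official algorithm: the function name `recommend` is hypothetical, it looks only at cardinality (the access pattern column still needs human judgment), and it treats N = 100, which the table leaves unspecified, as "few".

```python
def recommend(relationship, expected_children=None):
    """Return the table's recommendation for a relationship.

    relationship: "one-to-one", "one-to-many", or "many-to-many".
    expected_children: estimated N for a 1:N relationship.
    """
    if relationship == "one-to-one":
        return "embed"
    if relationship == "many-to-many":
        return "two-way reference"
    if relationship == "one-to-many":
        if expected_children is None:
            raise ValueError("expected_children is required for 1:N")
        # Assumption: the boundary value 100 is grouped with "few".
        if expected_children <= 100:
            return "embed array"
        return "reference"
    raise ValueError(f"unknown relationship: {relationship}")
```

For example, `recommend("one-to-many", 5)` returns `"embed array"`, while `recommend("one-to-many", 50000)` returns `"reference"`.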
How to Use
Read individual rule files for detailed explanations and code examples:
rules/antipattern-unbounded-arrays.md
rules/relationship-one-to-many.md
rules/_sections.md
Each rule file contains:
- Brief explanation of why it matters
- Incorrect code example with explanation
- Correct code example with explanation
- "When NOT to use" exceptions
- Performance impact and metrics
- Verification diagnostics
How These Rules Work
Recommendations with Verification
Every rule in this skill provides:
- A recommendation based on best practices
- A verification checklist of things that should be confirmed
- Commands to verify so you can check before implementing
- MCP integration for automatic verification when connected
Why Verification Matters
I analyze code patterns, but I can't see your actual data without a database connection. This means I might suggest:
- Fixing an "unbounded array" that's actually small and bounded
- Restructuring a schema that works well for your access patterns
- Adding validation when documents already conform to the schema
Always verify before implementing. Each rule includes verification commands.
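As an example of what such a verification can look like, the sketch below builds a read-only aggregation pipeline that summarizes how long an array field actually is. The helper name `array_size_pipeline` and the field name `comments` are hypothetical; the verification commands in each rule file take precedence.

```python
def array_size_pipeline(field, sample_size=None):
    """Build a pipeline reporting max and average length of an array field."""
    path = "$" + field
    pipeline = []
    if sample_size is not None:
        # Optional: inspect a random sample instead of every document.
        pipeline.append({"$sample": {"size": sample_size}})
    # Count elements; missing or non-array values count as length 0.
    pipeline.append({
        "$project": {
            "_id": 0,
            "n": {"$cond": [{"$isArray": path}, {"$size": path}, 0]},
        }
    })
    # Collapse everything into a single summary document.
    pipeline.append({
        "$group": {
            "_id": None,
            "max": {"$max": "$n"},
            "avg": {"$avg": "$n"},
            "docs": {"$sum": 1},
        }
    })
    return pipeline


# Hypothetical usage: check whether "comments" is really unbounded.
pipeline = array_size_pipeline("comments", sample_size=1000)
```

The resulting list can be passed to any read-only aggregate call, for instance PyMongo's `collection.aggregate(pipeline)`. A small `max` suggests the array is bounded in practice and may not need restructuring.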
MongoDB MCP Integration
For automatic verification, connect the MongoDB MCP Server:
Option 1: Connection String
{
  "mcpServers": {
    "mongodb": {
      "command": "npx",
      "args": ["-y", "mongodb-mcp-server", "--readOnly"],
      "env": {
        "MDB_MCP_CONNECTION_STRING": "mongodb+srv://user:pass@cluster.mongodb.net/mydb"
      }
    }
  }
}
Option 2: Local MongoDB
{
  "mcpServers": {
    "mongodb": {
      "command": "npx",
      "args": ["-y", "mongodb-mcp-server", "--readOnly"],
      "env": {
        "MDB_MCP_CONNECTION_STRING": "mongodb://localhost:27017/mydb"
      }
    }
  }
}
⚠️ Security: Use --readOnly for safety. Remove only if you need write operations.
When connected, I can automatically:
- Infer schema via mcp__mongodb__collection-schema
- Measure document/array sizes via mcp__mongodb__aggregate
- Check collection statistics via mcp__mongodb__db-stats
⚠️ Action Policy
I will NEVER execute write operations without your explicit approval.
| Operation Type | MCP Tools | Action |
|---|---|---|
| Read (Safe) | find, aggregate, collection-schema, db-stats, count | I may run automatically to verify |
| Write (Requires Approval) | update-many, insert-many, create-collection | I will show the command and wait for your "yes" |
| Destructive (Requires Approval) | delete-many, drop-collection, drop-database | I will warn you and require explicit confirmation |
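The policy in this table could be expressed as a simple gate, sketched below. The tool names come from the table; the function name `may_run` and the choice to treat unknown tools like writes are assumptions made for illustration.

```python
READ_TOOLS = {"find", "aggregate", "collection-schema", "db-stats", "count"}
WRITE_TOOLS = {"update-many", "insert-many", "create-collection"}
DESTRUCTIVE_TOOLS = {"delete-many", "drop-collection", "drop-database"}


def may_run(tool, approved=False):
    """Return True if the tool may run under the action policy.

    approved: whether the user has explicitly said "yes" to this command.
    """
    if tool in READ_TOOLS:
        return True
    if tool in WRITE_TOOLS or tool in DESTRUCTIVE_TOOLS:
        return approved
    # Assumption: anything not listed is handled as a write.
    return approved
```

This sketch checks only the tool name. An `aggregate` pipeline that ends in `$out` or `$merge` writes data, so a stricter gate would also inspect the pipeline stages.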
When I recommend schema changes or data modifications:
- I'll explain what I want to do and why
- I'll show you the exact command
- I'll wait for your approval before executing
- If you say "go ahead" or "yes", only then will I run it
Your database, your decision. I'm here to advise, not to act unilaterally.
Working Together
If you're not sure about a recommendation:
- Run the verification commands I provide
- Share the output with me
- I'll adjust my recommendation based on your actual data
We're a team—let's get this right together.
Full Compiled Document
For the complete guide with all rules expanded: AGENTS.md