Database Schema Designer
Design production-ready database schemas with best practices built-in.
Quick Start
Just describe your data model:
design a schema for an e-commerce platform with users, products, orders
You'll get a complete SQL schema like:
CREATE TABLE users (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
email VARCHAR(255) UNIQUE NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE orders (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
user_id BIGINT NOT NULL REFERENCES users(id),
total DECIMAL(10,2) NOT NULL,
INDEX idx_orders_user (user_id)
);
What to include in your request:
- Entities (users, products, orders)
- Key relationships (users have orders, orders have items)
- Scale hints (high-traffic, millions of records)
- Database preference (SQL/NoSQL) - defaults to SQL if not specified
Triggers
| Trigger |
Example |
design schema |
"design a schema for user authentication" |
database design |
"database design for multi-tenant SaaS" |
create tables |
"create tables for a blog system" |
schema for |
"schema for inventory management" |
model data |
"model data for real-time analytics" |
I need a database |
"I need a database for tracking orders" |
design NoSQL |
"design NoSQL schema for product catalog" |
Key Terms
| Term |
Definition |
| Normalization |
Organizing data to reduce redundancy (1NF → 2NF → 3NF) |
| 3NF |
Third Normal Form - no transitive dependencies between columns |
| OLTP |
Online Transaction Processing - write-heavy, needs normalization |
| OLAP |
Online Analytical Processing - read-heavy, benefits from denormalization |
| Foreign Key (FK) |
Column that references another table's primary key |
| Index |
Data structure that speeds up queries (at cost of slower writes) |
| Access Pattern |
How your app reads/writes data (queries, joins, filters) |
| Denormalization |
Intentionally duplicating data to speed up reads |
Quick Reference
| Task |
Approach |
Key Consideration |
| New schema |
Normalize to 3NF first |
Domain modeling over UI |
| SQL vs NoSQL |
Access patterns decide |
Read/write ratio matters |
| Primary keys |
INT or UUID |
UUID for distributed systems |
| Foreign keys |
Always constrain |
ON DELETE strategy critical |
| Indexes |
FKs + WHERE columns |
Column order matters |
| Migrations |
Always reversible |
Backward compatible first |
Process Overview
Your Data Requirements
|
v
+-----------------------------------------------------+
| Phase 1: ANALYSIS |
| * Identify entities and relationships |
| * Determine access patterns (read vs write heavy) |
| * Choose SQL or NoSQL based on requirements |
+-----------------------------------------------------+
|
v
+-----------------------------------------------------+
| Phase 2: DESIGN |
| * Normalize to 3NF (SQL) or embed/reference (NoSQL) |
| * Define primary keys and foreign keys |
| * Choose appropriate data types |
| * Add constraints (UNIQUE, CHECK, NOT NULL) |
+-----------------------------------------------------+
|
v
+-----------------------------------------------------+
| Phase 3: OPTIMIZE |
| * Plan indexing strategy |
| * Consider denormalization for read-heavy queries |
| * Add timestamps (created_at, updated_at) |
+-----------------------------------------------------+
|
v
+-----------------------------------------------------+
| Phase 4: MIGRATE |
| * Generate migration scripts (up + down) |
| * Ensure backward compatibility |
| * Plan zero-downtime deployment |
+-----------------------------------------------------+
|
v
Production-Ready Schema
Commands
| Command |
When to Use |
Action |
design schema for {domain} |
Starting fresh |
Full schema generation |
normalize {table} |
Fixing existing table |
Apply normalization rules |
add indexes for {table} |
Performance issues |
Generate index strategy |
migration for {change} |
Schema evolution |
Create reversible migration |
review schema |
Code review |
Audit existing schema |
Workflow: Start with design schema → iterate with normalize → optimize with add indexes → evolve with migration
Core Principles
| Principle |
WHY |
Implementation |
| Model the Domain |
UI changes, domain doesn't |
Entity names reflect business concepts |
| Data Integrity First |
Corruption is costly to fix |
Constraints at database level |
| Optimize for Access Pattern |
Can't optimize for both |
OLTP: normalized, OLAP: denormalized |
| Plan for Scale |
Retrofitting is painful |
Index strategy + partitioning plan |
Anti-Patterns
| Avoid |
Why |
Instead |
| VARCHAR(255) everywhere |
Wastes storage, hides intent |
Size appropriately per field |
| FLOAT for money |
Rounding errors |
DECIMAL(10,2) |
| Missing FK constraints |
Orphaned data |
Always define foreign keys |
| No indexes on FKs |
Slow JOINs |
Index every foreign key |
| Storing dates as strings |
Can't compare/sort |
DATE, TIMESTAMP types |
| SELECT * in queries |
Fetches unnecessary data |
Explicit column lists |
| Non-reversible migrations |
Can't rollback |
Always write DOWN migration |
| Adding NOT NULL without default |
Breaks existing rows |
Add nullable, backfill, then constrain |
Verification Checklist
After designing a schema:
Normal Forms
| Form |
Rule |
Violation Example |
| 1NF |
Atomic values, no repeating groups |
product_ids = '1,2,3' |
| 2NF |
1NF + no partial dependencies |
customer_name in order_items |
| 3NF |
2NF + no transitive dependencies |
country derived from postal_code |
1st Normal Form (1NF)
CREATE TABLE orders (
id INT PRIMARY KEY,
product_ids VARCHAR(255)
);
CREATE TABLE orders (
id INT PRIMARY KEY,
customer_id INT
);
CREATE TABLE order_items (
id INT PRIMARY KEY,
order_id INT REFERENCES orders(id),
product_id INT
);
2nd Normal Form (2NF)
CREATE TABLE order_items (
order_id INT,
product_id INT,
customer_name VARCHAR(100),
PRIMARY KEY (order_id, product_id)
);
CREATE TABLE customers (
id INT PRIMARY KEY,
name VARCHAR(100)
);
3rd Normal Form (3NF)
CREATE TABLE customers (
id INT PRIMARY KEY,
postal_code VARCHAR(10),
country VARCHAR(50)
);
CREATE TABLE postal_codes (
code VARCHAR(10) PRIMARY KEY,
country VARCHAR(50)
);
When to Denormalize
| Scenario |
Denormalization Strategy |
| Read-heavy reporting |
Pre-calculated aggregates |
| Expensive JOINs |
Cached derived columns |
| Analytics dashboards |
Materialized views |
CREATE TABLE orders (
id INT PRIMARY KEY,
customer_id INT,
total_amount DECIMAL(10,2),
item_count INT
);
String Types
| Type |
Use Case |
Example |
| CHAR(n) |
Fixed length |
State codes, ISO dates |
| VARCHAR(n) |
Variable length |
Names, emails |
| TEXT |
Long content |
Articles, descriptions |
email VARCHAR(255)
phone VARCHAR(20)
country_code CHAR(2)
Numeric Types
| Type |
Range |
Use Case |
| TINYINT |
-128 to 127 |
Age, status codes |
| SMALLINT |
-32K to 32K |
Quantities |
| INT |
-2.1B to 2.1B |
IDs, counts |
| BIGINT |
Very large |
Large IDs, timestamps |
| DECIMAL(p,s) |
Exact precision |
Money |
| FLOAT/DOUBLE |
Approximate |
Scientific data |
price DECIMAL(10, 2)
price FLOAT
Date/Time Types
DATE
TIME
DATETIME
TIMESTAMP
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
Boolean
is_active BOOLEAN DEFAULT TRUE
is_active TINYINT(1) DEFAULT 1
When to Create Indexes
| Always Index |
Reason |
| Foreign keys |
Speed up JOINs |
| WHERE clause columns |
Speed up filtering |
| ORDER BY columns |
Speed up sorting |
| Unique constraints |
Enforced uniqueness |
CREATE INDEX idx_orders_customer ON orders(customer_id);
CREATE INDEX idx_orders_status_date ON orders(status, created_at);
Index Types
| Type |
Best For |
Example |
| B-Tree |
Ranges, equality |
price > 100 |
| Hash |
Exact matches only |
email = 'x@y.com' |
| Full-text |
Text search |
MATCH AGAINST |
| Partial |
Subset of rows |
WHERE is_active = true |
Composite Index Order
CREATE INDEX idx_customer_status ON orders(customer_id, status);
SELECT * FROM orders WHERE customer_id = 123;
SELECT * FROM orders WHERE customer_id = 123 AND status = 'pending';
SELECT * FROM orders WHERE status = 'pending';
Rule: Most selective column first, or column most queried alone.
Index Pitfalls
| Pitfall |
Problem |
Solution |
| Over-indexing |
Slow writes |
Only index what's queried |
| Wrong column order |
Unused index |
Match query patterns |
| Missing FK indexes |
Slow JOINs |
Always index FKs |
Primary Keys
id INT AUTO_INCREMENT PRIMARY KEY
id CHAR(36) PRIMARY KEY DEFAULT (UUID())
PRIMARY KEY (student_id, course_id)
Foreign Keys
FOREIGN KEY (customer_id) REFERENCES customers(id)
ON DELETE CASCADE
ON DELETE RESTRICT
ON DELETE SET NULL
ON UPDATE CASCADE
| Strategy |
Use When |
| CASCADE |
Dependent data (order_items) |
| RESTRICT |
Important references (prevent accidents) |
| SET NULL |
Optional relationships |
Other Constraints
email VARCHAR(255) UNIQUE NOT NULL
UNIQUE (student_id, course_id)
price DECIMAL(10,2) CHECK (price >= 0)
discount INT CHECK (discount BETWEEN 0 AND 100)
name VARCHAR(100) NOT NULL
One-to-Many
CREATE TABLE orders (
id INT PRIMARY KEY,
customer_id INT NOT NULL REFERENCES customers(id)
);
CREATE TABLE order_items (
id INT PRIMARY KEY,
order_id INT NOT NULL REFERENCES orders(id) ON DELETE CASCADE,
product_id INT NOT NULL,
quantity INT NOT NULL
);
Many-to-Many
CREATE TABLE enrollments (
student_id INT REFERENCES students(id) ON DELETE CASCADE,
course_id INT REFERENCES courses(id) ON DELETE CASCADE,
enrolled_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (student_id, course_id)
);
Self-Referencing
CREATE TABLE employees (
id INT PRIMARY KEY,
name VARCHAR(100) NOT NULL,
manager_id INT REFERENCES employees(id)
);
Polymorphic
CREATE TABLE comments (
id INT PRIMARY KEY,
content TEXT NOT NULL,
post_id INT REFERENCES posts(id),
photo_id INT REFERENCES photos(id),
CHECK (
(post_id IS NOT NULL AND photo_id IS NULL) OR
(post_id IS NULL AND photo_id IS NOT NULL)
)
);
CREATE TABLE comments (
id INT PRIMARY KEY,
content TEXT NOT NULL,
commentable_type VARCHAR(50) NOT NULL,
commentable_id INT NOT NULL
);
Embedding vs Referencing
| Factor |
Embed |
Reference |
| Access pattern |
Read together |
Read separately |
| Relationship |
1:few |
1:many |
| Document size |
Small |
Approaching 16MB |
| Update frequency |
Rarely |
Frequently |
Embedded Document
{
"_id": "order_123",
"customer": {
"id": "cust_456",
"name": "Jane Smith",
"email": "jane@example.com"
},
"items": [
{ "product_id": "prod_789", "quantity": 2, "price": 29.99 }
],
"total": 109.97
}
Referenced Document
{
"_id": "order_123",
"customer_id": "cust_456",
"item_ids": ["item_1", "item_2"],
"total": 109.97
}
MongoDB Indexes
db.users.createIndex({ email: 1 }, { unique: true });
db.orders.createIndex({ customer_id: 1, created_at: -1 });
db.articles.createIndex({ title: "text", content: "text" });
db.stores.createIndex({ location: "2dsphere" });
Migration Best Practices
| Practice |
WHY |
| Always reversible |
Need to rollback |
| Backward compatible |
Zero-downtime deploys |
| Schema before data |
Separate concerns |
| Test on staging |
Catch issues early |
Adding a Column (Zero-Downtime)
ALTER TABLE users ADD COLUMN phone VARCHAR(20);
UPDATE users SET phone = '' WHERE phone IS NULL;
ALTER TABLE users MODIFY phone VARCHAR(20) NOT NULL;
Renaming a Column (Zero-Downtime)
ALTER TABLE users ADD COLUMN email_address VARCHAR(255);
UPDATE users SET email_address = email;
ALTER TABLE users DROP COLUMN email;
Migration Template
BEGIN;
ALTER TABLE users ADD COLUMN phone VARCHAR(20);
CREATE INDEX idx_users_phone ON users(phone);
COMMIT;
BEGIN;
DROP INDEX idx_users_phone ON users;
ALTER TABLE users DROP COLUMN phone;
COMMIT;
Query Analysis
EXPLAIN SELECT * FROM orders
WHERE customer_id = 123 AND status = 'pending';
| Look For |
Meaning |
| type: ALL |
Full table scan (bad) |
| type: ref |
Index used (good) |
| key: NULL |
No index used |
| rows: high |
Many rows scanned |
N+1 Query Problem
orders = db.query("SELECT * FROM orders")
for order in orders:
customer = db.query(f"SELECT * FROM customers WHERE id = {order.customer_id}")
results = db.query("""
SELECT orders.*, customers.name
FROM orders
JOIN customers ON orders.customer_id = customers.id
""")
Optimization Techniques
| Technique |
When to Use |
| Add indexes |
Slow WHERE/ORDER BY |
| Denormalize |
Expensive JOINs |
| Pagination |
Large result sets |
| Caching |
Repeated queries |
| Read replicas |
Read-heavy load |
| Partitioning |
Very large tables |
Extension Points
- Database-Specific Patterns: Add MySQL vs PostgreSQL vs SQLite variations
- Advanced Patterns: Time-series, event sourcing, CQRS, multi-tenancy
- ORM Integration: TypeORM, Prisma, SQLAlchemy patterns
- Monitoring: Query performance tracking, slow query alerts