database
SKILL.md
Database Skill
Task Router
| Domain | Reference File | When to Use |
|---|---|---|
| SQL optimization | references/sql-optimization.md |
EXPLAIN analysis, slow queries, index strategy, CTEs, window functions |
| PostgreSQL | references/postgresql.md |
PG-specific features, JSONB, full-text search, partitioning, pgvector |
| NoSQL | references/nosql.md |
MongoDB, Redis, DynamoDB, document/key-value/wide-column patterns |
| ORM patterns | references/orm-patterns.md |
Prisma, Drizzle, SQLAlchemy, TypeORM — when to drop to raw SQL |
| Migrations | references/migrations.md |
Schema changes, zero-downtime deploys, rollbacks, data backfills |
| Data modeling | references/data-modeling.md |
Schema design, normalization, relationships, audit trails |
| Indexing deep dive | references/indexing-deep-dive.md |
B-tree, GIN, BRIN, GiST, SP-GiST, partial indexes, expression indexes, fill factor, HOT updates, index bloat, REINDEX CONCURRENTLY, index selectivity, covering indexes |
| Replication & sharding | references/replication-sharding.md |
Streaming replication, logical replication, replication slots, Patroni, pg_auto_failover, CDC, Debezium, read replicas, range/hash/directory sharding, hot spot avoidance, CockroachDB, YugabyteDB, re-sharding |
| Query optimization case studies | references/query-optimization-case-studies.md |
EXPLAIN ANALYZE deep dive, N+1 queries, missing FK indexes, query rewriting, keyset pagination, pg_stat_statements, plan regression, cost model tuning |
| Connection pooling | references/connection-pooling-operations.md |
PgBouncer modes/config/sizing, pool size formula, connection leaks, idle in transaction, multi-tenant pooling, RDS Proxy, Supavisor |
| Transactions & concurrency | references/transactions-concurrency.md |
Isolation levels, MVCC, dead tuples, deadlock avoidance, SELECT FOR UPDATE, SKIP LOCKED, advisory locks, serialization failure retry, optimistic vs pessimistic locking, write skew |
| Monitoring & observability | references/monitoring-observability.md |
pg_stat_statements, cache hit ratio, dead tuple ratio, replication lag, slow query log, Prometheus postgres_exporter, Grafana dashboards, auto_explain, pgBadger, capacity planning |
Before Starting
Answer these questions before writing any database code:
- What database engine? PostgreSQL, MySQL, SQLite, MongoDB, Redis, DynamoDB — each has different strengths.
- Read-heavy or write-heavy? Read-heavy favors denormalization and caching. Write-heavy favors normalization and append-only patterns.
- What scale? Single node < 1M rows needs no heroics. 10M+ rows needs indexing strategy. 100M+ needs partitioning/sharding.
- What consistency model? Strong consistency (ACID) vs eventual consistency — pick based on business requirements.
- Who runs migrations? CI/CD pipeline, manual DBA approval, or ORM auto-sync — determines migration tooling.
Quick Reference
Index Types
| Type | Use Case | Engine |
|---|---|---|
| B-tree | Default, range queries, sorting | All |
| Hash | Exact equality only | PG, MySQL (Memory) |
| GIN | JSONB, arrays, full-text search | PG |
| GiST | Geometric, range types, proximity | PG |
| BRIN | Large sequential tables (timestamps) | PG |
| Partial | Filter on subset of rows | PG |
| Covering | Include columns to avoid table lookup | PG, MySQL 8+ |
Common EXPLAIN Flags
| Flag | Meaning | Action |
|---|---|---|
| Seq Scan | Full table scan | Add index or check WHERE clause |
| Index Scan | Using index | Good — verify selectivity |
| Bitmap Heap Scan | Multiple index matches | Normal for OR conditions |
| Nested Loop | Row-by-row join | Fine for small sets, bad for large |
| Hash Join | Hash-based join | Good for large equi-joins |
| Sort | In-memory or disk sort | Add index matching ORDER BY |
SQL Cheat Sheet
-- Upsert (PostgreSQL)
INSERT INTO t (id, val) VALUES (1, 'x')
ON CONFLICT (id) DO UPDATE SET val = EXCLUDED.val;
-- Window function: running total
SELECT *, SUM(amount) OVER (PARTITION BY user_id ORDER BY created_at) AS running_total
FROM orders;
-- CTE with recursive
WITH RECURSIVE tree AS (
SELECT id, parent_id, name, 0 AS depth FROM categories WHERE parent_id IS NULL
UNION ALL
SELECT c.id, c.parent_id, c.name, t.depth + 1 FROM categories c JOIN tree t ON c.parent_id = t.id
) SELECT * FROM tree;
Decision Trees
SQL vs NoSQL
- Default to SQL unless you have a specific reason not to.
- Use MongoDB when: schema is truly unpredictable, document-oriented access, rapid prototyping with evolving schema.
- Use Redis when: caching, session storage, rate limiting, pub/sub, leaderboards.
- Use DynamoDB when: AWS-native, single-digit ms latency at any scale, known access patterns upfront.
Which SQL Engine
- PostgreSQL: Default choice. Best extension ecosystem, JSONB, full-text search, pgvector.
- MySQL: Legacy apps, WordPress, simple replication. Use 8.0+ for CTEs and window functions.
- SQLite: Embedded, local-first apps, testing, single-writer workloads. Surprisingly powerful at scale with WAL mode.
Which ORM
- Prisma: TypeScript projects, excellent DX, schema-first, auto-migrations. Avoid for complex raw SQL needs.
- Drizzle: TypeScript projects wanting SQL-like syntax, lighter weight, better raw SQL escape hatch.
- SQLAlchemy: Python gold standard. Use Core for performance, ORM for productivity.
- TypeORM: Legacy TS projects. Prefer Prisma or Drizzle for new projects.
Core Principles
- Normalize first, denormalize for measured performance problems. Start at 3NF. Only denormalize when you have query metrics proving a join is the bottleneck.
- Every query must have an index strategy. Run EXPLAIN on every query that touches production. No exceptions.
- Migrations are code. Version-control every schema change. Never modify production schemas by hand.
- Transactions protect invariants. Use transactions for any multi-statement operation that must be atomic. Set appropriate isolation levels.
- Connection pooling is mandatory in production. Use PgBouncer, RDS Proxy, or ORM-level pooling. Never open unbounded connections.
- Backups are worthless until tested. Schedule regular restore drills. Point-in-time recovery > daily dumps.
- Prefer append-only patterns for audit trails. Soft deletes + created_at/updated_at on every table. Use triggers or application-level hooks.
Anti-Patterns
| Anti-Pattern | Why It Hurts | Fix |
|---|---|---|
| No indexes on foreign keys | JOIN performance degrades linearly | Add B-tree index on every FK column |
| N+1 queries | 1 query + N child queries = N+1 round trips | Use JOIN, subquery, or ORM eager loading |
| Schema changes without migrations | Drift between environments, unreproducible state | Use migration tool (Prisma Migrate, Alembic, Flyway) |
| SELECT * in production | Fetches unnecessary columns, breaks on schema change | Explicitly list needed columns |
| Missing connection limits | One runaway service exhausts DB connections | Configure pool size, use PgBouncer |
| Storing money as FLOAT | Floating point rounding errors | Use DECIMAL/NUMERIC or integer cents |
| Unbounded queries | No LIMIT on user-facing queries = OOM risk | Always paginate with LIMIT/OFFSET or cursor |
| Mixing DDL and DML in one transaction | Some engines auto-commit on DDL, breaking atomicity | Separate DDL migrations from data migrations |
Weekly Installs
1
Repository
george11642/geo…-pluginsGitHub Stars
2
First Seen
9 days ago
Security Audits
Installed on
amp1
cline1
opencode1
cursor1
kimi-cli1
codex1