seed-data-generator
Seed Data Generator Protocol
This skill helps developers populate empty local or staging databases with massive amounts of realistic data for load testing and UI development.
Core assumption: Simple random strings (asdfgh) are useless for UI testing. Seed data must look real and respect foreign key constraints so that inserts succeed.
1. Schema Analysis & Topological Sort
Before generating data, read the schema and understand the relationships:
- `orders` depends on `users` and `products`.
- `order_items` depends on `orders` and `products`.
- Topological Sort (Insert Order): `users` -> `products` -> `orders` -> `order_items`.

(Never try to insert an order item before the order exists.)
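The insert order can be derived mechanically from the foreign key graph. The following is a minimal sketch using Python's standard `graphlib` module; the `dependencies` dictionary and the `insert_order` helper are illustrative assumptions, and a real seeder would build the dictionary by reading the schema.

```python
from graphlib import TopologicalSorter

# Hypothetical schema: each table maps to the set of tables it references
# through foreign keys (its dependencies).
dependencies = {
    "users": set(),
    "products": set(),
    "orders": {"users", "products"},
    "order_items": {"orders", "products"},
}


def insert_order(deps):
    """Return table names so that every table comes after the tables it references."""
    return list(TopologicalSorter(deps).static_order())


order = insert_order(dependencies)
print(" -> ".join(order))
```

If the schema contains circular foreign keys, `static_order()` raises `graphlib.CycleError`, which is a useful signal that one of the references has to be inserted as NULL first and filled in afterwards.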
2. Smart Field Generation (Faking)
Map column names and data types to specific Faker generators:
- `email` -> Faker.Internet.Email()
- `first_name`, `last_name`, `full_name` -> the matching Faker.Person generator (for example, Faker.Person.FullName() for `full_name`)
- `status` (VARCHAR) -> Random pick from `('active', 'pending', 'cancelled')`.
- `description`, `bio` -> Faker.Lorem.Paragraph()
- `created_at` -> Random timestamp between `NOW() - 1 year` and `NOW()`.
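The column-to-generator mapping can be sketched as a simple lookup table. The example below uses only the Python standard library, so the `fake_*` functions and the name lists are stand-ins (assumptions for illustration); a real seeder would call a Faker library in their place.

```python
import random
from datetime import datetime, timedelta

# Stand-in sample data; a real seeder would delegate to a Faker library.
FIRST_NAMES = ["John", "Sarah", "Amir", "Lena"]
LAST_NAMES = ["Smith", "Garcia", "Chen", "Okafor"]


def fake_full_name(rng):
    return f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"


def fake_email(rng):
    first = rng.choice(FIRST_NAMES).lower()
    last = rng.choice(LAST_NAMES).lower()
    return f"{first}.{last}@example.com"


def fake_status(rng):
    return rng.choice(["active", "pending", "cancelled"])


def fake_created_at(rng, now=None):
    # Random timestamp between one year ago and now.
    now = now or datetime.now()
    return now - timedelta(seconds=rng.randint(0, 365 * 24 * 3600))


# Column name -> generator function.
GENERATORS = {
    "email": fake_email,
    "full_name": fake_full_name,
    "status": fake_status,
    "created_at": fake_created_at,
}


def generate_row(columns, rng):
    """Build one row; columns without a mapping are left as None."""
    return {col: GENERATORS[col](rng) if col in GENERATORS else None for col in columns}


rng = random.Random(42)  # fixed seed so repeated runs produce the same data
row = generate_row(["email", "full_name", "status", "created_at", "nickname"], rng)
print(row)
```

Seeding the random generator keeps the output reproducible, which makes UI screenshots and load-test runs comparable between runs.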
3. Output Generation
Provide an executable seeder script (TypeScript/Prisma, Python, or raw SQL depending on the user's stack). Raw SQL is the default.
Required Outputs (Must write BOTH to docs/database-report/):
1. **Human-Readable Markdown (`docs/database-report/seed-data-report.md`)**
### 🔗 Dependency Graph Resolution
Insert Order:
1. `companies`
2. `users` (Depends on `companies`)
3. `posts` (Depends on `users`)
### 🛠️ Seed Script (Raw SQL)
```sql
-- Disable triggers temporarily for fast bulk inserts
-- (PostgreSQL-specific; this also skips foreign key checks and needs elevated privileges)
SET session_replication_role = 'replica';
-- 1. Insert Companies
INSERT INTO companies (id, name, created_at) VALUES
('c1', 'Acme Corp', '2023-01-15 10:00:00'),
('c2', 'Globex', '2023-02-20 11:30:00');
-- 2. Insert Users
INSERT INTO users (id, company_id, email, first_name) VALUES
('u1', 'c1', 'john.acme@example.com', 'John'),
('u2', 'c2', 'sarah.globex@example.com', 'Sarah');
-- Re-enable triggers
SET session_replication_role = 'origin';
```
2. **Machine-Readable JSON (`docs/database-report/seed-data-output.json`)**
```json
{
"skill": "seed-data-generator",
"insertion_order": ["companies", "users", "posts"],
"faker_mappings": {
"users.email": "Faker.Internet.Email()",
"companies.name": "Faker.Company.CompanyName()"
},
"rows_generated": {
"companies": 2,
"users": 2
}
}
```
Guardrails
- Performance: For requests of more than 10,000 rows, do not output literal SQL `INSERT` statements. Instead, output a Python/Node script using `faker` and fast bulk `COPY` commands.
- Unique Constraints: Be extremely careful with random generators hitting duplicate values on `UNIQUE` columns. Append `id` or sequence numbers to emails/usernames if necessary.
- Environment: Warn the user to NEVER run seed scripts in production.
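The three guardrails can be combined in one small script. This is a sketch under stated assumptions: the helper names (`assert_not_production`, `user_rows`, `write_copy_csv`) and the `users (id, email)` column layout are hypothetical, and the name list stands in for a Faker library. It streams rows into CSV suitable for a bulk load instead of emitting literal `INSERT` statements, and appends a sequence number so emails never collide.

```python
import csv
import io
import random

FIRST_NAMES = ["john", "sarah", "amir", "lena"]  # stand-in for a Faker library


def assert_not_production(env):
    """Refuse to continue when the given environment name means production."""
    if env.lower() in ("prod", "production"):
        raise RuntimeError("Refusing to seed a production database")


def user_rows(count, rng):
    """Yield (id, email) rows; the sequence number keeps every email unique."""
    for i in range(1, count + 1):
        name = rng.choice(FIRST_NAMES)
        yield (i, f"{name}.{i}@example.com")


def write_copy_csv(rows, out):
    """Write rows as CSV, e.g. for PostgreSQL:
    COPY users (id, email) FROM STDIN WITH (FORMAT csv)
    Returns the number of rows written."""
    writer = csv.writer(out)
    written = 0
    for row in rows:
        writer.writerow(row)
        written += 1
    return written


assert_not_production("staging")  # in practice, pass your own environment setting
buffer = io.StringIO()  # a real script would write to a file or a COPY stream
written = write_copy_csv(user_rows(20_000, random.Random(7)), buffer)
print(f"{written} rows ready for bulk load")
```

Because the rows come from a generator, memory use stays flat regardless of the row count; only the output target needs to change for larger volumes.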