RAG Architect
Design, tune, and evaluate production RAG pipelines with three deterministic tools. Run the tools against the actual corpus and requirements; do not pick chunk sizes or databases by intuition.
Hard rules
- Never present model names or vendor prices as current facts. Embedding models and vector-DB pricing rot in months. Recommend a tier (see table below), name a current-generation candidate, and tell the user to verify against the provider's live pricing page.
- Every design ends with an evaluation run. A RAG design without `retrieval_evaluator.py` numbers is a hypothesis, not a deliverable.
- Chunking is corpus-driven. Run `chunking_optimizer.py` on the real documents before choosing a strategy.
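To make "evaluation numbers" concrete, here is a minimal sketch of two common retrieval metrics, recall@k and mean reciprocal rank. It is an illustration only: it does not reproduce `retrieval_evaluator.py`, whose interface and output format are not shown here, and the names `recall_at_k`, `reciprocal_rank`, `evaluate`, `runs`, and `qrels` are hypothetical.

```python
from typing import Dict, List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(retrieved: List[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(
    runs: Dict[str, List[str]],   # query id -> ranked document ids (hypothetical shape)
    qrels: Dict[str, Set[str]],   # query id -> relevant document ids (hypothetical shape)
    k: int = 5,
) -> Dict[str, float]:
    """Average both metrics over every query that has relevance judgments."""
    query_ids = [q for q in runs if q in qrels]
    if not query_ids:
        raise ValueError("no queries with relevance judgments")
    n = len(query_ids)
    return {
        f"recall@{k}": sum(recall_at_k(runs[q], qrels[q], k) for q in query_ids) / n,
        "mrr": sum(reciprocal_rank(runs[q], qrels[q]) for q in query_ids) / n,
    }


# Toy data, invented for illustration only.
runs = {"q1": ["d3", "d1", "d7"], "q2": ["d9", "d2", "d4"]}
qrels = {"q1": {"d1", "d5"}, "q2": {"d4"}}
report = evaluate(runs, qrels, k=3)
print(report)
```

A design that reports figures like these on a labeled query set from the real corpus can be compared against alternatives; one that does not is still untested.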
Embedding model tiers (pattern, not price list)
| Tier | Current-generation examples (verify before use) | When |
|---|---|---|
| Fast / self-hosted | all-MiniLM-L6-v2, bge-small | Cost-sensitive, small scale, real-time |
| Balanced open | all-mpnet-base-v2, bge-large, e5-large | Quality without API dependency |
| Quality API | text-embedding-3-large, voyage-3-large | Accuracy-priority general retrieval |
| Code | voyage-code-3, CodeBERT-family | Code search corpora |
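The "When" column can be read as a small decision procedure. The sketch below encodes it in Python so the choice is explicit and reviewable. The `Requirements` fields, the `pick_tier` function, and the precedence order (code corpus first, then cost or latency constraints, then accuracy with API access) are assumptions made for illustration; the table itself does not rank the criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    """Hypothetical requirement flags; field names are illustrative."""
    code_corpus: bool = False        # corpus is primarily source code
    cost_sensitive: bool = False     # tight budget or small scale
    real_time: bool = False          # strict latency target
    accuracy_priority: bool = False  # retrieval quality matters most
    api_allowed: bool = True         # an external embedding API is acceptable


def pick_tier(req: Requirements) -> str:
    """Return a tier name from the table. The precedence order is an assumption."""
    if req.code_corpus:
        return "Code"
    if req.cost_sensitive or req.real_time:
        return "Fast / self-hosted"
    if req.accuracy_priority and req.api_allowed:
        return "Quality API"
    # Default when quality matters but an API dependency is not wanted.
    return "Balanced open"


print(pick_tier(Requirements(accuracy_priority=True, api_allowed=False)))
```

The function returns a tier, not a model name, which matches the rule above: pick the tier from requirements, then name a current-generation candidate and verify it against the provider's live documentation and pricing.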