Knowledge Base Manager

Overview

Provides a structured methodology for selecting, designing, and governing knowledge bases. Covers architecture decisions (document-based vs entity-based vs hybrid), content curation, quality metrics, versioning strategies, and maintenance governance. Use when choosing a KB architecture, establishing curation workflows, or building governance processes for organizational knowledge.

When NOT to use: when static documentation suffices, when fewer than 50 FAQ items cover all questions, or when no maintenance resources are available. For implementing retrieval pipelines (chunking, embeddings, vector stores), use the rag-implementer skill. For implementing knowledge graphs (ontology, entity extraction, graph databases), use the knowledge-graph-builder skill.

Quick Reference

| Aspect | Options | Key Considerations |
| --- | --- | --- |
| Architecture | Document-based (RAG), Entity-based (Graph), Hybrid | Match to query patterns; start simple, add complexity when needed |
| Document-based | Vector DB (Pinecone, Weaviate, pgvector) | Best for docs, FAQs, manuals; semantic search; easy to add content |
| Entity-based | Graph DB (Neo4j, ArangoDB) | Best for org charts, catalogs, networks; relationship traversal |
| Hybrid | Both + linking layer | Enterprise, medical, legal; combined queries; highest complexity |
| When to skip KB | Static docs, <50 FAQ items | No maintenance resources, information never changes |
| Implementation | 6 phases | Audit, Curation, Storage, Quality, Versioning, Governance |
| Accuracy target | >90% on test questions | Create 100+ test questions with known correct answers |
| Coverage target | >80% questions answerable | Validate against real user queries continuously |
| Freshness target | <30 days average age | Automated freshness monitoring + scheduled updates |
| Consistency target | >95% conflict-free | Deduplication + single source of truth |
| Query latency | <100ms median | Caching and optimization for common access patterns |
| Storage tech | pgvector, Pinecone, Weaviate, Chroma | pgvector for existing Postgres; Pinecone for managed scale |
| Index types | HNSW, IVFFlat | HNSW for recall; IVFFlat for frequently rebuilt indexes |
| Ingestion pipeline | Load, clean, chunk, embed, store | Chunk at semantic boundaries; 512 tokens max; 10-15% overlap |
| Deduplication | Content hashing, semantic similarity | Hash for exact dupes; cosine similarity >0.95 for semantic dupes |
| Quality testing | Recall@K, MRR, accuracy sampling | 100+ test questions; measure recall@10 >0.8 and MRR >0.7 |
| Drift detection | Embedding distribution monitoring | Track mean shift; alert when >0.1 threshold |
| Versioning | Snapshot, Event-sourced, Git-style | Snapshot for simple; event-sourced for audit; git-style for teams |
| Maintenance | Daily, Weekly, Monthly, Quarterly | Establish schedule from day 1; monitor errors and user feedback |
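
The quality-testing targets (recall@10 >0.8 and MRR >0.7) can be checked with a small evaluation helper. The following is a minimal sketch, not a prescribed implementation: the function names, the dictionary shapes for `results` and `gold`, and the use of string entry IDs are all assumptions made for illustration. It expects ranked result IDs per test question and a set of known-correct IDs per question.

```python
from typing import Dict, List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the relevant entry IDs that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved: List[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, entry_id in enumerate(retrieved, start=1):
        if entry_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(
    results: Dict[str, List[str]],  # hypothetical shape: question ID -> ranked entry IDs
    gold: Dict[str, Set[str]],      # hypothetical shape: question ID -> correct entry IDs
    k: int = 10,
) -> Dict[str, float]:
    """Average recall@k and MRR over every question in the gold set."""
    if not gold:
        raise ValueError("gold set must contain at least one test question")
    recalls = []
    ranks = []
    for question_id, relevant in gold.items():
        retrieved = results.get(question_id, [])  # missing question counts as a miss
        recalls.append(recall_at_k(retrieved, relevant, k))
        ranks.append(reciprocal_rank(retrieved, relevant))
    return {
        "recall_at_k": sum(recalls) / len(recalls),
        "mrr": sum(ranks) / len(ranks),
    }


def meets_targets(
    metrics: Dict[str, float], min_recall: float = 0.8, min_mrr: float = 0.7
) -> bool:
    """Compare metrics against the thresholds listed in the quick reference."""
    return metrics["recall_at_k"] > min_recall and metrics["mrr"] > min_mrr
```

Running `evaluate` over the 100+ test questions after each content update, and gating releases on `meets_targets`, is one way to make the thresholds enforceable rather than aspirational.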

Common Mistakes

| Mistake | Correct Pattern |
| --- | --- |
| Ingesting raw data without curation or normalization | Curate, clean, and deduplicate before ingesting; quality over quantity |
| Skipping version control for KB content | Implement versioning from day one with rollback and audit trail |
| Building a KB without validating against user questions | Start with user research and test against real queries for >90% accuracy |
| Choosing hybrid architecture when document-based suffices | Match architecture to actual query patterns; start simple, add complexity when needed |
| Launching without freshness monitoring or update schedules | Set up automated freshness checks and scheduled content reviews |
| No provenance tracking on knowledge entries | Always track source URL, timestamp, author, and confidence score |
| Duplicate information across sources | Establish single source of truth; merge similar entries with conflict resolution rules |
| Perfectionism delaying launch | Launch at 80% coverage and iterate based on real usage data |
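
The provenance, deduplication, and freshness patterns above can be sketched together. This is an illustrative sketch under stated assumptions: the `KBEntry` record, its field names, and the "keep the highest-confidence copy" conflict rule are hypothetical choices, not something the skill prescribes. Hashing is applied to whitespace- and case-normalized text, so it only catches exact duplicates; semantic duplicates would still need a similarity check.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass(frozen=True)
class KBEntry:
    """Hypothetical knowledge entry carrying the provenance fields listed above."""
    text: str
    source_url: str
    author: str
    fetched_at: datetime  # timestamp of the last verified update
    confidence: float     # 0.0 to 1.0, assigned during curation


def content_hash(text: str) -> str:
    """SHA-256 of the text after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(entries: List[KBEntry]) -> List[KBEntry]:
    """Keep one entry per content hash.

    Assumed conflict rule: the copy with the highest confidence wins.
    """
    best: Dict[str, KBEntry] = {}
    for entry in entries:
        key = content_hash(entry.text)
        current = best.get(key)
        if current is None or entry.confidence > current.confidence:
            best[key] = entry
    return list(best.values())


def average_age_days(entries: List[KBEntry], now: datetime) -> float:
    """Mean age in days, for comparison against the <30 day freshness target."""
    if not entries:
        return 0.0
    ages = [(now - e.fetched_at).total_seconds() / 86400 for e in entries]
    return sum(ages) / len(ages)
```

A scheduled job could call `average_age_days` on the full entry set and raise an alert once the value reaches 30, while `deduplicate` runs as an ingestion step before anything is written to storage.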

Delegation

  • Audit existing knowledge sources and classify content types: Use Explore agent to inventory documents, assess quality, and identify gaps
  • Implement end-to-end KB pipeline with storage and retrieval: Use Task agent to deploy database, configure search, and run quality checks
  • Design KB architecture and governance model: Use Plan agent to select between document-based, entity-based, or hybrid approaches

For implementing document retrieval pipelines (chunking, embeddings, vector stores, hybrid search), use the rag-implementer skill. For implementing knowledge graphs (ontology design, entity extraction, graph databases), use the knowledge-graph-builder skill.
