Knowledge Base Ingestion Skill
You have the ability to add content to the vector knowledge base. This allows you to build a searchable repository of information that can be retrieved later using semantic search.
Capabilities
- Text Ingestion: Add plain text content directly to the knowledge base. The text is automatically chunked and embedded for semantic search.
- URL Ingestion: Fetch content from a URL and add it to the knowledge base. Useful for adding web pages, documentation, or online resources.
- Document Management: Delete documents that are no longer needed, or get information about existing documents.
When to Use This Skill
Use knowledge base ingestion when:
- The user wants to add information for later retrieval
- Building a custom knowledge repository
- Adding reference materials, documentation, or notes
- The user shares content they want to "remember" or store
Ingestion Guidelines
- Use Descriptive Titles: Choose titles that will help identify the content later. Good titles make it easier to find documents.
- Organize with Collections: Group related documents into collections for better organization:
  - technical_docs - Technical documentation
  - meeting_notes - Meeting summaries
  - research - Research materials
  - policies - Company policies
- Chunk Size: Documents are automatically split into smaller chunks for better search relevance. You don't need to worry about document size.
- Confirm Success: After ingestion, confirm the document was added successfully by reporting the document ID and chunk count.
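The chunking strategy itself is not described here. As a rough illustration only, the sketch below assumes fixed-size character windows with a small overlap; the function name chunk_text and both default sizes are hypothetical, and the real pipeline may split on tokens, sentences, or something else entirely.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list:
    """Split text into overlapping character windows (illustrative assumption,
    not the knowledge base's actual chunker)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


chunks = chunk_text("a" * 500)
print(len(chunks), [len(c) for c in chunks])
```

Overlap is a common choice because a sentence cut at a window boundary still appears whole in one of the neighboring chunks.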
Example Workflows
Adding Text Content
- User provides text to remember
- Call kb_ingest_text with the text and a descriptive title
- Optionally specify a collection
- Confirm success with document details
Adding Web Content
- User provides a URL
- Call kb_ingest_url with the URL
- Content is fetched, processed, and stored
- Confirm success with document details
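kb_ingest_url handles fetching and processing itself, and its internals are not described here. To give a feel for what "fetched, processed" might involve, the sketch below checks that a URL looks fetchable and reduces an HTML page to plain text using only the Python standard library. The helper names is_ingestable_url and html_to_text are hypothetical, and no network request is made.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # depth inside script/style elements

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str) -> str:
    """Reduce an HTML document to space-joined visible text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def is_ingestable_url(url: str) -> bool:
    """Accept only absolute http(s) URLs that include a host."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


page = "<html><body><h1>Setup</h1><p>Run the <b>installer</b>.</p></body></html>"
print(is_ingestable_url("https://example.com/docs"), html_to_text(page))
```

Checking the URL before calling the tool lets you ask the user for a corrected link instead of reporting a failed ingestion.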
Removing Content
- User requests document removal
- Use kb_list_documents (from KB Search skill) to find the document
- Call kb_delete_document with the document ID
- Confirm deletion
Important Notes
- Ingested content is stored in a vector database for semantic search
- Content is automatically chunked into smaller pieces for better retrieval
- Embeddings are generated using OpenAI's text-embedding model
- Deleted documents cannot be recovered