Knowledge Base Ingestion Skill

You can add content to the vector knowledge base. This lets you build a searchable repository of information that can be retrieved later through semantic search.

Capabilities

  1. Text Ingestion: Add plain text content directly to the knowledge base. The text is automatically chunked and embedded for semantic search.

  2. URL Ingestion: Fetch content from a URL and add it to the knowledge base. Useful for adding web pages, documentation, or online resources.

  3. Document Management: Delete documents that are no longer needed, or get information about existing documents.
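
The automatic chunking mentioned under Text Ingestion can be pictured with a minimal sketch. The function name `chunk_text`, the character-based window, and the `chunk_size` and `overlap` defaults are all assumptions for illustration; the skill does not document how the knowledge base actually splits text.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap by `overlap`.

    Hypothetical helper: the real knowledge base may split on tokens,
    sentences, or something else entirely.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for index, chunk in enumerate(chunk_text("abcdefghij", chunk_size=4, overlap=1)):
        print(index, chunk)
```

Overlapping windows are one common way to avoid cutting a sentence's context in half at a chunk boundary; whether this knowledge base uses overlap is not stated.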

When to Use This Skill

Use knowledge base ingestion when:

  • The user wants to add information for later retrieval
  • The user is building a custom knowledge repository
  • The user wants to add reference materials, documentation, or notes
  • The user shares content they want to "remember" or store

Ingestion Guidelines

  1. Use Descriptive Titles: Choose titles that identify the content at a glance, so documents are easier to find later.

  2. Organize with Collections: Group related documents into collections for better organization:

    • technical_docs - Technical documentation
    • meeting_notes - Meeting summaries
    • research - Research materials
    • policies - Company policies

  3. Chunk Size: Documents are automatically split into smaller chunks for better search relevance, so you do not need to split large documents yourself.

  4. Confirm Success: After ingestion, confirm the document was added successfully by reporting the document ID and chunk count.
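
The "Confirm Success" guideline could look roughly like the sketch below. The `IngestResult` shape and its field names (`document_id`, `title`, `chunk_count`, `collection`) are assumptions; the skill only says that a document ID and chunk count should be reported, not what the tool's return value looks like.

```python
from dataclasses import dataclass


@dataclass
class IngestResult:
    # Hypothetical result shape; the real tool output may use other field names.
    document_id: str
    title: str
    chunk_count: int
    collection: str = "default"


def format_confirmation(result: IngestResult) -> str:
    """Build a one-line confirmation that reports the document ID and chunk count."""
    noun = "chunk" if result.chunk_count == 1 else "chunks"
    return (
        f'Added "{result.title}" to collection "{result.collection}" '
        f"(document ID: {result.document_id}, {result.chunk_count} {noun})."
    )


if __name__ == "__main__":
    print(format_confirmation(IngestResult("doc-1", "Notes", 3, "meeting_notes")))
```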

Example Workflows

Adding Text Content

  1. User provides text to remember
  2. Call kb_ingest_text with the text and a descriptive title
  3. Optionally specify a collection
  4. Confirm success with document details
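
The four steps above can be sketched as follows. Only the tool name kb_ingest_text comes from this skill; `fake_kb_ingest_text` is an in-memory stand-in, and its parameters, return fields, and the 500-character chunk estimate are assumptions made so the example runs on its own.

```python
import uuid
from typing import Callable, Optional


def fake_kb_ingest_text(text: str, title: str, collection: str = "default") -> dict:
    """In-memory stand-in for kb_ingest_text (signature and result are assumed)."""
    # Pretend the store cuts text into 500-character chunks: ceil(len / 500).
    chunk_count = max(1, -(-len(text) // 500))
    return {
        "document_id": uuid.uuid4().hex,
        "title": title,
        "collection": collection,
        "chunk_count": chunk_count,
    }


def add_text(
    text: str,
    title: str,
    collection: Optional[str] = None,
    ingest: Callable[..., dict] = fake_kb_ingest_text,
) -> str:
    """Validate input, ingest it, and return a confirmation message."""
    if not text.strip():
        raise ValueError("text must not be empty")
    if not title.strip():
        raise ValueError("title must not be empty")

    # Step 3: the collection is optional, so pass it only when given.
    kwargs = {"collection": collection} if collection else {}
    result = ingest(text, title, **kwargs)

    # Step 4: confirm success with the document details.
    return (
        f"Stored '{result['title']}' in '{result['collection']}' as "
        f"{result['document_id']} ({result['chunk_count']} chunk(s))."
    )
```

Passing the ingest function as a parameter keeps the workflow testable without a live knowledge base.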

Adding Web Content

  1. User provides a URL
  2. Call kb_ingest_url with the URL
  3. Content is fetched, processed, and stored
  4. Confirm success with document details
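
As a rough picture of the "fetched, processed" step, the sketch below checks that a URL looks fetchable and reduces an HTML page to plain text using only the Python standard library. The helper names `is_fetchable_url` and `html_to_text` are hypothetical, and the skill does not say how kb_ingest_url actually processes pages; the network fetch itself is left out so the example runs offline.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def is_fetchable_url(url: str) -> bool:
    """Accept only absolute http(s) URLs with a host."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def html_to_text(html: str) -> str:
    """Reduce an HTML document to its visible text, joined by single spaces."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)
```

Checking the URL before calling the tool gives the user a clear error for malformed input instead of a failed fetch.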

Removing Content

  1. User requests document removal
  2. Use kb_list_documents (from the KB Search skill) to find the document and its ID
  3. Call kb_delete_document with the document ID
  4. Confirm deletion
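
The removal steps above might be sketched like this. `FakeKnowledgeBase` is an in-memory stand-in for kb_list_documents and kb_delete_document; the method signatures, the returned fields, and title-based matching are assumptions, since the skill names the tools but does not document their inputs or outputs.

```python
class FakeKnowledgeBase:
    """In-memory stand-in for the listing and deletion tools (assumed shapes)."""

    def __init__(self, docs: dict[str, str]) -> None:
        self._docs = dict(docs)  # document_id -> title

    def list_documents(self) -> list[dict]:
        return [{"document_id": i, "title": t} for i, t in self._docs.items()]

    def delete_document(self, document_id: str) -> bool:
        return self._docs.pop(document_id, None) is not None


def remove_by_title(kb: FakeKnowledgeBase, title: str) -> str:
    """Find a document by title, delete it, and return a confirmation message."""
    matches = [d for d in kb.list_documents() if d["title"].lower() == title.lower()]
    if not matches:
        return f'No document titled "{title}" was found.'
    if len(matches) > 1:
        # Refuse to guess when several documents share the title.
        ids = ", ".join(d["document_id"] for d in matches)
        return f'Multiple documents match "{title}": {ids}. Specify a document ID.'

    doc_id = matches[0]["document_id"]
    if kb.delete_document(doc_id):
        return f'Deleted "{matches[0]["title"]}" (document ID: {doc_id}).'
    return f"Deletion of {doc_id} failed."
```

Because deleted documents cannot be recovered, the sketch declines to delete anything when the title is ambiguous and asks for an explicit document ID instead.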

Important Notes

  • Ingested content is stored in a vector database for semantic search
  • Content is automatically chunked into smaller pieces for better retrieval
  • Embeddings are generated using OpenAI's text-embedding model
  • Deleted documents cannot be recovered
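
To illustrate what semantic search over stored chunks means, here is a toy ranking by cosine similarity. The two-dimensional vectors are hand-written placeholders rather than real embeddings, and the use of cosine similarity is an assumption: the skill does not say which vector database or distance metric sits behind the knowledge base.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], chunks: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the texts of the k chunks whose vectors are closest to the query."""
    ranked = sorted(
        chunks,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


if __name__ == "__main__":
    # Placeholder vectors standing in for embeddings of three stored chunks.
    stored = [("cats", [1.0, 0.0]), ("dogs", [0.0, 1.0]), ("pets", [1.0, 1.0])]
    print(top_k([1.0, 0.1], stored, k=2))
```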