# Haystack

This skill covers building intelligent search and retrieval systems with Haystack, combining document processing, semantic search, and LLM integration.
## Overview
This skill provides guidance for working with Haystack 2.0+, a modern framework for building production-grade RAG and search applications. Haystack enables building systems that combine:
- Document processing and indexing
- Semantic and hybrid search capabilities
- LLM integration for question-answering and reasoning
- Multi-stage pipelines for complex workflows
- Vector and keyword-based retrieval strategies
## When to use
Use this skill when the user is working on:
- Building RAG Systems: Creating retrieval-augmented generation pipelines that combine document search with LLM reasoning
- Document Search & Indexing: Implementing semantic or hybrid search over document collections
- Question-Answering Systems: Building QA systems that retrieve relevant context and answer questions
- LLM Integration: Connecting language models with retrieval systems for grounded responses
- Pipeline Development: Creating multi-stage processing workflows with Haystack components
- Document Processing: Preparing, chunking, and indexing documents for retrieval
- Vector Store Setup: Configuring and managing document embeddings and vector databases
## Core Concepts

### Key Haystack Components
- DocumentStore: Storage backends for documents (ElasticsearchDocumentStore, InMemoryDocumentStore, WeaviateDocumentStore, etc.)
- Retrievers: Components that fetch relevant documents (InMemoryBM25Retriever, InMemoryEmbeddingRetriever, and their backend-specific counterparts). There is no single HybridRetriever class; hybrid retrieval is built by combining a keyword and an embedding retriever with a DocumentJoiner
- Pipeline: Graph-based orchestration of components (Haystack 2.x pipelines may also contain loops)
- Generators: LLM-powered components that produce responses (e.g. OpenAIGenerator, OpenAIChatGenerator)
- Preprocessors: Components for text cleaning and chunking (DocumentCleaner, DocumentSplitter)
### Architecture Patterns

1. **Simple RAG Pipeline**: Query → Retriever → LLM Generator → Answer
2. **Hybrid Search Pipeline**: Query → BM25 Retriever + Embedding Retriever (run in parallel) → Document Joiner → LLM Generator → Answer
3. **Multi-Stage Retrieval**: Query → Dense Retriever → Reranker → LLM Generator → Answer
## Setup & Installation

### Prerequisites

- Python 3.9+
- pip or conda package manager
- LLM API access (OpenAI, Hugging Face, Cohere, etc.) for generation tasks

### Basic Installation

```shell
pip install haystack-ai
```

### Optional Dependencies by Use Case

```shell
# For local embedding models (Sentence Transformers)
pip install sentence-transformers
```
## Common Usage Patterns

### Pattern 1: Setting Up a Document Store
- Initialize the appropriate DocumentStore type
- Configure indexing settings and metadata fields
- Prepare documents with required structure
- Handle document chunking and preprocessing
### Pattern 2: Building a Retriever
- Choose retriever type (BM25, Embedding, Hybrid)
- Configure similarity metrics and thresholds
- Set up embedding model if using dense retrieval
- Optimize for recall vs latency trade-offs
### Pattern 3: Creating a Pipeline
- Define component connections
- Handle component inputs/outputs
- Add error handling and validation
- Implement caching for performance
### Pattern 4: Integrating with LLMs
- Choose LLM provider and model
- Configure prompt templates for context injection
- Handle token limits and chunking
- Implement fallback strategies
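The token-limit bullet can be sketched with a hypothetical helper that trims retrieved context to a budget before prompting. It uses a crude whitespace "token" count; swap in tiktoken or the model's own tokenizer for real limits:

```python
def fit_context(docs: list[str], max_tokens: int) -> list[str]:
    """Keep retrieved chunks, in rank order, until the token budget is spent."""
    kept, used = [], 0
    for text in docs:
        cost = len(text.split())  # stand-in for a real tokenizer count
        if used + cost > max_tokens:
            break
        kept.append(text)
        used += cost
    return kept

chunks = ["alpha beta gamma", "delta epsilon", "zeta eta theta iota"]
print(fit_context(chunks, max_tokens=5))  # ['alpha beta gamma', 'delta epsilon']
```

Stopping at the first chunk that overflows (rather than skipping it) preserves rank order, which usually matters more than squeezing in every chunk.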
## Best Practices

### Document Preparation
- Split documents into appropriate chunk sizes (typically 100-500 tokens)
- Preserve metadata for filtering and ranking
- Normalize text format and encoding
- Handle special characters and multi-language content
### Retrieval Optimization
- Use hybrid retrieval (BM25 + semantic) for better coverage
- Implement result deduplication
- Consider reranking for top-k results
- Monitor and tune retrieval parameters
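The merge-and-deduplicate step of hybrid retrieval is usually reciprocal rank fusion (the idea behind Haystack's DocumentJoiner with `join_mode="reciprocal_rank_fusion"`). A minimal standalone version over document IDs:

```python
def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists: each appearance at rank r adds 1/(k + r + 1)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    # Sorting the dict keys also deduplicates: each id appears once in `scores`
    return sorted(scores, key=scores.get, reverse=True)

bm25 = ["d1", "d2", "d3"]
dense = ["d3", "d1", "d4"]
print(rrf_merge([bm25, dense]))  # ['d1', 'd3', 'd2', 'd4']
```

Documents found by both retrievers ("d1", "d3") accumulate score from each list and float to the top, which is exactly the coverage benefit of hybrid search.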
### LLM Integration
- Use system prompts to guide answer generation
- Include relevant metadata in prompts
- Implement result validation
- Handle empty or low-confidence results gracefully
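Graceful handling of empty or low-confidence retrieval can be as simple as a hypothetical guard that refuses to call the LLM when nothing clears a score threshold, instead of letting the model hallucinate:

```python
FALLBACK = "I couldn't find relevant documents for that question."

def answer_or_fallback(docs, generate, min_score: float = 0.2) -> str:
    """Call `generate` only when at least one document clears the threshold."""
    relevant = [d for d in docs if d.get("score", 0.0) >= min_score]
    if not relevant:
        return FALLBACK
    return generate(relevant)

docs = [{"content": "off-topic chunk", "score": 0.05}]
print(answer_or_fallback(docs, generate=lambda d: "LLM answer"))
# Falls back: no document clears the score threshold
```

The threshold value is an assumption to tune per retriever; BM25 and cosine-similarity scores live on very different scales.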
### Performance & Scaling
- Profile and optimize hot paths
- Use batch processing for document indexing
- Cache embeddings appropriately
- Monitor resource usage in production
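Embedding caching, for example, can be a content-hash lookup so re-indexed documents are not re-embedded; `embed` here is a stand-in for a real embedder call:

```python
import hashlib

class EmbeddingCache:
    """Memoize embeddings by content hash; identical text is embedded once."""

    def __init__(self, embed):
        self._embed = embed
        self._cache: dict[str, list[float]] = {}

    def get(self, text: str) -> list[float]:
        key = hashlib.sha256(text.encode()).hexdigest()
        if key not in self._cache:
            self._cache[key] = self._embed(text)
        return self._cache[key]

calls = []
def fake_embed(text):           # stand-in for an API or model call
    calls.append(text)
    return [float(len(text))]

cache = EmbeddingCache(fake_embed)
cache.get("hello")
cache.get("hello")
print(len(calls))  # 1 — the second lookup hits the cache
```

In production the dict would typically be replaced by a persistent key-value store so the cache survives restarts.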
## Troubleshooting Common Issues

### Retrieval Not Finding Relevant Documents
- Verify document preprocessing and chunk sizes
- Check embedding model dimensions and compatibility
- Consider hybrid search approach
- Review metadata filtering logic
### Low-Quality Answers
- Improve retrieved context relevance
- Adjust prompt templates
- Implement reranking strategies
- Use better embedding models
### Performance Issues

- Split indexing and query workloads into separate pipelines
- Use async processing where possible
- Optimize document store queries
- Consider batch processing
## Example Workflows

### Workflow 1: Building a Simple Document QA System
- Prepare and index documents
- Create retriever for document search
- Connect retriever to LLM generator
- Test with example questions
### Workflow 2: Implementing Hybrid Search
- Set up BM25 component for keyword search
- Configure embedding model and dense retriever
- Combine results using document merger
- Add ranking/reranking if needed
### Workflow 3: Production RAG System
- Design document chunking strategy
- Set up vector database (Elasticsearch, Weaviate, etc.)
- Implement multi-stage retrieval
- Add evaluation and monitoring
## Key Integration Points
- LLM Providers: OpenAI, HuggingFace, Cohere, Ollama
- Vector Databases: Elasticsearch, Weaviate, Pinecone, FAISS
- Document Stores: In-memory, Elasticsearch, Qdrant
- Embedding Models: Sentence Transformers, OpenAI embeddings
- Evaluation Tools: Haystack's evaluation suite, Ragas
## Resources
- Official Documentation: https://docs.haystack.deepset.ai/
- GitHub Repository: https://github.com/deepset-ai/haystack
- Examples: https://github.com/deepset-ai/haystack-cookbook/
- Skill Repository: srini047/haystack-skills