Knowledge Retrieval (RAG)
Retrieval-Augmented Generation (RAG) connects an LLM to your data. Because a model's training data has a cutoff date and does not include your private documents, RAG first searches a knowledge base for information relevant to the query and then inserts it into the prompt context. This grounds the answer in retrieved facts and reduces hallucinations.
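The "search" step typically works by embedding the query and each document as vectors, then ranking documents by similarity. A minimal sketch of that scoring, assuming precomputed embeddings (the 3-dimensional vectors below are toy values, not real model embeddings):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings"; a real system would get these from an embedding model.
docs = {
    "vpn_reset_guide": [0.9, 0.1, 0.0],
    "holiday_policy":  [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]  # toy embedding of "How do I reset my VPN password?"

# Rank documents by similarity to the query; the top hits go into the prompt.
ranked = sorted(docs, key=lambda name: cosine_similarity(query_vec, docs[name]),
                reverse=True)
print(ranked[0])  # vpn_reset_guide
```

In production this ranking is delegated to a vector database, which indexes the embeddings so the top-k search stays fast at scale.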
When to Use
- Domain-Specific QA: Answering questions about internal documentation, legal contracts, or medical records.
- Dynamic Data: When the information changes frequently (news, stock analysis).
- Verifiability: When the answer must cite sources ("According to policy document A...").
- Cost: To avoid fine-tuning models on new data, which is expensive and slow.
Use Cases
- Enterprise Search: "How do I reset my VPN password?" (Searches IT Wiki).
- Legal Analysis: "Summarize the liability clause in this contract." (Searches contract PDF).
- Customer Support: "What is my order status?" (Searches Order Database).
Implementation Pattern
```python
def rag_workflow(user_query):
    # Step 1: Retrieval
    # Convert the query to a vector and search the vector DB
    relevant_docs = vector_db.similarity_search(user_query, k=3)

    # Step 2: Prompt Construction
    # Combine the retrieved context with the user question
    context_text = "\n".join(doc.content for doc in relevant_docs)
    prompt = f"""
You are a helpful assistant. Answer the question based ONLY on the context below.

Context:
{context_text}

Question: {user_query}
"""

    # Step 3: Generation
    answer = llm.generate(prompt)
    return answer
```
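To try the pattern end-to-end without external services, the `vector_db` and `llm` dependencies can be stubbed out. A minimal sketch, assuming a keyword-overlap retriever stands in for real vector search and a pass-through callable stands in for the LLM (all names below are illustrative, not a specific library's API):

```python
class ToyVectorDB:
    """Stand-in for a vector store: ranks docs by word overlap with the query."""
    def __init__(self, docs):
        self.docs = docs

    def similarity_search(self, query, k=3):
        words = set(query.lower().split())
        scored = sorted(self.docs,
                        key=lambda d: len(words & set(d.lower().split())),
                        reverse=True)
        return scored[:k]

def rag_workflow(user_query, vector_db, llm):
    # Step 1: Retrieval -- fetch the k most relevant documents.
    relevant_docs = vector_db.similarity_search(user_query, k=2)
    # Step 2: Prompt construction -- ground the model in retrieved context.
    context_text = "\n".join(relevant_docs)
    prompt = (
        "You are a helpful assistant. "
        "Answer the question based ONLY on the context below.\n"
        f"Context:\n{context_text}\n"
        f"Question: {user_query}\n"
    )
    # Step 3: Generation -- the stub below just returns the prompt itself.
    return llm(prompt)

db = ToyVectorDB([
    "To reset your VPN password, open the IT portal and click Reset.",
    "The holiday policy grants 25 days of annual leave.",
])
answer = rag_workflow("How do I reset my VPN password?", db, llm=lambda p: p)
print("VPN" in answer)  # True -- the VPN doc was retrieved into the prompt
```

Swapping `ToyVectorDB` for a real vector store and `llm` for an actual model client keeps the same three-step structure; only the dependencies change.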