valyu-best-practices
Valyu Best Practices
This skill provides instructions for using the Valyu API to perform search, content extraction, AI-powered answers, and deep research tasks.
Quick Reference: Choosing the Right API
Use this decision tree to select the appropriate Valyu API:
What do you need?
├─ Find information across multiple sources
│ └─ Use Search API
│
├─ Extract content from specific URLs
│ └─ Use Contents API
│
├─ Get an AI-synthesized answer with citations
│ └─ Use Answer API
│
├─ Generate a comprehensive research report
│ └─ Use DeepResearch API
│
└─ Discover available data sources
└─ Use Datasources API
⚠️ MANDATORY: Use Official Valyu SDK Libraries
CRITICAL: When writing code that uses the Valyu API, you MUST use the official SDK libraries. NEVER make raw HTTP/fetch calls to the Valyu API endpoints.
JavaScript/TypeScript: valyu-js
npm install valyu-js
# or
pnpm add valyu-js
import { Valyu } from 'valyu-js';
const valyu = new Valyu(process.env.VALYU_API_KEY);
// Now use valyu.search(), valyu.contents(), valyu.answer(), valyu.deepResearch
Python: valyu
pip install valyu
# or
uv add valyu
from valyu import Valyu
valyu = Valyu(api_key=os.environ.get("VALYU_API_KEY"))
# Now use valyu.search(), valyu.contents(), valyu.answer(), valyu.deep_research
Why SDK Over Raw API Calls?
- Type safety - Full TypeScript/Python type hints for all parameters and responses
- Automatic retries - Built-in retry logic for transient failures
- Streaming support - Proper async iterator support for streaming responses
- Error handling - Structured error types with helpful messages
- Future compatibility - SDK updates handle API changes automatically
❌ NEVER Do This
// DON'T make raw fetch calls
const response = await fetch('https://api.valyu.ai/v1/search', {
method: 'POST',
headers: {
'x-api-key': apiKey,
'Content-Type': 'application/json'
},
body: JSON.stringify({ query: '...' })
});
✅ Always Do This
// DO use the SDK
import { Valyu } from 'valyu-js';
const valyu = new Valyu(process.env.VALYU_API_KEY);
const response = await valyu.search({ query: '...' });
1. Search API
Purpose: Find information across web, academic, medical, transportation, financial, news, and proprietary sources.
When to Use
- Finding recent information on any topic
- Academic research (arXiv, PubMed, bioRxiv, medRxiv)
- Financial data (SEC filings, earnings reports, stock data)
- News monitoring and current events
- Healthcare data (clinical trials, drug labels)
- Prediction markets (Polymarket, Kalshi)
- Transportation (UK National Rail, Global Shipping)
Basic Usage
const response = await valyu.search({
query: "transformer architecture attention mechanism 2024",
searchType: "all",
maxNumResults: 10
});
Search Types
| Type | Use For |
|---|---|
all |
Everything - web, academic, financial, proprietary |
web |
General internet content only |
proprietary |
Licensed academic papers and research |
news |
News articles and current events |
Key Parameters
| Parameter (TS/JS) | Parameter (Python) | Purpose | Example |
|---|---|---|---|
query |
query |
Search query (under 400 chars) | "CRISPR gene editing 2024" |
searchType |
search_type |
Source scope | "all", "web", "proprietary", "news" |
maxNumResults |
max_num_results |
Number of results (1-20) | 10 |
includedSources |
included_sources |
Limit to specific sources | ["valyu/valyu-arxiv", "valyu/valyu-pubmed"] |
startDate / endDate |
start_date / end_date |
Date filtering | "2024-01-01" |
relevanceThreshold |
relevance_threshold |
Minimum relevance (0-1) | 0.7 |
Domain-Specific Search Patterns
Academic Research:
await valyu.search({
query: "CRISPR therapeutic applications clinical trials",
searchType: "proprietary",
includedSources: ["valyu/valyu-arxiv", "valyu/valyu-pubmed", "valyu/valyu-biorxiv"],
startDate: "2024-01-01"
});
Financial Analysis:
await valyu.search({
query: "Apple revenue Q4 2024 earnings",
searchType: "all",
includedSources: ["valyu/valyu-sec-filings", "valyu/valyu-earnings-US"]
});
News Monitoring:
await valyu.search({
query: "AI regulation EU",
searchType: "news",
startDate: "2024-06-01",
countryCode: "EU"
});
Search Recipes
For detailed patterns, see:
2. Contents API
Purpose: Extract clean, structured content from web pages optimized for LLM processing.
When to Use
- Converting web pages to clean markdown
- Extracting article text for summarization
- Parsing documentation for RAG systems
- Structured data extraction from product pages
- Processing academic papers
Basic Usage
const response = await valyu.contents({
urls: ["https://example.com/article"]
});
With Summarization
const response = await valyu.contents({
urls: ["https://arxiv.org/abs/2401.12345"],
summary: "Extract key findings in 3 bullet points"
});
Structured Extraction (JSON Schema)
const response = await valyu.contents({
urls: ["https://example.com/product"],
summary: {
type: "object",
properties: {
product_name: { type: "string" },
price: { type: "number" },
features: { type: "array", items: { type: "string" } }
},
required: ["product_name", "price"]
}
});
Key Parameters
| Parameter (TS/JS) | Parameter (Python) | Purpose | Example |
|---|---|---|---|
urls |
urls |
URLs to process (1-10) | ["https://example.com"] |
responseLength |
response_length |
Content length | "short", "medium", "large", "max" |
extractEffort |
extract_effort |
Extraction quality | "normal", "high", "auto" |
summary |
summary |
AI summarization | true, "instructions", or JSON schema |
screenshot |
screenshot |
Capture screenshots | true |
Content Recipes
For detailed patterns, see:
3. Answer API
Purpose: Get AI-powered answers grounded in real-time search results with citations.
When to Use
- Questions requiring current information synthesis
- Multi-source fact verification
- Technical documentation questions
- Research requiring cited sources
- Structured data extraction from search results
Basic Usage
const response = await valyu.answer({
query: "What are the latest developments in quantum computing?"
});
With Fast Mode (Lower Latency)
const response = await valyu.answer({
query: "Current Bitcoin price and 24h change",
fastMode: true
});
With Custom Instructions
const response = await valyu.answer({
query: "Compare React and Vue for enterprise applications",
systemInstructions: "Provide a balanced comparison with pros and cons. Format as a comparison table."
});
With Streaming
const stream = await valyu.answer({
query: "Explain transformer architecture",
streaming: true
});
for await (const chunk of stream) {
// Handle: search_results, content, metadata, done, error
console.log(chunk);
}
Structured Output
const response = await valyu.answer({
query: "Apple Q4 2024 financial highlights",
structuredOutput: {
type: "object",
properties: {
revenue: { type: "string" },
growthRate: { type: "string" },
keyHighlights: { type: "array", items: { type: "string" } }
}
}
});
Key Parameters
| Parameter (TS/JS) | Parameter (Python) | Purpose | Example |
|---|---|---|---|
query |
query |
Question to answer | "What is quantum computing?" |
fastMode |
fast_mode |
Lower latency | true |
systemInstructions |
system_instructions |
AI directives | "Be concise" |
structuredOutput |
structured_output |
JSON schema | {type: "object", ...} |
streaming |
streaming |
Enable SSE streaming | true |
dataMaxPrice |
data_max_price |
Dollar limit | 1.0 |
Answer Recipes
For detailed patterns, see:
4. DeepResearch API
Purpose: Generate comprehensive research reports with detailed analysis and citations.
When to Use
- Comprehensive market analysis
- Literature reviews
- Competitive intelligence
- Technical deep dives
- Topics requiring multi-source synthesis
Research Modes
| Mode | Duration | Best For |
|---|---|---|
fast |
~5 minutes | Quick lookups, simple questions |
standard |
~10-20 minutes | Balanced research (most common) |
heavy |
~90 minutes | Comprehensive analysis, complex topics |
Create Research Task
const task = await valyu.deepResearch.create({
query: "AI chip market competitive landscape 2024",
model: "standard"
});
// Returns: { deepresearch_id: "abc123", status: "queued" }
Poll for Completion
const status = await valyu.deepResearch.getStatus(task.deepresearch_id);
// status: "queued" | "running" | "completed" | "failed" | "cancelled"
if (status.status === "completed") {
console.log(status.output); // Markdown report
console.log(status.sources); // Cited sources
console.log(status.pdf_url); // PDF download link
}
Key Parameters
| Parameter (TS/JS) | Parameter (Python) | Purpose | Example |
|---|---|---|---|
query |
query |
Research question | "AI market trends 2024" |
model |
model |
Research depth | "fast", "standard", "heavy" |
outputFormat |
output_format |
Report format | "markdown", "pdf" |
includedSources |
included_sources |
Source filtering | ["valyu/valyu-arxiv", "techcrunch.com"] |
startDate / endDate |
start_date / end_date |
Date range | "2024-01-01" |
DeepResearch Recipes
For detailed patterns, see:
5. Query Writing Best Practices
Core Principles
- Be specific - Use domain terminology
- Be concise - Keep queries under 400 characters
- Be focused - One topic per query
- Add constraints - Include timeframes, source types
Query Anatomy
| Element | Description | Example |
|---|---|---|
| Intent | What you need | "latest advancements" vs "overview" |
| Domain | Topic terminology | "transformer architecture" |
| Constraints | Filters | "2024", "peer-reviewed" |
| Source type | Where to look | academic papers, SEC filings |
Good vs Bad Queries
BAD: "I want to know about AI"
GOOD: "transformer attention mechanism survey 2024"
BAD: "Apple financial information"
GOOD: "Apple revenue growth Q4 2024 earnings SEC filing"
BAD: "gene editing research"
GOOD: "CRISPR off-target effects therapeutic applications 2024"
Split Complex Requests
# Don't do this
"Tesla stock performance, new products, and Elon Musk statements"
# Do this instead
Query 1: "Tesla stock performance Q4 2024"
Query 2: "Tesla Cybertruck production updates 2024"
Query 3: "Tesla FSD autonomous driving progress"
Source Filtering
Use includedSources for domain authority:
Financial Research Collection. Some sources to include:
valyu/valyu-sec-filings- SEC regulatory filingsvalyu/valyu-stocks- Stock market datavalyu/valyu-earnings-US- Earnings reportsreuters.com- Financial newsbloomberg.com- Market analysis Medical Research Collection. Some sources to include:valyu/valyu-pubmed- Medical literaturevalyu/valyu-clinical-trials- Clinical trial datavalyu/valyu-drug-labels- FDA drug informationnejm.org- New England Journal of Medicinethelancet.com- The Lancet Tech Documentation Collection. Some sources to include:docs.aws.amazon.com- AWS documentationcloud.google.com/docs- Google Cloud docslearn.microsoft.com- Microsoft docskubernetes.io/docs- Kubernetes docsdeveloper.mozilla.org- MDN Web Docs
// Academic
includedSources: ["valyu/valyu-arxiv", "valyu/valyu-pubmed", "nature"]
// Financial
includedSources: ["valyu/valyu-sec-filings", "bloomberg.com", "reuters.com"]
// Tech news
includedSources: ["techcrunch.com", "theverge.com", "arstechnica.com"]
For complete prompting guide, see references/prompting.md.
6. Common Workflows
Research Workflow
// 1. Quick search to find sources
const searchResults = await valyu.search({
query: "CRISPR therapeutic applications",
searchType: "proprietary",
maxNumResults: 20
});
// 2. Extract key content from top results
const contents = await valyu.contents({
urls: searchResults.results.slice(0, 3).map(r => r.url),
summary: "Extract key findings"
});
// 3. Deep analysis for comprehensive report
const research = await valyu.deepResearch.create({
query: "CRISPR therapeutic applications comprehensive review",
model: "heavy"
});
Financial Analysis Workflow
// 1. Get SEC filings
const filings = await valyu.search({
query: "Apple 10-K 2024",
includedSources: ["valyu/valyu-sec-filings"]
});
// 2. Quick synthesis
const summary = await valyu.answer({
query: "Apple Q4 2024 financial highlights",
fastMode: true
});
// 3. Structured extraction
const metrics = await valyu.answer({
query: "Apple financial metrics 2024",
structuredOutput: {
type: "object",
properties: {
revenue: { type: "string" },
netIncome: { type: "string" },
growthRate: { type: "string" }
}
}
});
7. Available Data Sources
Valyu provides access to 25+ specialized datasets:
| Category | Examples |
|---|---|
| Academic | arXiv (2.5M+ papers), PubMed (37M+), bioRxiv, medRxiv |
| Financial | SEC filings, earnings transcripts, stock data, crypto |
| Healthcare | Clinical trials, DailyMed, PubChem, drug labels, ChEMBL, DrugBank, Open Target, WHO ICD |
| Economic | FRED, BLS, World Bank, US Treasury, Destatis |
| Predictions | Polymarket, Kalshi |
| Patents | US patent database |
| Transportation | UK Rail, Ship Tracking |
For complete datasource reference, see references/datasources.md.
8. API Reference
For complete API documentation including all parameters, response structures, and error codes, see references/api-guide.md.
9. Integration Guides
Platform-specific integration documentation: