scry
Scry Skill
Scry gives you read-only SQL access to the ExoPriors public corpus (229M+ entities)
via a single HTTP endpoint. You write Postgres SQL against a curated scry.* schema
and get JSON rows back. There is no ORM, no GraphQL, no pagination token -- just SQL.
Skill generation: 20260313
A) When to use / not use
Use this skill when:
- Searching, filtering, or aggregating content across the ExoPriors corpus
- Running lexical (BM25) or hybrid searches
- Exploring author networks, cross-platform identities, or publication patterns
- Navigating the OpenAlex academic graph (authors, citations, institutions, concepts)
- Creating shareable artifacts from query results
- Emitting structured agent judgements about entities or external references
Do NOT use this skill when:
- The user wants semantic/vector search composition or embedding algebra (use the scry-vectors skill)
- The user wants LLM-based reranking (use the scry-rerank skill)
- The user is querying their own local database
B) Golden Rules
-
Context handshake first. At session start, call
GET /v1/scry/context?skill_generation=20260313. Use the returnedofferingsblock for the current product summary budgets, canonical env var, default skill, and specialized skill catalog. Ifshould_update_skill=true, tell the user to runnpx skills update. -
Schema first. ALWAYS call
GET /v1/scry/schemabefore writing SQL. Never guess column names or types. The schema endpoint returns live column metadata and row-count estimates for every view. -
Check operational status when search looks wrong. If lexical search, materialized-view freshness, or corpus behavior seems off, call
GET /v1/scry/index-view-statusbefore assuming the query or schema is wrong. -
Clarify ambiguous intent before heavy queries. If the request is vague ("search Reddit for X", "find things about Y"), ask one short clarification question about the goal/output format before running expensive SQL.
-
Start with a cheap probe. Before any query likely to run >5s, run
/v1/scry/estimateand/or a tight exploratory query (LIMIT 20plus scoped source/window filters), then scale only after confirming relevance. -
Choose lexical vs semantic explicitly. Use lexical (
scry.search*) for exact terms and named entities. For conceptual intent ("themes", "things like", "similar to"), route to scry-vectors first, then optionally hybridize. -
LIMIT always. Every query MUST include a LIMIT clause. Max 10,000 rows. Queries without LIMIT are rejected by the SQL validator.
-
Prefer materialized views.
scry.entitieshas 229M+ rows. Scanning it without filters is slow. Usescry.mv_lesswrong_posts,scry.mv_arxiv_papers,scry.mv_hackernews_posts, etc. for targeted access. They are pre-filtered and often have embeddings pre-joined. -
Filter dangerous content. Always include
WHERE content_risk IS DISTINCT FROM 'dangerous'unless the user explicitly asks for unfiltered results. Dangerous content contains adversarial prompt-injection content. -
Raw SQL, not JSON.
POST /v1/scry/querytakesContent-Type: text/plainwith raw SQL in the body. Not JSON-wrapped SQL. -
File rough edges promptly. If Scry blocks the task, misses an obvious result set, or exposes a rough edge, submit a brief note to
POST /v1/feedback?feedback_type=suggestion|bug|other&channel=scry_skillusingContent-Type: text/plainby default (text/markdownalso works). Do not silently work around it. Logged-in users can review their submissions withGET /v1/feedback.
For full tier limits, timeout policies, and degradation strategies, see Shared Guardrails.
B.1 API Key Setup (Canonical)
Recommended default for less-technical users: in the directory where you launch the agent, store EXOPRIORS_API_KEY in .env so skills and copied prompts use the same place.
Canonical key naming for this skill:
- Env var:
EXOPRIORS_API_KEY - Personal key format:
exopriors_*with Scry access
printf '%s\n' 'EXOPRIORS_API_KEY=exopriors_...' >> .env
set -a && source .env && set +a
Verify:
echo "$EXOPRIORS_API_KEY"
If using packaged skills, keep them current:
npx skills add exopriors/skills
npx skills update
C) Quickstart
One end-to-end example: find recent high-scoring LessWrong posts about RLHF.
Step 1: Get dynamic context + update advisory
GET https://api.exopriors.com/v1/scry/context?skill_generation=20260313
Authorization: Bearer $EXOPRIORS_API_KEY
Step 2: Get schema
GET https://api.exopriors.com/v1/scry/schema
Authorization: Bearer $EXOPRIORS_API_KEY
Step 3: Run query
POST https://api.exopriors.com/v1/scry/query
Authorization: Bearer $EXOPRIORS_API_KEY
Content-Type: text/plain
WITH hits AS (
SELECT id FROM scry.search('RLHF reinforcement learning human feedback',
kinds=>ARRAY['post'], limit_n=>100)
)
SELECT e.uri, e.title, e.original_author, e.original_timestamp, e.score
FROM hits h
JOIN scry.entities e ON e.id = h.id
WHERE e.source = 'lesswrong'
AND e.content_risk IS DISTINCT FROM 'dangerous'
ORDER BY e.score DESC NULLS LAST
LIMIT 20
Response shape:
{
"columns": ["uri", "title", "original_author", "original_timestamp", "score"],
"rows": [["https://...", "My RLHF Post", "author", "2025-01-15T...", 142], ...],
"row_count": 20,
"duration_ms": 312,
"truncated": false
}
D) Decision Tree
User wants to search the ExoPriors corpus?
|
+-- Ambiguous / conceptual ask? --> Clarify intent first, then use
| scry-vectors for semantic search (optionally hybridize with lexical)
|
+-- By keywords/phrases? --> scry.search() (BM25 lexical over canonical content_text)
| +-- Specific forum? --> pass mode='mv_lesswrong_posts' or kinds filter
| +-- Reddit? --> START with scry.search_reddit_posts() /
| scry.search_reddit_comments()
| +-- Large result? --> scry.search_ids() (id+uri+kind, up to 2000)
|
+-- By structured filters (source, date, author)? --> Direct SQL on MVs
|
+-- By semantic similarity? --> (scry-vectors skill, not this one)
|
+-- Hybrid (keywords + semantic rerank)? --> scry.hybrid_search() or
| lexical CTE + JOIN scry.embeddings
|
+-- Author/people lookup? --> scry.actors, scry.people, scry.person_accounts
|
+-- Academic graph (OpenAlex)? --> scry.openalex_find_authors(),
| scry.openalex_find_works(), etc. (see schema-guide.md)
|
+-- Need to share results? --> POST /v1/scry/shares
|
+-- Need to emit a structured observation? --> POST /v1/scry/judgements
|
+-- Scry blocked / missing obvious results? --> POST /v1/feedback
E) Recipes
E0. Context handshake + skill update advisory
curl -s "https://api.exopriors.com/v1/scry/context?skill_generation=20260313" \
-H "Authorization: Bearer $EXOPRIORS_API_KEY"
If response includes "should_update_skill": true, ask the user to run:
npx skills update.
E0b. Submit feedback when Scry blocks the task
curl -s "https://api.exopriors.com/v1/feedback?feedback_type=bug&channel=scry_skill" \
-H "Authorization: Bearer $EXOPRIORS_API_KEY" \
-H "Content-Type: text/plain" \
--data $'## What happened\n- Query: ...\n- Problem: ...\n\n## Why it matters\n- ...\n\n## Suggested fix\n- ...'
Success response includes a receipt id. Logged-in users can review their own
submissions with:
curl -s "https://api.exopriors.com/v1/feedback?limit=10" \
-H "Authorization: Bearer $EXOPRIORS_API_KEY"
E1. Lexical search (BM25)
WITH c AS (
SELECT id FROM scry.search('your query here',
kinds=>ARRAY['post'], limit_n=>100)
)
SELECT e.uri, e.title, e.original_author, e.original_timestamp
FROM c JOIN scry.entities e ON e.id = c.id
WHERE e.content_risk IS DISTINCT FROM 'dangerous'
LIMIT 50
Default kinds if omitted: ['post','paper','document','webpage','twitter_thread','grant'].
scry.search() broadens once to kinds=>ARRAY['comment'] if that default returns 0 rows.
Pass explicit kinds for strict scope (for example comment-only or tweet-only).
Pass mode=>'mv_lesswrong_posts' to scope to LessWrong posts.
E2. Reddit-specific search
SELECT id, uri, subreddit, original_author, original_timestamp
FROM scry.search_reddit_posts(
'transformer architecture',
subreddits=>ARRAY['MachineLearning','LocalLLaMA'],
limit_n=>50,
window_key=>'recent'
)
ORDER BY score DESC
Window keys: recent, 2022_2023, 2020_2021, 2018_2019, 2014_2017,
2010_2013, 2005_2009. Also: scry.search_reddit_comments(...).
For semantic Reddit retrieval over the embedding-covered subset:
scry.search_reddit_posts_semantic(query_embedding=>..., subreddits=>..., limit_n=>...).
E3. Source-filtered materialized view query
SELECT entity_id, uri, title, original_author, score, original_timestamp
FROM scry.mv_arxiv_papers
WHERE original_timestamp >= '2025-01-01'
ORDER BY original_timestamp DESC
LIMIT 50
E4. Author activity across sources
SELECT e.source::text, COUNT(*) AS docs, MAX(e.original_timestamp) AS latest
FROM scry.entities e
WHERE e.original_author ILIKE '%yudkowsky%'
AND e.content_risk IS DISTINCT FROM 'dangerous'
GROUP BY e.source::text
ORDER BY docs DESC
LIMIT 20
E5. Entity kind distribution for a source
SELECT kind::text, COUNT(*)
FROM scry.entities
WHERE source = 'hackernews'
GROUP BY kind::text
ORDER BY 2 DESC
LIMIT 20
E6. Hybrid search (lexical + semantic rerank in SQL)
WITH c AS (
SELECT id FROM scry.search('deceptive alignment',
kinds=>ARRAY['post'], limit_n=>200)
)
SELECT e.uri, e.title, e.original_author,
emb.embedding_voyage4 <=> @p_deadbeef_topic AS distance
FROM c
JOIN scry.entities e ON e.id = c.id
JOIN scry.embeddings emb ON emb.entity_id = c.id AND emb.chunk_index = 0
WHERE e.content_risk IS DISTINCT FROM 'dangerous'
ORDER BY distance
LIMIT 50
Requires a stored embedding handle (@p_deadbeef_topic). See scry-vectors
skill for creating handles.
E7. Cost estimation before execution
curl -s -X POST https://api.exopriors.com/v1/scry/estimate \
-H "Authorization: Bearer $EXOPRIORS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"sql": "SELECT id, title FROM scry.mv_arxiv_papers LIMIT 1000"}'
Returns EXPLAIN (FORMAT JSON) output. Use this for expensive queries before committing.
E8. Create a shareable artifact
# 1. Run query and capture results
# 2. POST share
curl -s -X POST https://api.exopriors.com/v1/scry/shares \
-H "Authorization: Bearer $EXOPRIORS_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"kind": "query",
"title": "Top RLHF posts on LessWrong",
"summary": "20 highest-scored LW posts mentioning RLHF.",
"payload": {
"sql": "...",
"result": {"columns": [...], "rows": [...]}
}
}'
Kinds: query, rerank, insight, chat, markdown.
Progressive update: create stub immediately, then PATCH /v1/scry/shares/{slug}.
Rendered at: https://scry.io/scry/share/{slug}.
E9. Emit a structured agent judgement
curl -s -X POST https://api.exopriors.com/v1/scry/judgements \
-H "Authorization: Bearer $EXOPRIORS_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"emitter": "my-agent",
"judgement_kind": "topic_classification",
"target_external_ref": "arxiv:2401.12345",
"summary": "Paper primarily about mechanistic interpretability.",
"payload": {"primary_topic": "mech_interp", "confidence_detail": "title+abstract match"},
"confidence": 0.88,
"tags": ["arxiv", "mech_interp"],
"privacy_level": "public"
}'
Exactly one target required: target_entity_id, target_actor_id,
target_judgement_id, or target_external_ref.
Judgement-on-judgement: use target_judgement_id to chain observations.
E10. People / author lookup
-- Per-source author grouping
SELECT a.handle, a.display_name, a.source::text, COUNT(*) AS docs
FROM scry.entities e
JOIN scry.actors a ON a.id = e.author_actor_id
WHERE e.source = 'twitter'
GROUP BY a.handle, a.display_name, a.source::text
ORDER BY docs DESC
LIMIT 50
E11. Thread navigation (replies)
-- Find all replies to a root post
SELECT id, uri, title, original_author, original_timestamp
FROM scry.entities
WHERE anchor_entity_id = 'ROOT_ENTITY_UUID'
ORDER BY original_timestamp
LIMIT 100
anchor_entity_id is the root subject; parent_entity_id is the direct parent.
E12. Count estimation (safe pattern)
Avoid COUNT(*) on large tables. Instead, use schema endpoint row estimates or:
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'mv_lesswrong_posts'
LIMIT 1
Note: pg_class access is blocked on the public Scry SQL surface. Use /v1/scry/schema instead.
F) Error Handling
See references/error-reference.md for the full catalogue. Key patterns:
| HTTP | Code | Meaning | Action |
|---|---|---|---|
| 400 | invalid_request |
SQL parse error, missing LIMIT, bad params | Fix query |
| 401 | unauthorized |
Missing or invalid API key | Check key |
| 402 | insufficient_credits |
Token budget exhausted | Notify user |
| 429 | rate_limited |
Too many requests | Respect Retry-After header |
| 503 | service_unavailable |
Scry pool down or overloaded | Wait and retry |
Auth + timeout diagnostics for CLI users:
- If curl shows HTTP
000, that is client-side timeout/network abort, not a server HTTP status. Check--max-timeand retry with/v1/scry/estimatefirst. - If you see
401with"Invalid authorization format", check for whitespace/newlines in the key:KEY_CLEAN="$(printf '%s' \"$EXOPRIORS_API_KEY\" | tr -d '\\r\\n')"then useAuthorization: Bearer $KEY_CLEAN.
Quota fallback strategy:
- If 429: wait
Retry-Afterseconds, retry once. - If 402: tell the user their token budget is exhausted.
- If 503: retry after 30s with exponential backoff (max 3 attempts).
- If query times out: simplify (use MV instead of full table, reduce LIMIT, add tighter WHERE filters).
G) Output Contract
When this skill completes a query task, return a consistent structure:
## Scry Result
**Query**: <natural language description>
**SQL**: ```sql <the SQL that ran> ```
**Rows returned**: <N> (truncated: <yes/no>)
**Duration**: <N>ms
<formatted results table or summary>
**Share**: <share URL if created>
**Caveats**: <any data quality notes, e.g., "score is NULL for arXiv">
Handoff Contract
Produces: JSON with columns, rows, row_count, duration_ms, truncated
Feeds into:
rerank: ensure SQL returnsidandcontent_textcolumns for candidate setsscry-vectors: save entity IDs for embedding lookup and semantic reranking Receives from: none (entry point for SQL-based corpus access)
Related Skills
- scry-vectors -- embed concepts as @handles, search by cosine distance, debias with vector algebra
- scry-rerank -- LLM-powered multi-attribute reranking of candidate sets via pairwise comparison
For detailed schema documentation, see references/schema-guide.md.
For the full pattern library, see references/query-patterns.md.
For error codes and quota details, see references/error-reference.md.