AutoRAG-Query: Text2SQL Agent Skill

Query AutoRAG pipeline results with natural language. Converts to SQL, executes safely, returns tables/JSON/CSV.

Quick Example

User: "Which pipeline has the best BLEU score?"

Agent:

Read references/schema.sql (understand tables)

Generate SQL:

SELECT p.name, s.metric_result
FROM summary s
JOIN pipeline p ON s.pipeline_id = p.id
JOIN metric m ON s.metric_id = m.id
WHERE m.name = 'bleu'
ORDER BY s.metric_result DESC LIMIT 1;

Execute: uv run python .agents/skills/autorag-query/scripts/query_executor.py --query "..."
Present: "hybrid_search_v2 has best BLEU: 0.85"

Workflow

Parse intent: What data? (metrics/pipelines/queries) What operation? (rank/aggregate/filter)
Load schema: Read references/schema.sql - key tables:
- summary: Aggregated pipeline metrics (best for rankings)
- evaluation_result: Per-query scores (detailed analysis)
- executor_result: Generation outputs with token_usage JSONB
- chunk_retrieved_result: Retrieval scores/ranks
Generate SQL following rules:
- ✅ SELECT-only, ⛔ Never: INSERT/UPDATE/DELETE/DROP/CREATE
- ⛔ Exclude vector columns: embedding, embeddings, bm25_tokens (cause type errors)
- Add LIMIT 100 if not specified
- Use JOINs: query_id → query.id, pipeline_id → pipeline.id, metric_id → metric.id
- JSONB: token_usage->>'field' (text) or (token_usage->>'field')::int (cast)
Execute: uv run python .agents/skills/autorag-query/scripts/query_executor.py --query "..." [--format json|csv|table]
Present: Summarize findings, show table, highlight insights

Key Tables

Table	Purpose	Key Columns
`pipeline`	Pipeline definitions	`id`, `name`, `pipeline_type`
`metric`	Metric definitions	`id`, `name`, `metric_type` (retrieval/generation)
`query`	Search queries	`id`, `query`, `ground_truths`, `dataset_name`
`executor_result`	Generation outputs	`query_id`, `pipeline_id`, `generation_result`, `token_usage` (JSONB), `execution_time`
`evaluation_result`	Per-query scores	`query_id`, `pipeline_id`, `metric_id`, `metric_result`
`summary`	Aggregated metrics	`pipeline_id`, `metric_id`, `metric_result`
`chunk_retrieved_result`	Retrieval outputs	`query_id`, `pipeline_id`, `chunk_id`, `score`, `rank`

Relationships: query_id → query.id, pipeline_id → pipeline.id, metric_id → metric.id, chunk_id → chunk.id

Common Queries

See references/common-queries.md for 20+ templates.

Pipeline ranking:

SELECT p.name, s.metric_result
FROM summary s
JOIN pipeline p ON s.pipeline_id = p.id
JOIN metric m ON s.metric_id = m.id
WHERE m.name = 'bleu'
ORDER BY s.metric_result DESC;

Token usage:

SELECT p.name,
       SUM((exe.token_usage->>'total_tokens')::int) AS total_tokens,
       AVG((exe.token_usage->>'total_tokens')::int) AS avg_per_query
FROM executor_result exe
JOIN pipeline p ON exe.pipeline_id = p.id
WHERE exe.token_usage IS NOT NULL
GROUP BY p.name
ORDER BY total_tokens DESC;

Retrieval results:

SELECT c.content, crr.score, crr.rank
FROM chunk_retrieved_result crr
JOIN chunk c ON crr.chunk_id = c.id
WHERE crr.query_id = :query_id AND crr.pipeline_id = :pipeline_id
ORDER BY crr.rank LIMIT 10;

JSONB Extraction

executor_result.token_usage:

{"prompt_tokens": 150, "completion_tokens": 50, "total_tokens": 200}

Extract:

Text: token_usage->>'prompt_tokens' → "150"
Integer: (token_usage->>'total_tokens')::int → 200
JSON: token_usage->'embedding_tokens' → preserves type

pipeline.config: config->>'model' → "gpt-4"

Critical Rules

⛔ Always exclude: embedding, embeddings, bm25_tokens columns (cause type errors)
✅ SELECT-only: Script validates and rejects DDL/DML
📏 Add LIMIT: Prevent large result sets
🔗 Use JOINs: Connect via foreign keys
⚡ Timeout: 10s default (add WHERE filters if slow)

Script Usage

uv run python .agents/skills/autorag-query/scripts/query_executor.py \
  --query "SELECT ..." \
  --format table|json|csv \
  --timeout 10 \
  --limit 10000 \
  --database autorag_research  # optional

Connection: Auto-loads from configs/db.yaml or POSTGRES_* env vars using DBConnection class.

Output formats:

table: ASCII table (default)
json: JSON array
csv: CSV with headers

Row count: Printed to stderr: (N rows)

Error Handling

Error	Cause	Fix
"Forbidden keyword"	Non-SELECT query	Use SELECT-only
"Vector type error"	Selected vector columns	Exclude `embedding`, `embeddings`, `bm25_tokens` from SELECT
"Query timeout"	Query too slow	Add WHERE/LIMIT
"Connection failed"	Missing credentials	Check `configs/db.yaml` or set env vars

Advanced: Window Functions & Pivots

Ranking:

SELECT p.name, m.name, s.metric_result,
       RANK() OVER (PARTITION BY m.name ORDER BY s.metric_result DESC) AS rank
FROM summary s
JOIN pipeline p ON s.pipeline_id = p.id
JOIN metric m ON s.metric_id = m.id;

Pivot:

SELECT p.name,
       MAX(CASE WHEN m.name = 'bleu' THEN s.metric_result END) AS bleu,
       MAX(CASE WHEN m.name = 'rouge' THEN s.metric_result END) AS rouge
FROM summary s
JOIN pipeline p ON s.pipeline_id = p.id
JOIN metric m ON s.metric_id = m.id
GROUP BY p.name;

References

Schema: references/schema.sql - full DB schema with comments
Templates: references/common-queries.md - 20+ query examples
Executor: scripts/query_executor.py - safe SQL execution script

Installation: Works from .agents/skills/autorag-query/ (auto-detected by agents).