# setup-context
Take the user from `pip install nao-core` to a synced project with a starter `RULES.md`.
Be brief. One batch of questions, then act. Don't ping-pong.
Scope ceiling: ≤100 tables. Above that, sync gets slow and per-table context budget gets thin. 20 is a great target.
## Step 1 — Ask everything in one round
Send a single message asking for:
- Warehouse + auth — type (BigQuery / Snowflake / Postgres / Redshift / DuckDB / Databricks / Athena / ClickHouse / Fabric / MSSQL / MySQL / Trino), and the auth credentials they have on hand. Tell them you'll fetch the exact field names from the nao docs once they pick a type.
- Scope — which tables. Two valid shapes:
- Broad — gold/marts across multiple domains (exec / cross-functional agents).
- Deep — silver + gold for one domain (team-specific agents).
- Extra context — dbt / ETL / BI repos, Notion, internal docs. Ask for the SSH git URL of each repo (e.g. `git@github.com:org/repo.git`) — sync clones them. No local paths.
- LLM — provider and key. The model is selected in the nao UI, not in the config. The key comes later (Step 5).
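One possible shape for the batched message — the wording and bracketed placeholders are illustrative, not prescribed by nao:

```text
To set up your nao project I need four things in one go:
1. Warehouse + auth — which warehouse (BigQuery / Snowflake / Postgres / ...)?
   Once you pick, I'll pull the exact credential fields from the nao docs.
2. Scope — which tables? Broad (gold/marts across domains) or deep
   (silver + gold for one domain)? ~20 tables is a great target, 100 the cap.
3. Extra context — SSH git URLs for dbt / ETL / BI repos
   (e.g. git@github.com:org/repo.git), Notion, internal docs.
4. LLM — which provider? Don't paste the key; we'll wire it via an
   env var later.
```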
## Step 2 — Look up warehouse fields, write `nao_config.yaml`, run `nao init`
- Fetch the warehouse-specific config from docs.getnao.io/nao-agent/context-builder/databases. Each warehouse has its own required and optional fields (e.g. BigQuery needs `project_id` + `dataset_id` (optional); Snowflake needs `account_id` + `warehouse` + `schema_name` (optional); Postgres needs `host` + `port` + `database` + `schema_name` (optional)). Ask the user for any required field you don't already have.
- Write `nao_config.yaml` from the answers (skeleton in appendix below).
- Run `nao init` — it detects the existing yaml and offers to update; confirm. The folder scaffold gets created. Say "no" to optional providers (skills / MCPs / Notion / Slack); edit the yaml directly afterwards if needed. Use this command (it unsets env vars leaked from the parent agentic CLI — see Step 5):

  ```shell
  unset ANTHROPIC_BASE_URL ANTHROPIC_API_KEY && source ~/.zshrc 2>/dev/null; nao init 2>&1
  ```

- Print a summary of `nao_config.yaml` to the user before going further. Format example:

  ```text
  nao_config.yaml summary
  • project: <name>
  • warehouse: BigQuery (project=<id>, dataset=<id>, auth=service-account)
  • scope: include=["analytics.fct_*", "analytics.dim_*"], exclude=[]
  • templates: [columns, preview, description]
  • repos: company-dbt (git@github.com:org/company-dbt.git)
  • llm: anthropic (key via ${ANTHROPIC_API_KEY})
  ```

  The model is configured in the nao UI, not in `nao_config.yaml` — don't include a model ID in the summary or the yaml. Ask the user to confirm before continuing. This is the last cheap chance to catch a wrong project, a misspelled dataset, or a missing repo.
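Before writing the yaml, a quick pre-flight check can catch missing required fields. A minimal Python sketch — the field lists below only mirror the examples above (`REQUIRED_FIELDS` and `missing_fields` are this document's own hypothetical names; docs.getnao.io remains the source of truth):

```python
# Pre-flight check: verify the user supplied every required warehouse
# field before writing nao_config.yaml. Field lists mirror the examples
# above, not the authoritative docs.
REQUIRED_FIELDS = {
    "bigquery": ["project_id"],
    "snowflake": ["account_id", "warehouse"],
    "postgres": ["host", "port", "database"],
}

def missing_fields(db_type: str, answers: dict) -> list[str]:
    """Return the required fields the user has not provided yet."""
    return [f for f in REQUIRED_FIELDS.get(db_type, []) if not answers.get(f)]

print(missing_fields("postgres", {"host": "db.internal", "port": 5432}))  # ['database']
```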
### Database templates field
Per database in the yaml, set:

```yaml
templates: [columns, preview, description]
```

That's the set this skill ships. Other values are valid per warehouse (`how_to_use`, `profiling`, `ai_summary`, and `indexes` for ClickHouse) — see the docs link above — but stick to `[columns, preview, description]` unless the user specifically asks otherwise.
Don't use `accessors` — it's deprecated (renamed to `templates`).
`nao init` creates: `nao_config.yaml`, an empty `RULES.md`, `.naoignore`, and the folders `databases/`, `repos/`, `docs/`, `semantics/`, `queries/`, `tests/`, `agent/{tools,mcps,skills}/`.
## Step 3 — `nao sync`
After the user confirms the summary in Step 2:
```shell
cd <project>   # where nao_config.yaml lives — every nao command runs from here
nao sync
```
Common failures: auth (fix the yaml), tables not found (check schema casing), permission denied (grant read access), repo missing (fix the `repos:` block, confirm the SSH key). Don't move on until sync is clean.
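The failure-to-fix mapping above can be kept as a small lookup while triaging. An illustrative Python sketch — the `FIXES` table and `suggest()` helper are this document's own, not nao output:

```python
# Triage table for common `nao sync` failures, mirroring the list above.
FIXES = {
    "auth": "fix the credentials in nao_config.yaml",
    "tables not found": "check schema casing in the include patterns",
    "permission denied": "grant read access on the schemas in scope",
    "repo missing": "fix the repos: block and confirm the SSH key",
}

def suggest(symptom: str) -> str:
    """Map a known failure symptom to its fix, with a generic fallback."""
    return FIXES.get(symptom, "re-run sync and read the full error output")

print(suggest("tables not found"))  # check schema casing in the include patterns
```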
## Step 4 — Generate `RULES.md` (no confirmation)

Hand off directly to `write-context-rules`. Don't ask.
## Step 5 — Wire up the LLM key
The key lives in `nao_config.yaml`. Two safe options:

- Preferred: an env-var reference. Write `api_key: ${ANTHROPIC_API_KEY}`; tell the user to export the key in their shell.
- If they insist on a literal: tell them to edit the yaml themselves and add it to `.gitignore`. Never ask them to paste a key into chat.

Then run `nao debug` to confirm.
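Why the env-var reference is both safe and functional: assuming nao expands `${VAR}` references from the environment roughly the way a shell would (the yaml skeleton in the appendix suggests it does), the resolution looks like this. A Python sketch using the stdlib `string.Template`; the key value is fake:

```python
# How the ${ANTHROPIC_API_KEY} reference in nao_config.yaml resolves,
# assuming shell-style expansion from the environment.
import os
import string

os.environ["ANTHROPIC_API_KEY"] = "sk-ant-example"  # exported by the user's shell; fake value

raw = "api_key: ${ANTHROPIC_API_KEY}"               # the line written into the yaml
resolved = string.Template(raw).substitute(os.environ)
print(resolved)  # api_key: sk-ant-example
```

The key never touches the yaml on disk or the chat transcript; it only exists in the user's shell environment.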
### Known issue — `AI_APICallError: Not Found`
If `nao chat` / `nao debug` / `nao test` fails with that error and the URL is `https://api.anthropic.com/messages` (no `/v1/`), the parent agentic CLI (Claude Code, Cursor, Codex) is leaking `ANTHROPIC_BASE_URL` into the child env. Fix:
```shell
unset ANTHROPIC_BASE_URL ANTHROPIC_API_KEY && source ~/.zshrc 2>/dev/null; nao chat 2>&1
```
Regular human terminals aren't affected.
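A quick way to spot the leak before running any nao command: a base URL that is set but lacks the `/v1` path segment will produce the malformed `/messages` URL. An illustrative Python heuristic — `leaked_base_url` is a hypothetical helper, not part of nao:

```python
# Heuristic: a parent CLI that leaks ANTHROPIC_BASE_URL typically sets it
# without the /v1 path segment, so requests hit
# https://api.anthropic.com/messages and 404.
def leaked_base_url(env: dict) -> bool:
    url = env.get("ANTHROPIC_BASE_URL", "")
    return bool(url) and "/v1" not in url

print(leaked_base_url({"ANTHROPIC_BASE_URL": "https://api.anthropic.com"}))  # True
print(leaked_base_url({}))                                                   # False
```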
## Step 6 — Recommend next steps
- Smoke test: run `nao chat` and ask 3-5 real questions.
- Review `RULES.md` for wrong inferences.
- Pick a next skill: `write-context-rules` (refine), `create-context-tests` (benchmark), `audit-context` (anytime), `add-semantic-layer` (only after tests reveal metric-reliability gaps).
## Guardrails
- `cd` into the project directory before any `nao` command.
- Cap at ~100 tables.
- One batch of questions. Look up warehouse-specific fields from the docs; don't keep pinging the user.
- Run `nao init` non-interactively with the yaml pre-written.
- Use `templates: [columns, preview, description]`. Don't use `accessors`.
- Repos: SSH git URLs only. No local paths in the `repos:` block.
- Print the `nao_config.yaml` summary and get user confirmation before `nao sync`.
- Never have the user paste their LLM key into chat.
- Don't ask before invoking `write-context-rules` — just hand off.
## Appendix — `nao_config.yaml` skeleton (BigQuery example)
Use this shape and adapt the `databases:` block per warehouse — see docs.getnao.io/nao-agent/context-builder/databases for the exact required/optional fields for Snowflake, Postgres, Redshift, Databricks, Athena, ClickHouse, Fabric, MSSQL, MySQL, and Trino.
```yaml
project_name: <project>
databases:
  - type: bigquery
    name: <connection-name>
    project_id: <gcp-project-id>
    dataset_id: <dataset>
    credentials_path: /path/to/service-account.json  # or `sso: true`
    include: ['<dataset_pattern>.<table_pattern>']   # e.g. "analytics.fct_*"; '*' is a wildcard
    exclude: ['<pattern>']
    templates: [columns, preview, description]
llm:
  provider: anthropic  # openai | bedrock | azure | gemini | mistral | ollama
  api_key: ${ANTHROPIC_API_KEY}
repos:
  - name: <repo-name>
    url: git@github.com:<org>/<repo>.git  # SSH only
```
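For comparison, a hypothetical Snowflake `databases:` entry under the same shape — field names taken from the required/optional examples in Step 2; check the docs page for the authoritative list:

```yaml
databases:
  - type: snowflake
    name: <connection-name>
    account_id: <account-id>
    warehouse: <warehouse>
    schema_name: <schema>                          # optional
    include: ['<schema_pattern>.<table_pattern>']  # e.g. "marts.fct_*"
    exclude: []
    templates: [columns, preview, description]
```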
## More from getnao/nao

- write-context-rules — Create or extend a nao project's `RULES.md`; owns the `RULES.md` template. Use it to generate the initial `RULES.md` from synced metadata (called by setup-context) or to improve an existing one. Not for first-time scope setup (setup-context) or for diagnosing existing problems (audit-context).
- audit-context — Diagnose the health of a nao context at any stage of its lifecycle: what's been synced, how `RULES.md` compares to the target structure, whether every table is documented, whether the data model is MECE, whether tests exist and what their failures reveal, and whether context files are bloated. Outputs a structured audit report with ranked recommendations. Not for first-time setup (setup-context) or routine rule writing (write-context-rules).
- create-context-tests — Generate a test suite of natural-language → SQL pairs that becomes the quality benchmark for a nao agent, then run it via `nao test`. Use it to start measuring agent reliability, extend an existing suite, or add tests for new metrics. Tests are the only honest answer to "is the context working?". Not for writing rules (write-context-rules) or diagnosing failures (audit-context).
- add-semantic-layer — Wire a semantic layer into a nao agent so metric queries are routed through a single source of truth. Supports dbt MetricFlow (dbt Cloud with Semantic Layer), Snowflake (views or semantic views via MCP), an in-house nao YAML semantic layer, or other tools (via MCP discovery). Installs the right MCP server, updates `RULES.md` to route metric queries through the semantic layer, and (for the nao YAML option) generates starter metric files. Use it after a first round of tests has shown the agent struggling with metric reliability. Not for raw rule writing (write-context-rules) or first-time setup (setup-context).