bigquery
Installation
SKILL.md
BigQuery
Query Mozilla telemetry data directly using the bq CLI.
Prerequisites
gcloudandbqCLI installed (brew install google-cloud-sdk)- Authenticated:
gcloud auth loginwith a Mozilla account - Billing project set: queries run against a project you have
bigquery.jobs.createon - (Optional but highly recommended) mozdata-claude-plugin — provides Glean Dictionary MCP for metric/ping discovery, making it much easier to find the right tables and columns
Authentication
# Check current account
gcloud config get-value account
# Re-authenticate if needed
gcloud auth login
# List available projects
gcloud projects list --format="table(projectId,name)"
# Set billing project (mozdata is the standard choice)
gcloud config set project mozdata
If queries fail with "Access Denied", the billing project likely lacks permissions. Try --project_id=mozdata.
Running Queries
# Basic query
bq query --project_id=mozdata --use_legacy_sql=false --format=pretty "SELECT ..."
# Dry run (check cost before executing)
bq query --project_id=mozdata --use_legacy_sql=false --dry_run "SELECT ..."
Always use --project_id=mozdata and --use_legacy_sql=false.
Table Selection
Choose the right table — this is the most important optimization:
| Query Type | Table | Why |
|---|---|---|
| Windows version distribution | telemetry.windows_10_aggregate |
Pre-aggregated, instant |
| DAU/MAU by standard dimensions | firefox_desktop_derived.active_users_aggregates_v3 |
Pre-computed, 100x faster |
| DAU with custom dimensions | firefox_desktop.baseline_clients_daily |
One row per client per day |
| MAU/WAU/retention | firefox_desktop.baseline_clients_last_seen |
Bit patterns, scan 1 day not 28 |
| Event analysis | firefox_desktop.events_stream |
Pre-unnested, clustered |
| Mobile search | search.mobile_search_clients_daily_v2 |
Pre-aggregated |
| Specific Glean metric | firefox_desktop.metrics |
Raw metrics ping |
All tables are in the moz-fx-data-shared-prod project. Fully qualify as `moz-fx-data-shared-prod.{dataset}.{table}`.
Critical Rules
- Always use aggregate tables first — raw tables are 10-100x more expensive
- Always include partition filter —
submission_dateorDATE(submission_timestamp) - Use
sample_id = 0for development (1% sample) — remove for production - Say "clients" not "users" — BigQuery tracks
client_id, not actual humans - Never join across products by client_id — each product has its own namespace
- Use
events_streamfor events — never rawevents_v1(requires UNNEST) - Use
baseline_clients_last_seenfor MAU — bit patterns, scan 1 day not 28
References
references/tables.md— Detailed table schemas and common query patternsreferences/os-versions.md— Windows, macOS, and Linux version distribution queries with build number, Darwin, and kernel version mappings
Related Skills
- redash — Web UI frontend to BigQuery with visualizations and sharing
- mozdata:query-writing — Guided query writing with Glean Dictionary MCP
- mozdata:probe-discovery — Find Glean metrics and telemetry probes
Related skills
More from jwmossmoz/agent-skills
redash
>
23os-integrations
>
17worker-image-investigation
>
16treeherder
>
15bugzilla
Interact with Mozilla Bugzilla (bugzilla.mozilla.org) via REST API. Use when the user asks to search bugs, view bug details, create bugs, update bugs, add comments, or attach files. Triggers on "bugzilla", "bmo", "file a bug", "bug report", "mozilla bug".
14taskcluster
>
14