bigquery
BigQuery CLI Skill
You are a BigQuery specialist using the bigquery CLI tool. This skill provides comprehensive guidance for working with Google BigQuery through a unified TypeScript/Bun CLI with query execution, cost awareness, and MCP server integration.
Core Capabilities
The bigquery CLI provides:
- Authentication: Check status and gcloud integration
- Query Execution: Run SQL queries with automatic cost estimation and confirmation prompts
- Dry Run: Estimate query costs ($0 to run)
- Dataset Operations: List, create, update, delete, describe datasets
- Table Operations: List, describe, insert, load, extract, create, update, delete, copy tables
- Job Management: List, get, and cancel BigQuery jobs
- MCP Server: Built-in stdio and HTTP modes for AI integration (read-only by default, `--enable-writes` flag)
- Streaming Support: Native `--stream` flag for large result sets (outputs JSONL)
- Output Formats: JSON (default), JSONL (streaming), and text (human-readable)
Authentication
Check Authentication Status
# Check if authenticated and verify required scopes
bigquery auth check
# Output shows:
# - Authentication status
# - Active account
# - Token expiration
# - BigQuery scopes availability
Authentication Methods
Uses Google Cloud SDK (gcloud) authentication:
- Application Default Credentials (ADC)
- OAuth access tokens
- Service account keys (via `GOOGLE_APPLICATION_CREDENTIALS`)
No separate BigQuery authentication is required; the CLI uses your existing gcloud credentials.
Best Practice: Always run `bigquery auth check` to verify authentication before other operations.
Output Format Defaults
Different commands have different default output formats:
| Command | Default Format | Notes |
|---|---|---|
| `query` | JSON | Machine-readable for pipelines, includes metadata |
| `datasets list` | text | Human-readable |
| `tables list` | text | Human-readable |
| `tables describe` | text | Human-readable |
| `jobs list` | text | Human-readable |
All commands support `--format json` or `--format text` to override the default.
Available Formats:
- JSON: Structured output with metadata (rows, bytesProcessed, cost, cacheHit)
- Text: Human-readable formatted output
- JSONL: Newline-delimited JSON (one object per line), ideal for streaming and pipelines
Query Operations
Running Queries
# Basic query execution (automatic dry-run + cost confirmation)
bigquery query "SELECT * FROM dataset.table LIMIT 10"
# Skip cost confirmation for automation
bigquery query --yes "SELECT COUNT(*) FROM dataset.table"
# JSON output (default, includes full metadata)
bigquery query "SELECT * FROM dataset.table LIMIT 5"
# JSONL output (one JSON object per line, ideal for pipelines)
bigquery query "SELECT * FROM dataset.table" --jsonl
# Stream large results (implies JSONL format)
bigquery query "SELECT * FROM large_table" --stream > results.jsonl
# Text format (human-readable)
bigquery query --format text "SELECT * FROM dataset.table LIMIT 5"
# Query from file
bigquery query --file query.sql
# Query from stdin
cat query.sql | bigquery query --stdin --yes
# Custom project
bigquery query --project my-other-project "SELECT 1"
Cost Awareness: The query command automatically:
- Runs a dry-run to estimate cost before execution (costs $0)
- Displays bytes to be processed and estimated cost
- Prompts for confirmation if cost exceeds threshold (default: 1 GB)
- Skips confirmation for queries below the threshold or when `--yes` is used
- Includes cost metadata in the response (bytesProcessed, estimatedCostUSD, cacheHit)
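The confirmation decision above can be sketched in a few lines. This is an illustrative Python approximation of the CLI's documented behavior, not its actual implementation; the function name `needs_confirmation` is invented for this sketch.

```python
import os

DEFAULT_THRESHOLD_GB = 1.0   # matches the documented default threshold
GB = 1024 ** 3               # the dry-run examples report GB in binary units

def needs_confirmation(bytes_processed: int, yes_flag: bool = False) -> bool:
    """Return True if the query should prompt before executing."""
    if yes_flag:
        return False  # --yes always skips the prompt
    threshold_gb = float(os.environ.get("BIGQUERY_COST_THRESHOLD_GB",
                                        DEFAULT_THRESHOLD_GB))
    return bytes_processed / GB > threshold_gb

# A 5 GB scan exceeds the default 1 GB threshold, so it would prompt;
# passing yes_flag (i.e. --yes) suppresses the prompt regardless of size.
print(needs_confirmation(5_000_000_000))
print(needs_confirmation(5_000_000_000, yes_flag=True))
```

This is why `--yes` (or a raised `BIGQUERY_COST_THRESHOLD_GB`) is required for unattended scripts: without it, any scan above the threshold blocks on an interactive prompt.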
Query Options
# Skip cost confirmation (REQUIRED for automation/scripts)
bigquery query --yes "SELECT * FROM dataset.table"
# Set custom cost threshold (default: 1 GB)
bigquery query --cost-threshold 5 "SELECT * FROM large_table"
# Environment variable for persistent threshold
export BIGQUERY_COST_THRESHOLD_GB=5.0
# Limit rows returned (does NOT reduce cost/bytes scanned)
bigquery query "SELECT * FROM table" --max-results 100
# Stream large results (outputs JSONL)
bigquery query "SELECT * FROM large_table" --stream > results.jsonl
Query Flags:
| Flag | Description |
|---|---|
| `--yes` / `-y` | Skip cost confirmation prompt (for automation) |
| `--format <format>` | Output format: json (default), text, jsonl |
| `--jsonl` | Shorthand for `--format jsonl` |
| `--stream` | Stream results as they arrive (implies JSONL) |
| `--cost-threshold <gb>` | Cost confirmation threshold in GB (default: 1) |
| `--max-results <n>` | Max rows to return (does NOT reduce cost) |
| `--file <path>` | Read query from file |
| `--stdin` | Read query from stdin |
| `-p, --project <id>` | GCP project ID |
Important Notes:
- `--max-results` only limits returned rows, NOT bytes scanned (no cost reduction)
- Use `LIMIT` in SQL to reduce scanned data
- Use `--yes` for automation (not `--force`, which doesn't exist)
Query Output Formats
# JSON output (default, machine-readable)
bigquery query "SELECT * FROM dataset.table"
# Returns:
# {
# "rows": [...],
# "totalRows": 123,
# "bytesProcessed": 1048576,
# "bytesProcessedGB": 0.001,
# "estimatedCostUSD": 0.00000625,
# "cacheHit": false
# }
# JSONL output (one JSON object per line)
bigquery query "SELECT * FROM dataset.table" --jsonl
# Returns:
# {"col1":"value1","col2":123}
# {"col1":"value2","col2":456}
# Text output (human-readable table)
bigquery query --format text "SELECT * FROM dataset.table"
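Because the JSON response carries cost metadata alongside the rows, scripts can log scan cost per query. A minimal sketch parsing the documented response shape; the sample payload below is illustrative:

```python
import json

# Sample payload in the documented response shape (values are illustrative)
payload = json.loads("""
{"rows": [{"id": 1}], "totalRows": 1, "bytesProcessed": 1048576,
 "bytesProcessedGB": 0.001, "estimatedCostUSD": 0.00000625, "cacheHit": false}
""")

rows = payload["rows"]
# Cache hits scan no new bytes, so only log cost for uncached queries
if not payload["cacheHit"]:
    print(f"scanned {payload['bytesProcessedGB']} GB "
          f"(~${payload['estimatedCostUSD']:.8f}) for {payload['totalRows']} rows")
```

In a pipeline this payload would come from `bigquery query ... | python log_cost.py`, with the script reading `sys.stdin` instead of a literal string.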
Dry Run (Cost Estimation)
# Estimate cost without executing
bigquery dry-run "SELECT * FROM large_dataset.table WHERE date >= '2025-01-01'"
# Returns (JSON):
# {
# "bytesProcessed": "1073741824",
# "bytesProcessedGB": 1.0,
# "bytesProcessedMB": 1024.0,
# "estimatedCostUSD": 0.00625
# }
# Text format
bigquery dry-run "SELECT * FROM large_table" --format text
# Returns:
# Dry-run Results:
# ──────────────────────────────────────────────────
# Bytes to process: 1.00 GB
# Estimated cost: $0.0063
Use dry-run to:
- Estimate costs before running expensive queries
- Validate query syntax
- Check partition pruning effectiveness
- Test queries in CI/CD pipelines
Cost Formula: `(bytesProcessed / 1 TB) * $6.25`
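As a worked example of the formula, a small Python sketch (the helper name is illustrative; it uses decimal TB, matching the formula as written, while the dry-run examples above report GB in binary units, so roundings can differ slightly):

```python
# On-demand pricing assumed by the CLI's estimates: $6.25 per TB scanned.
PRICE_PER_TB_USD = 6.25
TB = 10 ** 12  # decimal terabyte, per the formula as written

def estimate_cost_usd(bytes_processed: int) -> float:
    """Apply the documented formula: (bytesProcessed / 1 TB) * $6.25."""
    return bytes_processed / TB * PRICE_PER_TB_USD

print(estimate_cost_usd(10 ** 12))  # a full 1 TB scan costs exactly $6.25
```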
Dataset Operations
Listing Datasets
# List datasets in current project (text format, default)
bigquery datasets list
# List datasets in another project
bigquery datasets list --project other-project
# JSON output
bigquery datasets list --format json
# Example output shows:
# - Dataset ID
# - Location
# - Creation time
# - Labels (if any)
Describing Datasets
# Show dataset metadata
bigquery datasets describe project.dataset
# JSON output
bigquery datasets describe project.dataset --format json
Creating Datasets
# Create dataset in default location
bigquery datasets create my-project.new_dataset
# Create with description and location
bigquery datasets create my-project.new_dataset \
--description "Analytics data warehouse" \
--location US
# Create with default table expiration (30 days)
bigquery datasets create my-project.temp_dataset \
--default-ttl 30 \
--location US
# Create with labels
bigquery datasets create my-project.new_dataset \
--labels "env=prod,team=analytics"
Updating Datasets
# Update description
bigquery datasets update my-project.existing_dataset \
--description "Updated description"
# Update default table expiration
bigquery datasets update my-project.existing_dataset \
--default-ttl 30
# Add/update labels
bigquery datasets update my-project.existing_dataset \
--labels env=staging \
--labels team=data
Deleting Datasets
# Delete empty dataset (prompts for confirmation)
bigquery datasets delete my-project.old_dataset
# Delete non-empty dataset (includes all tables)
bigquery datasets delete my-project.old_dataset --force
# Skip confirmation (for automation)
bigquery datasets delete my-project.old_dataset --force --yes
Table Operations
Listing Tables
# List tables in a dataset (text format, first 10)
bigquery tables list my-project.my-dataset
# JSON output
bigquery tables list my-project.my-dataset --format json
# With pagination
bigquery tables list my-project.my-dataset --max-results 20 --page-token <token>
Describing Table Schema
# Show table schema and metadata (text format)
bigquery tables describe my-project.my-dataset.my-table
# JSON output
bigquery tables describe my-project.my-dataset.my-table --format json
# Output includes:
# - Column names and types
# - Nullability (NULLABLE, REQUIRED, REPEATED)
# - Mode information
# - Table metadata (row count, size, location)
Creating Tables
# Create table with JSON schema
bigquery tables create my-project.dataset.users \
--schema '[{"name":"id","type":"STRING"},{"name":"email","type":"STRING"}]' \
--description "User data"
Updating Tables
# Update description
bigquery tables update my-project.dataset.table --description "Updated description"
# Set expiration (30 days in seconds)
bigquery tables update my-project.dataset.table --expiration 2592000
# Add labels
bigquery tables update my-project.dataset.table \
--labels owner=team-data \
--labels environment=production
# Require partition filter for queries
bigquery tables update my-project.dataset.table --require-partition-filter true
Copying Tables
# Copy table within same project
bigquery tables copy my-project.source.table my-project.dest.table_copy
# Copy to different dataset
bigquery tables copy my-project.source.table my-project.archive.table_backup
# Overwrite destination if exists
bigquery tables copy my-project.source.table my-project.dest.existing_table \
--write-disposition WRITE_TRUNCATE
Deleting Tables
# Delete table
bigquery tables delete my-project.dataset.old_table
Inserting Rows (Small Datasets)
Best for <1000 rows. Uses streaming insert API for immediate availability.
JSONL (Newline-Delimited JSON) Format
From JSONL File:
# Create sample JSONL file
cat > users.jsonl <<EOF
{"id": "1", "name": "Alice Johnson", "email": "alice@example.com", "age": 30}
{"id": "2", "name": "Bob Smith", "email": "bob@example.com", "age": 25}
{"id": "3", "name": "Charlie Brown", "email": "charlie@example.com", "age": 35}
EOF
# Insert from JSONL file
bigquery tables insert my-project.dataset.users users.jsonl --format jsonl
From JSONL Stream (stdin):
# Stream from command output
echo '{"id": "1", "name": "Alice", "email": "alice@example.com"}' | \
bigquery tables insert my-project.dataset.users - --format jsonl
# Stream from heredoc
cat << EOF | bigquery tables insert my-project.dataset.users - --format jsonl
{"id": "1", "name": "Alice", "email": "alice@example.com", "age": 30}
{"id": "2", "name": "Bob", "email": "bob@example.com", "age": 25}
{"id": "3", "name": "Charlie", "email": "charlie@example.com", "age": 35}
EOF
# Stream from application output
my-etl-tool --output jsonl | bigquery tables insert my-project.dataset.events -
# Stream from jq transformation
cat raw_data.json | jq -c '.records[]' | \
bigquery tables insert my-project.dataset.processed -
JSONL Format Requirements:
- Each line is a separate JSON object
- Empty lines are automatically skipped
- No commas between objects
- Ideal for streaming and large datasets
- Format:
{"field1":"value1","field2":"value2"}\n
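The requirements above are easy to enforce programmatically before piping data into `tables insert`. A sketch with illustrative helper names (`to_jsonl`, `validate_jsonl` are not part of the CLI):

```python
import json

def to_jsonl(records) -> str:
    """Serialize records as newline-delimited JSON:
    one compact object per line, no commas between objects."""
    return "\n".join(json.dumps(r, separators=(",", ":")) for r in records) + "\n"

def validate_jsonl(text: str) -> list:
    """Parse JSONL, skipping empty lines the way the insert command does."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]

rows = [{"id": "1", "name": "Alice"}, {"id": "2", "name": "Bob"}]
payload = to_jsonl(rows)
assert validate_jsonl(payload) == rows  # round-trips cleanly
```

A producer script can then write `to_jsonl(rows)` to stdout and pipe straight into `bigquery tables insert TABLE - --format jsonl`.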
Additional Insert Options
# Skip invalid rows instead of failing
bigquery tables insert my-project.dataset.users users.jsonl --skip-invalid-rows
# Ignore unknown fields in data
bigquery tables insert my-project.dataset.users users.jsonl --ignore-unknown-values
# Combine options for production pipelines
cat production_data.jsonl | \
bigquery tables insert my-project.dataset.production - \
--format jsonl \
--skip-invalid-rows \
--ignore-unknown-values
Insert Options:
- `--format <FORMAT>`: Data format (json or jsonl)
- `--skip-invalid-rows`: Skip invalid rows instead of failing
- `--ignore-unknown-values`: Ignore unknown fields in data
Loading Data (Large Datasets)
Best for >10MB files or >1000 rows. Uses BigQuery load jobs.
# Load from Cloud Storage URI (RECOMMENDED)
bigquery tables load my-project.dataset.users \
gs://my-bucket/data.csv --format csv
# Load with schema auto-detection
bigquery tables load my-project.dataset.new_table \
gs://my-bucket/data.csv --format csv --autodetect
# Load with replace write disposition (truncates table first)
bigquery tables load my-project.dataset.users \
gs://my-bucket/data.csv --format csv --write-disposition WRITE_TRUNCATE
# Load JSON file
bigquery tables load my-project.dataset.events \
gs://my-bucket/events.json --format json
# Supported formats: csv, json, jsonl, avro, parquet
bigquery tables load my-project.dataset.table \
gs://my-bucket/data.parquet --format parquet
# CSV with skip leading rows
bigquery tables load my-project.dataset.table \
gs://my-bucket/data.csv --format csv --skip-leading-rows 1
# Append to existing table (default)
bigquery tables load my-project.dataset.table \
gs://my-bucket/data.json --format json --write-disposition WRITE_APPEND
Load Options:
- `--format <FORMAT>`: Source format (csv, json, jsonl, parquet, avro)
- `--autodetect`: Auto-detect schema from source
- `--skip-leading-rows <N>`: Skip N leading rows (CSV)
- `--write-disposition <MODE>`: WRITE_TRUNCATE (replace), WRITE_APPEND (default), WRITE_EMPTY
- `--create-disposition <MODE>`: CREATE_IF_NEEDED (default), CREATE_NEVER
When to Use:
- Large datasets (>1000 rows or >10MB)
- Data already in Cloud Storage
- Bulk data migrations
When NOT to Use:
- Small datasets (<1000 rows) → use `tables insert` instead
Extracting Data
Export table data to Cloud Storage in various formats:
# Extract table to Cloud Storage as CSV
bigquery tables extract my-project.dataset.users \
gs://my-bucket/exports/users.csv --format csv
# Extract as JSON (newline-delimited)
bigquery tables extract my-project.dataset.events \
gs://my-bucket/exports/events-*.json --format json
# Extract with compression
bigquery tables extract my-project.dataset.large_table \
gs://my-bucket/exports/data-*.csv.gz --format csv --compression GZIP
# Extract as Parquet
bigquery tables extract my-project.dataset.analytics \
gs://my-bucket/exports/analytics.parquet --format parquet
# CSV with header
bigquery tables extract my-project.dataset.data \
gs://my-bucket/data.csv --format csv --print-header
Supported Formats: CSV, JSON (newline-delimited), Parquet, Avro
Compression: NONE (default), GZIP
Job Management
BigQuery jobs are asynchronous operations for queries, loads, exports, and copies.
Listing Jobs
# List recent jobs
bigquery jobs list
# List with pagination
bigquery jobs list --max-results 20 --page-token <token>
# Filter by state
bigquery jobs list --state-filter running
bigquery jobs list --state-filter done
# Show jobs from all users
bigquery jobs list --all-users
# JSON output
bigquery jobs list --format json
Showing Job Details
# Show job details
bigquery jobs get job_abc123xyz
# JSON output
bigquery jobs get job_abc123xyz --format json
Canceling Jobs
# Cancel a running job
bigquery jobs cancel job_abc123xyz
MCP Server Integration
The BigQuery MCP server provides AI integration via Model Context Protocol.
Starting MCP Server
STDIO Mode (for local clients):
# Start MCP server in stdio mode (read-only)
bigquery mcp stdio
# Start with write operations enabled
bigquery mcp stdio --enable-writes
# Server will:
# - Accept MCP protocol messages on stdin
# - Send responses on stdout
# - Expose BigQuery tools to MCP clients
HTTP Mode (for network clients):
# Start HTTP MCP server on default port 8080
bigquery mcp http
# Specify custom port
bigquery mcp http --port 3000
# With OAuth and email domain restriction
bigquery mcp http \
--google-client-id <id> \
--google-client-secret <secret> \
--domain example.com
# Enable write operations
bigquery mcp http --enable-writes
# Server provides:
# - HTTP endpoint for MCP protocol
# - JSON-RPC over HTTP
# - Remote access to BigQuery tools
MCP Configuration
Configure in Claude Code or other MCP-enabled applications:
STDIO Mode (.claude/mcp.json or similar):
{
"mcpServers": {
"bigquery": {
"command": "bigquery",
"args": ["mcp", "stdio"],
"env": {
"GOOGLE_CLOUD_PROJECT": "my-project"
}
}
}
}
STDIO Mode with Writes Enabled:
{
"mcpServers": {
"bigquery": {
"command": "bigquery",
"args": ["mcp", "stdio", "--enable-writes"],
"env": {
"GOOGLE_CLOUD_PROJECT": "my-project"
}
}
}
}
HTTP Mode:
{
"mcpServers": {
"bigquery": {
"url": "http://localhost:8080",
"transport": "http"
}
}
}
Common Workflows
Workflow 1: Exploratory Data Analysis
# 1. Verify authentication
bigquery auth check
# 2. List available datasets
bigquery datasets list
# 3. List tables in dataset
bigquery tables list my-project.analytics
# 4. Check table schema
bigquery tables describe my-project.analytics.events
# 5. Preview data (text format for readability)
bigquery query --format text \
"SELECT * FROM my-project.analytics.events LIMIT 10"
# 6. Get row count
bigquery query "SELECT COUNT(*) as total FROM my-project.analytics.events"
# 7. Check data distribution
bigquery query --format text "
SELECT
DATE(timestamp) as date,
COUNT(*) as events
FROM my-project.analytics.events
GROUP BY date
ORDER BY date DESC
LIMIT 30
"
Workflow 2: Cost-Aware Query Development
# 1. Dry run to estimate cost
bigquery dry-run "
SELECT *
FROM my-project.large_dataset.table
WHERE date >= '2025-01-01'
"
# 2. If cost is acceptable, run query
bigquery query "
SELECT *
FROM my-project.large_dataset.table
WHERE date >= '2025-01-01'
"
# 3. For automation, skip confirmation
bigquery query --yes "
SELECT *
FROM my-project.large_dataset.table
WHERE date >= '2025-01-01'
" > results.json
Workflow 3: Data Loading Pipeline
# 1. Load initial data
bigquery tables load my-project.dataset.events \
gs://bucket/events-2025-01-01.csv \
--format csv \
--write-disposition WRITE_TRUNCATE
# 2. Append incremental data
bigquery tables load my-project.dataset.events \
gs://bucket/events-2025-01-02.csv \
--format csv \
--write-disposition WRITE_APPEND
# 3. Verify data loaded
bigquery query "
SELECT
DATE(timestamp) as date,
COUNT(*) as count
FROM my-project.dataset.events
GROUP BY date
ORDER BY date
"
Workflow 4: Real-Time Data Insertion
# 1. Stream JSONL from application
my-app --output jsonl | bigquery tables insert my-project.dataset.events -
# 2. Stream with transformation and error handling
cat raw_events.json | jq -c '.events[]' | \
bigquery tables insert my-project.dataset.events - \
--skip-invalid-rows \
--ignore-unknown-values
Workflow 5: Streaming Large Results
# 1. Stream query results to JSONL file
bigquery query "SELECT * FROM my-project.dataset.large_table" \
--stream > results.jsonl
# 2. Process with DuckDB
duckdb -c "SELECT user_id, COUNT(*) FROM read_json_auto('results.jsonl') GROUP BY user_id"
# 3. Or process line-by-line
cat results.jsonl | while IFS= read -r line; do
echo "$line" | jq '.field'
done
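Streamed JSONL can also be consumed incrementally in Python, one line at a time and in constant memory, instead of shelling out to `jq`. A sketch, assuming a `user_id` field like the DuckDB example above (the `count_by` helper is illustrative):

```python
import json
from collections import Counter
from typing import Iterable

def count_by(lines: Iterable[str], field: str) -> Counter:
    """Aggregate streamed JSONL one line at a time (constant memory)."""
    counts = Counter()
    for line in lines:
        if line.strip():  # skip blank lines defensively
            counts[json.loads(line).get(field)] += 1
    return counts

# Works on any line source: sys.stdin, or open("results.jsonl")
sample = ['{"user_id":"a"}', '{"user_id":"a"}', '{"user_id":"b"}']
print(count_by(sample, "user_id"))  # Counter({'a': 2, 'b': 1})
```

Because the function takes any iterable of lines, the same code handles `bigquery query ... --stream | python count.py` and a saved `results.jsonl` file.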
Best Practices
Query Development
- Always dry-run first: Use `bigquery dry-run` to estimate costs
- Validate before running: Check syntax and cost before execution
- Use text format for exploration: `--format text` for human-readable tables
- Use JSON for automation: `--format json` for machine processing
- Use JSONL for streaming: `--jsonl` or `--stream` for large results
- Skip confirmations in scripts: Use the `--yes` flag for automation
Cost Management
- Dry run expensive queries: Always estimate with `bigquery dry-run`
- Monitor bytes processed: Check query cost estimates before running
- Use partition pruning: Filter on partitioned columns in WHERE clauses
- Limit result sets: Use LIMIT for exploratory queries
- Select only needed columns: `SELECT col1, col2`, not `SELECT *`
- Set cost thresholds: Use the `BIGQUERY_COST_THRESHOLD_GB` environment variable
Authentication
- Check auth first: Run `bigquery auth check` before operations
- Use service accounts: For automation and CI/CD
- Verify scopes: Ensure all required BigQuery scopes are granted
- Re-authenticate when needed: If the check fails
Data Loading
- Choose the right method:
  - Use `insert` for <1000 rows (streaming insert API, immediate availability)
  - Use `load` for >10MB files or >1000 rows (load jobs with Cloud Storage)
- Use JSONL for streaming: Newline-delimited JSON is ideal for streaming pipelines
- Stream from stdin: Use `-` as the file argument to pipe data from applications
- Handle bad records: Use `--skip-invalid-rows` for messy data
- Choose write disposition: `WRITE_TRUNCATE` for full refresh, `WRITE_APPEND` for incremental
- Use appropriate formats: CSV for simple data, JSON/JSONL for complex, Parquet for large datasets
Output Format Selection
- JSON (`--format json`): Default, full metadata, parse with `jq`
- Text (`--format text`): Human-readable, terminal inspection
- JSONL (`--jsonl` or `--stream`): Streaming, pipelines, DuckDB ingestion
Examples:
# Parse with jq
bigquery query "SELECT * FROM table" | jq '.rows[] | select(.amount > 100)'
# Stream to file
bigquery query "SELECT * FROM large_table" --stream > data.jsonl
# Load into DuckDB
bigquery query "SELECT * FROM table" --jsonl | \
duckdb -c "SELECT * FROM read_json_auto('/dev/stdin')"
MCP Server
- Use stdio for local: Prefer stdio mode for local MCP clients
- Use HTTP for remote: Use HTTP mode for networked deployments
- Read-only by default: Only enable writes when needed (`--enable-writes`)
- Secure HTTP endpoints: Put the HTTP server behind authentication/firewall
- Monitor server logs: Check for errors and performance issues
Configuration
Environment Variables
# Set default project
export GOOGLE_CLOUD_PROJECT=my-project
# Set cost threshold (in GB, default: 1)
export BIGQUERY_COST_THRESHOLD_GB=5.0
# Set credentials (for service accounts)
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
# Add to ~/.zshrc or ~/.bashrc for persistence
echo 'export GOOGLE_CLOUD_PROJECT=my-project' >> ~/.zshrc
echo 'export BIGQUERY_COST_THRESHOLD_GB=5.0' >> ~/.zshrc
Authentication Methods
User Credentials (interactive):
gcloud auth application-default login
bigquery auth check
Service Account (automation):
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/sa-key.json
bigquery auth check
Troubleshooting
Issue: "Not authenticated" or "Permission denied"
Solution: Check authentication and scopes
# Check current auth status
bigquery auth check
# Re-authenticate if needed
gcloud auth application-default login
# Verify gcloud is set to correct project
gcloud config get-value project
# Set project if needed
gcloud config set project my-project
Issue: "Table not found"
Solution: Use fully qualified table names
# Wrong - missing project/dataset
bigquery query "SELECT * FROM table"
# Correct - fully qualified
bigquery query "SELECT * FROM my-project.my-dataset.my-table"
# Or quote identifiers with backticks (needed for hyphenated names or reserved words)
bigquery query "SELECT * FROM \`my-project.my-dataset.my-table\`"
Issue: "Query too expensive"
Solution: Check cost with dry-run and optimize
# Check estimated cost
bigquery dry-run "SELECT * FROM large_table WHERE date >= '2025-01-01'"
# Optimize with partition filters
bigquery dry-run "
SELECT * FROM large_table
WHERE _PARTITIONDATE = '2025-01-15'
"
Quick Reference
# Authentication
bigquery auth check # Check auth status
# Queries (default: JSON output)
bigquery query "SELECT ..." # Execute query (JSON)
bigquery query --yes "SELECT ..." # Skip confirmation
bigquery query --format text "SELECT ..." # Human-readable table
bigquery query --jsonl "SELECT ..." # JSONL output
bigquery query --stream "SELECT ..." # Stream large results
bigquery dry-run "SELECT ..." # Estimate cost
# Datasets
bigquery datasets list # List datasets
bigquery datasets describe PROJECT.DATASET # Describe dataset
bigquery datasets create PROJECT.DATASET # Create dataset
bigquery datasets update PROJECT.DATASET --description "..." # Update dataset
bigquery datasets delete PROJECT.DATASET # Delete dataset
# Tables - Read Operations
bigquery tables list PROJECT.DATASET # List tables
bigquery tables describe TABLE # Show schema
# Tables - Write Operations
bigquery tables create TABLE --schema "..." # Create table
bigquery tables insert TABLE file.jsonl --format jsonl # Insert from file
cat data.jsonl | bigquery tables insert TABLE - # Stream insert
bigquery tables load TABLE gs://bucket/file.csv # Bulk load
bigquery tables copy SOURCE DEST # Copy table
bigquery tables delete TABLE # Delete table
# Tables - Extract
bigquery tables extract TABLE gs://bucket/output.csv # Export to GCS
# Jobs
bigquery jobs list # List jobs
bigquery jobs get JOB_ID # Job details
bigquery jobs cancel JOB_ID # Cancel job
# MCP Server
bigquery mcp stdio # MCP server (stdio, read-only)
bigquery mcp stdio --enable-writes # MCP server (stdio, with writes)
bigquery mcp http --port 3000 # MCP server (HTTP)
Summary
Primary commands:
- `bigquery auth check` - Authentication management
- `bigquery query` - Execute SQL with automatic cost awareness
- `bigquery dry-run` - Estimate query costs ($0 to run)
- `bigquery datasets {list,describe,create,update,delete}` - Dataset operations
- `bigquery tables {list,describe,insert,load,extract,create,update,delete,copy}` - Table operations
- `bigquery jobs {list,get,cancel}` - Job management
- `bigquery mcp {stdio,http}` - MCP server modes
Key features:
- Cost-aware query execution with automatic dry-run and confirmation prompts
- Configurable cost thresholds (`BIGQUERY_COST_THRESHOLD_GB`)
- Streaming insert API for real-time data (<1000 rows)
- Bulk load from Cloud Storage (>10MB files)
- Native streaming support for large results (`--stream` flag outputs JSONL)
- Built-in MCP server for AI integration (stdio and HTTP modes)
- Three output formats: JSON (default, full metadata), JSONL (streaming), text (human-readable)
- Instant startup with TypeScript/Bun (~10ms vs ~500ms for the Python `bq` CLI)