Data Tools Skill
Critical Rule
NEVER use grep, sed, or awk on JSON, YAML, TOML, XML, or CSV data.
These text-processing tools treat structured data as flat text. They break on multi-line values, nested structures, quoted strings containing delimiters, and field reordering. Use the right tool for the format.
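As a minimal illustration of the nesting problem, the sketch below runs a grep/sed extraction against a small, hypothetical package.json-style file in which the key "version" appears at two levels. The file contents and the helper name naive_versions are assumptions made for this demo; only standard grep, sed, and mktemp are used.

```bash
#!/usr/bin/env bash
# Demo: a line-oriented match cannot tell a top-level key from a nested one.
# The sample data and the function name are hypothetical.

naive_versions() {
  local tmp
  tmp="$(mktemp)"
  cat > "$tmp" <<'EOF'
{
  "name": "demo",
  "version": "1.2.3",
  "engines": {
    "version": "20"
  }
}
EOF
  # Text-based extraction: match any line containing "version", then strip the quotes.
  grep '"version"' "$tmp" | sed 's/.*: "\(.*\)".*/\1/'
  rm -f "$tmp"
}

naive_versions
```

Both 1.2.3 and 20 come back, with nothing to indicate which one is the top-level field. A path-based query such as jq -r '.version' addresses the top-level key explicitly.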
Tool Selection Guide
| Format | Tool | Notes |
|---|---|---|
| JSON | jq | Or gh --jq for GitHub CLI output |
| YAML | yq | jq-like syntax, in-place editing |
| TOML | dasel | Native TOML support |
| XML | dasel | Or xmlstarlet for XPath |
| CSV / TSV | qsv | Fast, memory-efficient, purpose-built |
| Mixed / multiple | dasel | Universal selector, auto-detects format |
Quick decision:
- One format, one file? Use the format-specific tool (jq/yq/qsv).
- Multiple formats or TOML/XML? Use dasel.
- GitHub CLI output? Use the gh --jq flag directly (never pipe to jq).
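The decision guide above can be sketched as a lookup by file extension. pick_tool is a hypothetical helper written for this illustration, not part of any of the tools named here, and it assumes lowercase extensions.

```bash
#!/usr/bin/env bash
# Sketch of the decision guide as a lookup by file extension.
# pick_tool is a hypothetical helper; it only prints a tool name.

pick_tool() {
  case "${1##*.}" in
    json)      echo "jq" ;;
    yaml|yml)  echo "yq" ;;
    csv|tsv)   echo "qsv" ;;
    toml|xml)  echo "dasel" ;;
    *)         echo "dasel" ;;   # mixed or unknown: fall back to the universal selector
  esac
}

pick_tool package.json
pick_tool docker-compose.yml
pick_tool Cargo.toml
pick_tool data.csv
```

A real workflow would usually hard-code the tool per task; the function only makes the mapping in the table explicit.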
Quick Examples
jq -- JSON
```bash
jq -r '.version' package.json
jq '.users[] | select(.role == "admin")' users.json
jq '.version = "2.0.0"' pkg.json > pkg.json.tmp && mv pkg.json.tmp pkg.json
```
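The last command above writes to a temporary file because jq has no in-place flag; redirecting straight back into the input would truncate it before jq reads it. A minimal sketch of that pattern as a reusable function follows. The name set_json_version and the temp-file naming are assumptions for illustration.

```bash
#!/usr/bin/env bash
# set_json_version is a hypothetical wrapper around the temp-file pattern.
# Usage: set_json_version <file> <version>

set_json_version() {
  local file="$1" version="$2"
  local tmp="${file}.tmp.$$"   # same directory, so mv is a rename on one filesystem
  # --arg passes the value as a jq string variable instead of splicing it into the filter.
  if jq --arg v "$version" '.version = $v' "$file" > "$tmp"; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"               # leave the original untouched if jq fails
    return 1
  fi
}
```

Called as set_json_version pkg.json 2.0.0, it replaces the file only when jq exits successfully, so a malformed input does not leave an empty pkg.json behind.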
yq -- YAML
```bash
yq '.services.web.image' docker-compose.yml
yq -i '.jobs.test.strategy.matrix.php-version = ["8.2", "8.3", "8.4"]' .github/workflows/ci.yml
```
dasel -- TOML / XML / Universal
```bash
dasel -f Cargo.toml '.package.version'
dasel put -f config.json -t string -v "localhost" '.database.host'
dasel -f input.json -w yaml
```
qsv -- CSV / TSV
```bash
qsv headers data.csv && qsv stats data.csv --everything | qsv table
qsv search -s status "active" users.csv | qsv select name,email
```
GitHub CLI -- always use --jq
```bash
gh api repos/owner/repo/releases --jq '.[0].tag_name'
gh pr list --json number,title --jq '.[] | "\(.number)\t\(.title)"'
```
Anti-Patterns
```bash
# BAD: grep/sed on JSON (breaks on formatting, nesting, escapes)
grep '"version"' package.json | sed 's/.*: "\(.*\)".*/\1/'
# GOOD:
jq -r '.version' package.json

# BAD: sed on YAML (ignores indentation, multi-line values)
sed -i 's/image: node:.*/image: node:20/' docker-compose.yml
# GOOD:
yq -i '.services.app.image = "node:20"' docker-compose.yml

# BAD: awk on CSV (breaks on quoted fields containing commas)
awk -F',' '{print $2}' data.csv
# GOOD:
qsv select 2 data.csv
```
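To make the CSV anti-pattern concrete, the sketch below feeds the awk command two hypothetical rows, one of which has a quoted name containing a comma. The sample data and the function name naive_second_column are assumptions for this demo.

```bash
#!/usr/bin/env bash
# Demo: splitting on every comma miscounts fields when a quoted value contains one.
# The sample rows and the function name are hypothetical.

naive_second_column() {
  printf '%s\n' \
    'name,city,email' \
    '"Doe,Jane",Berlin,jane@example.com' |
    awk -F',' '{print $2}'
}

naive_second_column
```

The header row yields city as expected, but the data row yields Jane" instead of Berlin, because awk splits inside the quoted name. A CSV-aware tool such as qsv treats "Doe,Jane" as a single field.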
References
| Cookbook | Content |
|---|---|
| jq Cookbook | Extraction, filtering, transformation, GitHub CLI patterns |
| yq Cookbook | YAML editing, GitHub Actions, Docker-Compose, Kubernetes |
| dasel Cookbook | TOML/XML editing, format conversion, universal selector |
| CSV Processing | qsv workflows, joins, stats, large file handling |
External docs: jq manual | yq docs | dasel docs | qsv docs