xargs-parallel
xargs and Parallel Execution
xargs Basics
Read from stdin and pass as arguments to a command:
# Basic usage: pass stdin lines as arguments
echo "file1.txt file2.txt" | xargs rm
# -I {} sets a placeholder for each input line
cat urls.txt | xargs -I {} curl -O {}
# -n controls how many arguments per command invocation
echo "a b c d e f" | xargs -n 2 echo
# Output:
# a b
# c d
# e f
# -t prints each command before executing (trace mode)
ls *.log | xargs -t rm
# Read arguments from a file
xargs -a filelist.txt rm
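A quick self-contained sketch of how -I behaves: each input line becomes one item, and the placeholder may appear multiple times in the command (alpha/beta are throwaway example inputs):

```shell
# with -I, the whole line is the item and {} can repeat in the command
printf 'alpha\nbeta\n' | xargs -I {} echo "item={} again={}"
# item=alpha again=alpha
# item=beta again=beta
```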
xargs with find
Always use -print0 / -0 to handle filenames with spaces and special characters:
# Safe deletion of files matching a pattern
find . -name "*.tmp" -print0 | xargs -0 rm -f
# Count lines in all Python files
find . -name "*.py" -print0 | xargs -0 wc -l
# Change permissions on specific files
find /var/log -name "*.log" -print0 | xargs -0 chmod 644
# Grep across files found by find
find src/ -name "*.js" -print0 | xargs -0 grep -l "TODO"
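To see the -print0/-0 pairing work end to end, here is a disposable sketch in a temp directory (filenames are made up for the demo):

```shell
# only the *.tmp files are removed, even the one with a space in its name
tmp=$(mktemp -d)
touch "$tmp/a b.tmp" "$tmp/c.tmp" "$tmp/keep.txt"
find "$tmp" -name '*.tmp' -print0 | xargs -0 rm -f
ls "$tmp"   # keep.txt survives
rm -rf "$tmp"
```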
Parallel Execution with xargs -P
-P N runs up to N processes in parallel:
# Compress files using 4 parallel jobs
find . -name "*.log" -print0 | xargs -0 -P 4 -I {} gzip {}
# Download URLs in parallel (8 at a time)
cat urls.txt | xargs -P 8 -I {} curl -sO {}
# Convert images in parallel (ImageMagick). Paths from find start with "./",
# so build the output name with a suffix instead of a "resized_" prefix:
find . -name "*.png" -print0 | xargs -0 -P 4 -I {} bash -c 'convert "$1" -resize 50% "${1%.png}_small.png"' _ {}
# Use all available cores
find . -name "*.gz" -print0 | xargs -0 -P "$(nproc)" gunzip
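A minimal sketch that makes the speedup visible: four one-second sleeps run concurrently under -P 4, so the pipeline finishes in roughly one second instead of four.

```shell
# four jobs, four slots: wall time is about one second, not four
start=$(date +%s)
printf '1\n1\n1\n1\n' | xargs -P 4 -I {} sleep {}
end=$(date +%s)
echo "elapsed: $(( end - start ))s"
```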
GNU parallel Basics
GNU parallel offers more features if installed (brew install parallel / apt install parallel):
# Basic usage (similar to xargs -P)
cat urls.txt | parallel curl -sO {}
# Control job count
find . -name "*.csv" | parallel -j 8 gzip {}
# Progress bar ({.} is the input with its extension stripped, so the output
# lands next to the source file; -nostdin keeps ffmpeg off the job's stdin)
find . -name "*.mp4" | parallel --bar ffmpeg -nostdin -i {} -vf scale=640:-1 {.}_small.mp4
# Retry failed jobs
cat urls.txt | parallel --retries 3 curl -sO {}
# Distribute jobs across multiple machines (SSH)
parallel -S server1,server2 --transferfile {} gzip ::: *.log
# Keep output order matching input order
seq 10 | parallel -k 'sleep $((RANDOM % 3)); echo {}'
Common Patterns
Bulk Rename Files
# Add a prefix
ls *.jpg | xargs -I {} mv {} archive_{}
# Change extension (using parameter expansion in a subshell)
find . -name "*.txt" -print0 | xargs -0 -I {} bash -c 'mv "$1" "${1%.txt}.md"' _ {}
# Lowercase all filenames in current directory
ls | xargs -I {} bash -c 'mv "$1" "$(echo "$1" | tr "A-Z" "a-z")"' _ {}
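The extension-change pattern above can be exercised safely in a temp directory (notes.txt and draft.txt are throwaway files for the demo):

```shell
# rename *.txt to *.md via a bash -c subshell; $1 is each found path
tmp=$(mktemp -d)
touch "$tmp/notes.txt" "$tmp/draft.txt"
find "$tmp" -name '*.txt' -print0 | xargs -0 -I {} bash -c 'mv "$1" "${1%.txt}.md"' _ {}
ls "$tmp"   # notes.md draft.md
rm -rf "$tmp"
```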
Process All Files Matching a Pattern
# Format all Go files
find . -name "*.go" -print0 | xargs -0 gofmt -w
# Lint all JS files
find src/ -name "*.js" -print0 | xargs -0 eslint --fix
# Run a script against each config file
find /etc -name "*.conf" -print0 | xargs -0 -I {} ./validate-config.sh {}
Download a List of URLs
# Download all URLs from a file, 10 in parallel
cat urls.txt | xargs -P 10 -I {} curl -sfLO {}
# Download with wget, retrying failures
cat urls.txt | xargs -P 5 -I {} wget -q --retry-connrefused --tries=3 {}
# With GNU parallel and a progress bar
parallel --bar -j 10 curl -sfLO {} < urls.txt
Run Tests in Parallel
# Run test files in parallel
find tests/ -name "test_*.py" | xargs -P 4 -I {} python -m pytest {} -v
# Run multiple test suites concurrently (one name per line; -n and -I are
# mutually exclusive in GNU xargs, so split the words onto lines first)
echo "unit integration e2e" | tr ' ' '\n' | xargs -P 3 -I {} make test-{}
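The fan-out pattern can be sketched without a real Makefile (echo stands in for make, and the suite names are placeholders):

```shell
# each suite name becomes one parallel invocation
printf '%s\n' unit integration e2e | xargs -P 3 -I {} echo "running test-{}"
```

With -P the three lines may print in any order; that is expected.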
Batch API Calls
# POST each JSON file to an API endpoint
find data/ -name "*.json" -print0 | xargs -0 -P 5 -I {} \
curl -s -X POST -H "Content-Type: application/json" -d @{} https://api.example.com/ingest
# Process user IDs from a file
cat user_ids.txt | xargs -P 10 -I {} \
curl -s "https://api.example.com/users/{}" -o "responses/{}.json"
Parallel Image Compression
# Compress PNGs in parallel with pngquant
find . -name "*.png" -print0 | xargs -0 -P "$(nproc)" -I {} pngquant --force --quality=65-80 {} --output {}
# Resize JPEGs with ImageMagick (basename flattens the output into optimized/,
# since optimized/photos/... would require the subdirectory to already exist)
mkdir -p optimized
find photos/ -name "*.jpg" -print0 | xargs -0 -P 4 -I {} \
  bash -c 'convert "$1" -resize "1920x1080>" -quality 85 "optimized/$(basename "$1")"' _ {}
Bulk Git Operations Across Repos
# Pull latest in all repos under a directory
find ~/projects -maxdepth 2 -name ".git" -type d -print0 | \
xargs -0 -P 8 -I {} git -C "{}/.." pull --ff-only
# Check status of all repos (pass the path as $1 rather than splicing {}
# into the script text, which would break on unusual directory names)
find ~/projects -maxdepth 2 -name ".git" -type d -print0 | \
  xargs -0 -I {} bash -c 'echo "=== $(dirname "$1") ===" && git -C "$1/.." status -s' _ {}
# Garbage collect all repos in parallel
find ~/projects -maxdepth 2 -name ".git" -type d -print0 | \
xargs -0 -P 4 -I {} git -C "{}/.." gc --quiet
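The repo-discovery pattern can be tried without touching real repositories; here empty .git directories stand in for actual clones, and echo stands in for the git command:

```shell
# repo1/repo2 are stand-in directories; $1 carries each .git path safely
tmp=$(mktemp -d)
mkdir -p "$tmp/repo1/.git" "$tmp/repo2/.git"
find "$tmp" -maxdepth 2 -name .git -type d -print0 | \
  xargs -0 -I {} bash -c 'echo "=== $(dirname "$1") ==="' _ {}
rm -rf "$tmp"
```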
Handling Filenames with Spaces
# -0 expects null-delimited input (pair with find -print0)
find . -name "*.txt" -print0 | xargs -0 wc -l
# -d '\n' treats newlines as delimiters (not spaces)
ls | xargs -d '\n' -I {} echo "File: {}"
# On macOS (BSD xargs lacks -d), use -0 with tr
ls | tr '\n' '\0' | xargs -0 -I {} echo "File: {}"
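A small sketch showing why the null delimiter matters: "has space.txt" stays one item instead of splitting into two words (filenames are throwaway examples):

```shell
# -print0/-0 delivers exactly two items, space and all
tmp=$(mktemp -d)
touch "$tmp/has space.txt" "$tmp/plain.txt"
find "$tmp" -name '*.txt' -print0 | xargs -0 -n 1 basename
rm -rf "$tmp"
```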
Dry Run Before Executing
# Preview what would be deleted
find . -name "*.bak" -print0 | xargs -0 echo rm
# Use -p to prompt before each execution
find . -name "*.tmp" -print0 | xargs -0 -p rm
# With -t to trace commands as they run
find . -name "*.log" -print0 | xargs -0 -t gzip
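The echo trick is easy to verify in a sandbox: the command is printed, nothing is deleted, and dropping the echo makes it real (the .bak files are throwaway examples):

```shell
# dry run: prints "rm <files>" but leaves both files in place
tmp=$(mktemp -d)
touch "$tmp/a.bak" "$tmp/b.bak"
find "$tmp" -name '*.bak' -print0 | xargs -0 echo rm
ls "$tmp"   # a.bak b.bak still present
rm -rf "$tmp"
```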
Error Handling
# xargs exits with 123 if any command fails; -n 1 runs each script
# separately (bash a.sh b.sh would run only a.sh, with b.sh as its $1)
find . -name "*.sh" -print0 | xargs -0 -n 1 -P 4 bash # check $?
# GNU parallel: halt on first failure
cat jobs.txt | parallel --halt now,fail=1 process_job {}
# GNU parallel: halt when 20% of jobs fail
cat jobs.txt | parallel --halt soon,fail=20% process_job {}
# Capture per-job exit codes with GNU parallel
cat jobs.txt | parallel --joblog joblog.txt process_job {}
# joblog.txt contains exit status for every job
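The 123 exit status is easy to reproduce: false fails for every input, so xargs reports the aggregate failure.

```shell
# `false` fails on both inputs, so xargs exits with 123
printf 'a\nb\n' | xargs -n 1 false || status=$?
echo "xargs exit status: $status"
# xargs exit status: 123
```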
xargs vs for Loops vs while read
Use xargs when:
- Processing output from find or another command
- You want built-in parallelism (-P)
- Batching multiple arguments per invocation (-n)
Use while read when:
- You need complex logic per iteration (if/else, multiple commands)
- The loop body uses shell variables that must persist across iterations
Use for loops when:
- Iterating over a known, small list of items
- Glob expansion is sufficient (for f in *.txt)
- Readability matters more than performance
# for loop -- simple, readable, no parallelism
for f in *.txt; do wc -l "$f"; done
# while read -- complex logic per item
find . -name "*.csv" | while IFS= read -r f; do
count=$(wc -l < "$f")
[ "$count" -gt 1000 ] && echo "Large: $f ($count lines)"
done
# xargs -- fast, parallel, concise
find . -name "*.csv" -print0 | xargs -0 -P 4 wc -l
Resource-Aware Parallelism
# Use nproc to match available CPU cores
find . -name "*.gz" -print0 | xargs -0 -P "$(nproc)" gunzip
# Use half the cores to leave room for other work
find . -name "*.log" -print0 | xargs -0 -P "$(( $(nproc) / 2 ))" gzip
# GNU parallel: percentage-based, relative, or load-based limits
parallel -j 50% gzip ::: *.log # 50% of cores
parallel -j -2 gzip ::: *.log # cores minus 2
parallel --load 80% process_job ::: * # limit by load average
# Limit concurrency for I/O-bound tasks (network, disk)
cat urls.txt | xargs -P 5 -I {} curl -sO {}
# Monitor parallel job resource usage
parallel --joblog jobs.log -j 4 heavy_task ::: input_* && column -t jobs.log
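One caveat with the half-the-cores arithmetic: on a single-core machine the integer division yields 0, and GNU xargs treats -P 0 as "as many processes as possible". A small clamp avoids that (assumes GNU coreutils nproc):

```shell
# clamp the job count to at least 1 before handing it to -P
jobs=$(( $(nproc) / 2 ))
if [ "$jobs" -lt 1 ]; then jobs=1; fi
echo "using $jobs parallel jobs"
```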