xargs-parallel
xargs and Parallel Execution
xargs Basics
Read from stdin and pass as arguments to a command:
# Basic usage: pass stdin lines as arguments
echo "file1.txt file2.txt" | xargs rm
# -I {} sets a placeholder for each input line
cat urls.txt | xargs -I {} curl -O {}
# -n controls how many arguments per command invocation
echo "a b c d e f" | xargs -n 2 echo
# Output:
# a b
# c d
# e f
# -t prints each command before executing (trace mode)
ls *.log | xargs -t rm
# Read arguments from a file
xargs -a filelist.txt rm
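A quick self-contained sketch of how -I behaves: each input line becomes one item, and the placeholder may appear multiple times in the command (alpha/beta are throwaway example inputs):

```shell
# with -I, the whole line is the item and {} can repeat in the command
printf 'alpha\nbeta\n' | xargs -I {} echo "item={} again={}"
# item=alpha again=alpha
# item=beta again=beta
```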
xargs with find
Always use -print0 / -0 to handle filenames with spaces and special characters:
# Safe deletion of files matching a pattern
find . -name "*.tmp" -print0 | xargs -0 rm -f
# Count lines in all Python files
find . -name "*.py" -print0 | xargs -0 wc -l
# Change permissions on specific files
find /var/log -name "*.log" -print0 | xargs -0 chmod 644
# Grep across files found by find
find src/ -name "*.js" -print0 | xargs -0 grep -l "TODO"
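To see the -print0/-0 pairing work end to end, here is a disposable sketch in a temp directory (filenames are made up for the demo):

```shell
# only the *.tmp files are removed, even the one with a space in its name
tmp=$(mktemp -d)
touch "$tmp/a b.tmp" "$tmp/c.tmp" "$tmp/keep.txt"
find "$tmp" -name '*.tmp' -print0 | xargs -0 rm -f
ls "$tmp"   # keep.txt survives
rm -rf "$tmp"
```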
Parallel Execution with xargs -P
-P N runs up to N processes in parallel:
# Compress files using 4 parallel jobs
find . -name "*.log" -print0 | xargs -0 -P 4 -I {} gzip {}
# Download URLs in parallel (8 at a time)
cat urls.txt | xargs -P 8 -I {} curl -sO {}
# Convert images in parallel (ImageMagick). Paths from find start with "./",
# so build the output name with a suffix instead of a "resized_" prefix:
find . -name "*.png" -print0 | xargs -0 -P 4 -I {} bash -c 'convert "$1" -resize 50% "${1%.png}_small.png"' _ {}
# Use all available cores
find . -name "*.gz" -print0 | xargs -0 -P "$(nproc)" gunzip
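A minimal sketch that makes the speedup visible: four one-second sleeps run concurrently under -P 4, so the pipeline finishes in roughly one second instead of four.

```shell
# four jobs, four slots: wall time is about one second, not four
start=$(date +%s)
printf '1\n1\n1\n1\n' | xargs -P 4 -I {} sleep {}
end=$(date +%s)
echo "elapsed: $(( end - start ))s"
```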
GNU parallel Basics
GNU parallel offers more features if installed (brew install parallel / apt install parallel):
# Basic usage (similar to xargs -P)
cat urls.txt | parallel curl -sO {}
# Control job count
find . -name "*.csv" | parallel -j 8 gzip {}
# Progress bar ({.} is the input with its extension stripped, so the output
# lands next to the source file; -nostdin keeps ffmpeg off the job's stdin)
find . -name "*.mp4" | parallel --bar ffmpeg -nostdin -i {} -vf scale=640:-1 {.}_small.mp4
# Retry failed jobs
cat urls.txt | parallel --retries 3 curl -sO {}
# Distribute jobs across multiple machines (SSH)
parallel -S server1,server2 --transferfile {} gzip ::: *.log
# Keep output order matching input order
seq 10 | parallel -k 'sleep $((RANDOM % 3)); echo {}'
Common Patterns
Bulk Rename Files
# Add a prefix
ls *.jpg | xargs -I {} mv {} archive_{}
# Change extension (using parameter expansion in a subshell)
find . -name "*.txt" -print0 | xargs -0 -I {} bash -c 'mv "$1" "${1%.txt}.md"' _ {}
# Lowercase all filenames in current directory
ls | xargs -I {} bash -c 'mv "$1" "$(echo "$1" | tr "A-Z" "a-z")"' _ {}
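The extension-change pattern above can be exercised safely in a temp directory (notes.txt and draft.txt are throwaway files for the demo):

```shell
# rename *.txt to *.md via a bash -c subshell; $1 is each found path
tmp=$(mktemp -d)
touch "$tmp/notes.txt" "$tmp/draft.txt"
find "$tmp" -name '*.txt' -print0 | xargs -0 -I {} bash -c 'mv "$1" "${1%.txt}.md"' _ {}
ls "$tmp"   # notes.md draft.md
rm -rf "$tmp"
```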
Process All Files Matching a Pattern
# Format all Go files
find . -name "*.go" -print0 | xargs -0 gofmt -w
# Lint all JS files
find src/ -name "*.js" -print0 | xargs -0 eslint --fix
# Run a script against each config file
find /etc -name "*.conf" -print0 | xargs -0 -I {} ./validate-config.sh {}
Download a List of URLs
# Download all URLs from a file, 10 in parallel
cat urls.txt | xargs -P 10 -I {} curl -sfLO {}
# Download with wget, retrying failures
cat urls.txt | xargs -P 5 -I {} wget -q --retry-connrefused --tries=3 {}
# With GNU parallel and a progress bar
parallel --bar -j 10 curl -sfLO {} < urls.txt
Run Tests in Parallel
# Run test files in parallel
find tests/ -name "test_*.py" | xargs -P 4 -I {} python -m pytest {} -v
# Run multiple test suites concurrently (one name per line; -n and -I are
# mutually exclusive in GNU xargs, so split the words onto lines first)
echo "unit integration e2e" | tr ' ' '\n' | xargs -P 3 -I {} make test-{}
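The fan-out pattern can be sketched without a real Makefile (echo stands in for make, and the suite names are placeholders):

```shell
# each suite name becomes one parallel invocation
printf '%s\n' unit integration e2e | xargs -P 3 -I {} echo "running test-{}"
```

With -P the three lines may print in any order; that is expected.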
Batch API Calls
# POST each JSON file to an API endpoint
find data/ -name "*.json" -print0 | xargs -0 -P 5 -I {} \
curl -s -X POST -H "Content-Type: application/json" -d @{} https://api.example.com/ingest
# Process user IDs from a file
cat user_ids.txt | xargs -P 10 -I {} \
curl -s "https://api.example.com/users/{}" -o "responses/{}.json"
Parallel Image Compression
# Compress PNGs in parallel with pngquant
find . -name "*.png" -print0 | xargs -0 -P "$(nproc)" -I {} pngquant --force --quality=65-80 {} --output {}
# Resize JPEGs with ImageMagick (basename flattens the output into optimized/,
# since optimized/photos/... would require the subdirectory to already exist)
mkdir -p optimized
find photos/ -name "*.jpg" -print0 | xargs -0 -P 4 -I {} \
  bash -c 'convert "$1" -resize "1920x1080>" -quality 85 "optimized/$(basename "$1")"' _ {}
Bulk Git Operations Across Repos
# Pull latest in all repos under a directory
find ~/projects -maxdepth 2 -name ".git" -type d -print0 | \
xargs -0 -P 8 -I {} git -C "{}/.." pull --ff-only
# Check status of all repos (pass the path as $1 rather than splicing {}
# into the script text, which would break on unusual directory names)
find ~/projects -maxdepth 2 -name ".git" -type d -print0 | \
  xargs -0 -I {} bash -c 'echo "=== $(dirname "$1") ===" && git -C "$1/.." status -s' _ {}
# Garbage collect all repos in parallel
find ~/projects -maxdepth 2 -name ".git" -type d -print0 | \
xargs -0 -P 4 -I {} git -C "{}/.." gc --quiet
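The repo-discovery pattern can be tried without touching real repositories; here empty .git directories stand in for actual clones, and echo stands in for the git command:

```shell
# repo1/repo2 are stand-in directories; $1 carries each .git path safely
tmp=$(mktemp -d)
mkdir -p "$tmp/repo1/.git" "$tmp/repo2/.git"
find "$tmp" -maxdepth 2 -name .git -type d -print0 | \
  xargs -0 -I {} bash -c 'echo "=== $(dirname "$1") ==="' _ {}
rm -rf "$tmp"
```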
Handling Filenames with Spaces
# -0 expects null-delimited input (pair with find -print0)
find . -name "*.txt" -print0 | xargs -0 wc -l
# -d '\n' treats newlines as delimiters (not spaces)
ls | xargs -d '\n' -I {} echo "File: {}"
# On macOS (BSD xargs lacks -d), use -0 with tr
ls | tr '\n' '\0' | xargs -0 -I {} echo "File: {}"
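A small sketch showing why the null delimiter matters: "has space.txt" stays one item instead of splitting into two words (filenames are throwaway examples):

```shell
# -print0/-0 delivers exactly two items, space and all
tmp=$(mktemp -d)
touch "$tmp/has space.txt" "$tmp/plain.txt"
find "$tmp" -name '*.txt' -print0 | xargs -0 -n 1 basename
rm -rf "$tmp"
```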
Dry Run Before Executing
# Preview what would be deleted
find . -name "*.bak" -print0 | xargs -0 echo rm
# Use -p to prompt before each execution
find . -name "*.tmp" -print0 | xargs -0 -p rm
# With -t to trace commands as they run
find . -name "*.log" -print0 | xargs -0 -t gzip
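The echo trick is easy to verify in a sandbox: the command is printed, nothing is deleted, and dropping the echo makes it real (the .bak files are throwaway examples):

```shell
# dry run: prints "rm <files>" but leaves both files in place
tmp=$(mktemp -d)
touch "$tmp/a.bak" "$tmp/b.bak"
find "$tmp" -name '*.bak' -print0 | xargs -0 echo rm
ls "$tmp"   # a.bak b.bak still present
rm -rf "$tmp"
```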
Error Handling
# xargs exits with 123 if any command fails; -n 1 runs each script
# separately (bash a.sh b.sh would run only a.sh, with b.sh as its $1)
find . -name "*.sh" -print0 | xargs -0 -n 1 -P 4 bash # check $?
# GNU parallel: halt on first failure
cat jobs.txt | parallel --halt now,fail=1 process_job {}
# GNU parallel: halt when 20% of jobs fail
cat jobs.txt | parallel --halt soon,fail=20% process_job {}
# Capture per-job exit codes with GNU parallel
cat jobs.txt | parallel --joblog joblog.txt process_job {}
# joblog.txt contains exit status for every job
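The 123 exit status is easy to reproduce: false fails for every input, so xargs reports the aggregate failure.

```shell
# `false` fails on both inputs, so xargs exits with 123
printf 'a\nb\n' | xargs -n 1 false || status=$?
echo "xargs exit status: $status"
# xargs exit status: 123
```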
xargs vs for Loops vs while read
Use xargs when:
- Processing output from find or another command
- You want built-in parallelism (-P)
- Batching multiple arguments per invocation (-n)
Use while read when:
- You need complex logic per iteration (if/else, multiple commands)
- The loop body uses shell variables that must persist across iterations
Use for loops when:
- Iterating over a known, small list of items
- Glob expansion is sufficient (for f in *.txt)
- Readability matters more than performance
# for loop -- simple, readable, no parallelism
for f in *.txt; do wc -l "$f"; done
# while read -- complex logic per item
find . -name "*.csv" | while IFS= read -r f; do
count=$(wc -l < "$f")
[ "$count" -gt 1000 ] && echo "Large: $f ($count lines)"
done
# xargs -- fast, parallel, concise
find . -name "*.csv" -print0 | xargs -0 -P 4 wc -l
Resource-Aware Parallelism
# Use nproc to match available CPU cores
find . -name "*.gz" -print0 | xargs -0 -P "$(nproc)" gunzip
# Use half the cores to leave room for other work
find . -name "*.log" -print0 | xargs -0 -P "$(( $(nproc) / 2 ))" gzip
# GNU parallel: percentage-based, relative, or load-based limits
parallel -j 50% gzip ::: *.log # 50% of cores
parallel -j -2 gzip ::: *.log # cores minus 2
parallel --load 80% process_job ::: * # limit by load average
# Limit concurrency for I/O-bound tasks (network, disk)
cat urls.txt | xargs -P 5 -I {} curl -sO {}
# Monitor parallel job resource usage
parallel --joblog jobs.log -j 4 heavy_task ::: input_* && column -t jobs.log
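One caveat with the half-the-cores arithmetic: on a single-core machine the integer division yields 0, and GNU xargs treats -P 0 as "as many processes as possible". A small clamp avoids that (assumes GNU coreutils nproc):

```shell
# clamp the job count to at least 1 before handing it to -P
jobs=$(( $(nproc) / 2 ))
if [ "$jobs" -lt 1 ]; then jobs=1; fi
echo "using $jobs parallel jobs"
```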