cost-calculator
ML Training Cost Calculator
Purpose: Provide production-ready cost estimation tools for ML training and inference across cloud GPU platforms (Modal, Lambda Labs, RunPod).
Activation Triggers:
- Estimating training costs for ML models
- Comparing GPU platform pricing
- Calculating GPU hours for training jobs
- Budgeting for ML projects
- Optimizing inference costs
- Evaluating cost-effectiveness of different GPU types
- Planning resource allocation
Key Resources:
- scripts/estimate-training-cost.sh - Calculate training costs from model size, dataset size, and GPU type
- scripts/estimate-inference-cost.sh - Estimate inference costs for production workloads
- scripts/calculate-gpu-hours.sh - Convert training parameters to GPU hours
- scripts/compare-platforms.sh - Compare costs across Modal, Lambda, and RunPod
- templates/cost-breakdown.json - Structured cost breakdown template
- templates/platform-pricing.yaml - Up-to-date platform pricing data
- examples/training-cost-estimate.md - Example training cost calculation
- examples/inference-cost-estimate.md - Example inference cost analysis
Platform Pricing Overview
Modal (Serverless - Pay Per Second)
GPU Options:
- T4: $0.000164/sec ($0.59/hr) - Development, small models
- L4: $0.000222/sec ($0.80/hr) - Cost-effective training
- A10: $0.000306/sec ($1.10/hr) - Mid-range training
- A100 40GB: $0.000583/sec ($2.10/hr) - Large model training
- A100 80GB: $0.000694/sec ($2.50/hr) - Very large models
- H100: $0.001097/sec ($3.95/hr) - Cutting-edge training
- H200: $0.001261/sec ($4.54/hr) - Latest generation
- B200: $0.001736/sec ($6.25/hr) - Maximum performance
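The hourly rates above are simply the per-second price times 3,600; a quick sanity check with awk:

```shell
# Derive Modal's hourly rate from its per-second price (A100 40GB as an example).
awk 'BEGIN { printf "A100 40GB: $%.2f/hr\n", 0.000583 * 3600 }'
# → A100 40GB: $2.10/hr
```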
Free Credits:
- Starter: $30/month free
- Startup credits: up to $50,000 for eligible startups
Lambda Labs (On-Demand Hourly)
Single GPU:
- 1x A10: $0.31/hr - Cheapest single-GPU option
- 1x V100 16GB: $0.55/hr - Same per-GPU rate as the cheapest 8x cluster
8x GPU Clusters:
- 8x V100: $4.40/hr ($0.55/GPU) - Most affordable multi-GPU
- 8x A100 40GB: $10.32/hr ($1.29/GPU)
- 8x A100 80GB: $14.32/hr ($1.79/GPU)
- 8x H100: $23.92/hr ($2.99/GPU)
RunPod (Serverless - Pay Per Minute)
Key Features:
- Pay-per-minute billing
- FlashBoot <200ms cold-starts
- Zero egress fees on storage
- 30+ GPU SKUs available
Cost Estimation Scripts
1. Estimate Training Cost
Script: scripts/estimate-training-cost.sh
Usage:
bash scripts/estimate-training-cost.sh \
--model-size 7B \
--dataset-size 10000 \
--epochs 3 \
--gpu t4 \
--platform modal
Parameters:
- --model-size: Model size (125M, 350M, 1B, 3B, 7B, 13B, 70B)
- --dataset-size: Number of training samples
- --epochs: Number of training epochs
- --batch-size: Training batch size (default: auto-calculated)
- --gpu: GPU type (t4, a10, a100-40gb, a100-80gb, h100)
- --platform: Cloud platform (modal, lambda, runpod)
- --peft: Use PEFT/LoRA (yes/no, default: no)
- --mixed-precision: Use FP16/BF16 (yes/no, default: yes)
Output:
{
"model": "7B",
"dataset_size": 10000,
"epochs": 3,
"gpu": "T4",
"platform": "Modal",
"estimated_hours": 4.2,
"cost_breakdown": {
"compute_cost": 2.48,
"storage_cost": 0.05,
"total_cost": 2.53
},
"cost_optimizations": {
"with_peft": 1.26,
"savings_percentage": 50
},
"alternative_platforms": {
"lambda_a10": 1.30,
"runpod_t4": 2.40
}
}
Calculation Methodology:
- Estimates tokens per sample (avg 500 tokens)
- Calculates total training tokens
- Applies throughput rates per GPU type
- Accounts for PEFT (90% memory reduction)
- Accounts for mixed precision (2x speedup)
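As a rough sketch of this methodology, using the 500-tokens/sample assumption and the T4-with-PEFT throughput from the benchmarks below (the script applies further adjustments, so its 4.2-hour estimate differs slightly from this first-order figure):

```shell
awk 'BEGIN {
  samples = 10000; epochs = 3; tokens_per_sample = 500
  throughput = 600              # tokens/sec, T4 with PEFT (from the benchmarks)
  price_per_hour = 0.59         # Modal T4
  tokens = samples * epochs * tokens_per_sample     # 15,000,000 training tokens
  hours  = tokens / throughput / 3600
  printf "hours: %.1f  cost: $%.2f\n", hours, hours * price_per_hour
}'
# → hours: 6.9  cost: $4.10
```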
2. Estimate Inference Cost
Script: scripts/estimate-inference-cost.sh
Usage:
bash scripts/estimate-inference-cost.sh \
--requests-per-day 1000 \
--avg-latency 2 \
--gpu t4 \
--platform modal \
--deployment serverless
Parameters:
- --requests-per-day: Expected daily requests
- --avg-latency: Average inference time (seconds)
- --gpu: GPU type
- --platform: Cloud platform
- --deployment: Deployment type (serverless, dedicated)
- --batch-inference: Batch requests (yes/no, default: no)
Output:
{
"requests_per_day": 1000,
"requests_per_month": 30000,
"avg_latency_sec": 2,
"gpu": "T4",
"platform": "Modal Serverless",
"cost_breakdown": {
"daily_compute_seconds": 2000,
"daily_cost": 0.33,
"monthly_cost": 9.90,
"cost_per_request": 0.00033
},
"scaling_analysis": {
"requests_10k_day": 99.00,
"requests_100k_day": 990.00
},
"dedicated_alternative": {
"monthly_cost": 442.50,
"break_even_requests_day": 44700
}
}
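The serverless numbers above reduce to straightforward arithmetic (the small gap versus the $9.90 monthly figure comes from rounding the daily cost before multiplying by 30):

```shell
awk 'BEGIN {
  requests = 1000; latency = 2          # seconds of GPU time per request
  price_per_sec = 0.000164              # Modal T4
  daily_sec = requests * latency        # 2,000 GPU-seconds/day
  daily = daily_sec * price_per_sec
  printf "daily: $%.2f  monthly: $%.2f  per-request: $%.5f\n", daily, daily * 30, daily / requests
}'
# → daily: $0.33  monthly: $9.84  per-request: $0.00033
```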
3. Calculate GPU Hours
Script: scripts/calculate-gpu-hours.sh
Usage:
bash scripts/calculate-gpu-hours.sh \
--model-params 7B \
--tokens-total 30M \
--gpu a100-40gb
Parameters:
- --model-params: Model parameters (125M, 350M, 1B, 3B, 7B, 13B, 70B)
- --tokens-total: Total training tokens
- --gpu: GPU type
- --peft: Use PEFT (yes/no)
- --multi-gpu: Number of GPUs (default: 1)
GPU Throughput Benchmarks:
T4 (16GB):
- 7B full fine-tune: 150 tokens/sec
- 7B with PEFT: 600 tokens/sec
A100 40GB:
- 7B full fine-tune: 800 tokens/sec
- 7B with PEFT: 3200 tokens/sec
- 13B with PEFT: 1600 tokens/sec
A100 80GB:
- 13B full fine-tune: 600 tokens/sec
- 70B with PEFT: 400 tokens/sec
H100:
- 70B with PEFT: 1200 tokens/sec
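The core conversion is total tokens divided by effective throughput; a minimal sketch using the A100 40GB full fine-tune benchmark above:

```shell
awk 'BEGIN {
  tokens = 30000000             # 30M total training tokens
  throughput = 800              # tokens/sec: A100 40GB, 7B full fine-tune
  gpus = 1                      # assumes near-linear multi-GPU scaling
  printf "GPU hours: %.1f\n", tokens / (throughput * gpus) / 3600
}'
# → GPU hours: 10.4
```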
4. Compare Platforms
Script: scripts/compare-platforms.sh
Usage:
bash scripts/compare-platforms.sh \
--training-hours 4 \
--gpu-type a100-40gb
Output:
# Platform Cost Comparison
## Training Job: 4 hours on A100 40GB
| Platform | GPU Cost | Egress Fees | Total | Notes |
|----------|----------|-------------|-------|-------|
| Modal | $8.40 | $0.00 | $8.40 | Serverless, pay-per-second |
| Lambda | $5.16 | $0.00 | $5.16 | Cheapest for dedicated |
| RunPod | $8.00 | $0.00 | $8.00 | Pay-per-minute |
## Winner: Lambda Labs ($5.16)
Savings: $3.24 (38.6% vs Modal)
Recommendation: Use Lambda for long-running dedicated training, Modal for
serverless/bursty workloads.
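The comparison table follows directly from the hourly rates (RunPod's ~$2.00/hr for an A100 40GB is inferred from the $8.00 total above, not a published figure):

```shell
awk 'BEGIN {
  hours = 4
  modal = 2.10; lambda = 1.29; runpod = 2.00    # $/hr per A100 40GB; runpod inferred
  printf "Modal: $%.2f  Lambda: $%.2f  RunPod: $%.2f\n", hours*modal, hours*lambda, hours*runpod
}'
# → Modal: $8.40  Lambda: $5.16  RunPod: $8.00
```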
Cost Templates
Cost Breakdown Template
Template: templates/cost-breakdown.json
{
"project_name": "ML Training Project",
"cost_estimate": {
"training": {
"model_size": "7B",
"training_runs": 4,
"hours_per_run": 4.2,
"gpu_type": "T4",
"platform": "Modal",
"cost_per_run": 2.48,
"total_training_cost": 9.92
},
"inference": {
"deployment_type": "serverless",
"expected_requests_month": 30000,
"gpu_type": "T4",
"platform": "Modal",
"monthly_cost": 9.90
},
"storage": {
"model_artifacts_gb": 14,
"dataset_storage_gb": 5,
"monthly_storage_cost": 0.50
},
"total_monthly_cost": 20.32,
"breakdown_percentage": {
"training": 48.8,
"inference": 48.7,
"storage": 2.5
}
},
"cost_optimizations_applied": {
"peft_lora": "50% training cost reduction",
"mixed_precision": "2x faster training",
"serverless_inference": "Pay only for actual usage",
"batch_inference": "Up to 10x reduction in inference cost"
},
"potential_savings": {
"without_optimizations": 45.00,
"with_optimizations": 20.32,
"total_savings": 24.68,
"savings_percentage": 54.8
}
}
Platform Pricing Data
Template: templates/platform-pricing.yaml
platforms:
modal:
billing: per-second
free_credits: 30 # USD per month
startup_credits: 50000 # USD for eligible startups
gpus:
t4:
price_per_sec: 0.000164
price_per_hour: 0.59
vram_gb: 16
a100_40gb:
price_per_sec: 0.000583
price_per_hour: 2.10
vram_gb: 40
h100:
price_per_sec: 0.001097
price_per_hour: 3.95
vram_gb: 80
lambda:
billing: per-hour
free_credits: 0
minimum_billing: 1-hour
gpus:
a10_1x:
price_per_hour: 0.31
vram_gb: 24
a100_40gb_1x:
price_per_hour: 1.29
vram_gb: 40
a100_40gb_8x:
price_per_hour: 10.32
total_vram_gb: 320
runpod:
billing: per-minute
free_credits: 0
features:
- zero_egress_fees
- flashboot_200ms
gpus:
t4:
price_per_hour: 0.60 # Approximate
vram_gb: 16
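yq (listed under Dependencies) is the intended way to query this file; purely to illustrate the schema, here is a crude awk lookup over an inlined fragment of it:

```shell
# Find the first price_per_hour after the t4 key (awk stand-in for a yq query).
awk '/t4:/ { f = 1 } f && /price_per_hour:/ { print $2; exit }' <<'EOF'
platforms:
  modal:
    gpus:
      t4:
        price_per_hour: 0.59
EOF
# → 0.59
```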
Cost Estimation Examples
Example 1: Training 7B Model
File: examples/training-cost-estimate.md
Scenario:
- Model: Llama 2 7B fine-tuning
- Dataset: 10,000 samples (5M tokens)
- Epochs: 3
- Total tokens: 15M
- Method: LoRA/PEFT
Cost Calculation:
bash scripts/estimate-training-cost.sh \
--model-size 7B \
--dataset-size 10000 \
--epochs 3 \
--gpu t4 \
--platform modal \
--peft yes
Results:
Training Time: 4.2 hours
Modal T4 Cost: $2.48
Alternative (Lambda A10): $1.30 (47% cheaper)
Optimization Impact:
- Without PEFT: $12.40 (5x more expensive)
- With PEFT: $2.48
- Savings: $9.92 (80%)
Recommendation: Use Lambda A10 for cheapest option, or Modal T4 for serverless convenience.
Example 2: Production Inference
File: examples/inference-cost-estimate.md
Scenario:
- Model: Custom 7B classifier
- Expected traffic: 1,000 requests/day
- Avg latency: 2 seconds per request
- Growth: 10x in 6 months
Cost Calculation:
bash scripts/estimate-inference-cost.sh \
--requests-per-day 1000 \
--avg-latency 2 \
--gpu t4 \
--platform modal \
--deployment serverless
Current (1K requests/day):
Serverless Modal T4:
- Daily cost: $0.33
- Monthly cost: $9.90
- Cost per request: $0.00033
Dedicated Lambda A10:
- Monthly cost: $223 (24/7 instance)
- Break-even: ~22,500 requests/day ($223 ÷ $0.00033 ÷ 30 days)
- Not recommended for current traffic
After Growth (10K requests/day):
Serverless Modal T4:
- Monthly cost: $99.00
- Still cost-effective
Dedicated Lambda A10:
- Monthly cost: $223
- Break-even: ~22,500 requests/day, still well above this traffic level
- Recommendation: Stay serverless; dedicated pays off only past ~20K requests/day
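The break-even point between dedicated and serverless follows from the per-request serverless cost:

```shell
awk 'BEGIN {
  dedicated_monthly = 223.0       # Lambda A10, 24/7 instance
  per_request = 0.00033           # Modal T4 serverless, from the estimate above
  printf "break-even: %.0f requests/day\n", dedicated_monthly / (per_request * 30)
}'
# → break-even: 22525 requests/day
```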
Cost Optimization Strategies
1. Use PEFT/LoRA
Savings: 50-90% training cost reduction
# Calculate savings
bash scripts/estimate-training-cost.sh --model-size 7B --peft no
# Cost: $12.40
bash scripts/estimate-training-cost.sh --model-size 7B --peft yes
# Cost: $2.48
# Savings: $9.92 (80%)
2. Mixed Precision Training
Savings: 2x faster training, 50% cost reduction
Enabled by default in the estimators (--mixed-precision defaults to yes)
3. Platform Selection
Use Case Guidelines:
# Short jobs (<1 hour): Modal serverless
bash scripts/compare-platforms.sh --training-hours 0.5 --gpu-type t4
# Winner: Modal ($0.30 vs Lambda $0.31 minimum)
# Long jobs (4+ hours): Lambda dedicated
bash scripts/compare-platforms.sh --training-hours 4 --gpu-type a100-40gb
# Winner: Lambda ($5.16 vs Modal $8.40)
# Variable workloads: Modal serverless
# Pay only for actual usage, no idle cost
4. Batch Inference
Savings: Up to 10x reduction in inference cost
# Single inference
bash scripts/estimate-inference-cost.sh \
--requests-per-day 1000 \
--avg-latency 2 \
--batch-inference no
# Cost: $9.90/month
# Batch inference (10 requests per batch)
bash scripts/estimate-inference-cost.sh \
--requests-per-day 1000 \
--avg-latency 0.3 \
--batch-inference yes
# Cost: $1.49/month
# Savings: $8.41 (85%)
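The batch savings come entirely from the lower amortized GPU time per request (the $9.90/$1.49 figures above round the daily cost before multiplying):

```shell
awk 'BEGIN {
  price = 0.000164                          # Modal T4, $/sec
  unbatched = 1000 * 2.0 * price * 30       # 2.0 s of GPU time per single request
  batched   = 1000 * 0.3 * price * 30       # 0.3 s amortized per request in batches of 10
  printf "unbatched: $%.2f/mo  batched: $%.2f/mo\n", unbatched, batched
}'
# → unbatched: $9.84/mo  batched: $1.48/mo
```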
Quick Reference: Cost Per Use Case
Small Model Training (< 1B params)
- Best GPU: T4
- Best Platform: Modal (serverless)
- Typical Cost: $0.50-$2.00 per run
- Time: 30 min - 2 hours
Medium Model Training (1B-7B params)
- Best GPU: T4 (with PEFT) or A100 40GB
- Best Platform: Lambda A10 (cheapest) or Modal T4 (convenience)
- Typical Cost: $1.00-$8.00 per run
- Time: 2-8 hours
Large Model Training (7B-70B params)
- Best GPU: A100 80GB or H100 (with PEFT)
- Best Platform: Lambda (dedicated) or Modal (serverless)
- Typical Cost: $10-$100 per run
- Time: 8-48 hours
Low-Traffic Inference (<1K requests/day)
- Best Deployment: Modal serverless
- Best GPU: T4
- Typical Cost: $5-$15/month
High-Traffic Inference (>10K requests/day)
- Best Deployment: Dedicated or batch serverless
- Best GPU: A10 or A100
- Typical Cost: $100-$500/month
Dependencies
Required for scripts:
# Bash 4.0+ (for associative arrays)
bash --version
# jq (for JSON processing)
sudo apt-get install jq
# bc (for floating-point calculations)
sudo apt-get install bc
# yq (for YAML processing)
pip install yq
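A quick pre-flight check that all four tools are present (a hypothetical helper, not part of the shipped scripts):

```shell
# Report which of the required tools are on PATH.
for tool in bash jq bc yq; do
  command -v "$tool" >/dev/null 2>&1 && echo "$tool: ok" || echo "$tool: MISSING"
done
```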
Best Practices Summary
- Always estimate before training - Use cost scripts to avoid surprises
- Use PEFT for large models - 50-90% cost savings
- Enable mixed precision - 2x speedup with no quality loss
- Choose platform based on workload:
- Modal: Serverless, short jobs, variable workloads
- Lambda: Long-running, dedicated, multi-GPU
- RunPod: Per-minute billing flexibility
- Batch inference when possible - Up to 10x cost reduction
- Apply for startup credits - Modal offers up to $50K for eligible startups
- Monitor actual costs - Compare estimates to actuals, optimize
- Use smallest viable GPU - T4 often sufficient with PEFT
Supported Platforms: Modal, Lambda Labs, RunPod
GPU Types: T4, L4, A10, A100 (40GB/80GB), H100, H200, B200
Output Format: JSON cost breakdowns and markdown reports
Version: 1.0.0