# Tinker Training Cost Calculator
Calculate training costs for Tinker fine-tuning jobs by tokenizing your dataset with the correct model tokenizer and applying current pricing.
## Quick Start
Use the bundled script to calculate training costs:
```bash
# List available models and pricing
python scripts/calculate_cost.py --list-models

# Calculate cost for a JSONL dataset
python scripts/calculate_cost.py training_data.jsonl --model Qwen3-8B --epochs 3

# Output as JSON
python scripts/calculate_cost.py training_data.jsonl --model Llama-3.1-70B --json
```
The script:
- Loads the correct tokenizer for the selected model
- Counts tokens in your JSONL file (supports chat, text, and instruction formats)
- Calculates the estimated training cost
## Cost Formula
```
Training Cost = (total_tokens × epochs × train_price_per_million) / 1_000_000
```
Where:

- `total_tokens` = tokens in your training dataset (from tokenization)
- `epochs` = number of training passes (default: 3)
- `train_price_per_million` = model-specific training rate from the pricing table
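In code the formula is a one-liner (a minimal sketch; `training_cost` is an illustrative helper, not part of the bundled script):

```python
def training_cost(total_tokens: int, epochs: int, train_price_per_million: float) -> float:
    """Estimated training cost in USD."""
    return total_tokens * epochs * train_price_per_million / 1_000_000

# Qwen3-8B trains at $0.40/M: a 1M-token dataset for 3 epochs -> $1.20
print(training_cost(1_000_000, 3, 0.40))
```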
## Tinker Pricing

All prices as of January 5, 2026. Source: https://thinkingmachines.ai/tinker/

All prices are in USD per million tokens.
| Category | Description |
|---|---|
| Prefill | Processing input context (inference) |
| Sample | Generating output tokens (inference) |
| Train | Training/fine-tuning tokens |
### Qwen Models
| Model | Prefill | Sample | Train |
|---|---|---|---|
| Qwen3-4B-Instruct-2507 | $0.07 | $0.22 | $0.22 |
| Qwen3-8B | $0.13 | $0.40 | $0.40 |
| Qwen3-30B-A3B | $0.12 | $0.30 | $0.36 |
| Qwen3-VL-30B-A3B-Instruct | $0.18 | $0.44 | $0.53 |
| Qwen3-32B | $0.49 | $1.47 | $1.47 |
| Qwen3-235B-Instruct-2507 | $0.68 | $1.70 | $2.04 |
| Qwen3-VL-235B-A22B-Instruct | $1.02 | $2.56 | $3.07 |
### Llama Models
| Model | Prefill | Sample | Train |
|---|---|---|---|
| Llama-3.2-1B | $0.03 | $0.09 | $0.09 |
| Llama-3.2-3B | $0.06 | $0.18 | $0.18 |
| Llama-3.1-8B | $0.13 | $0.40 | $0.40 |
| Llama-3.1-70B | $1.05 | $3.16 | $3.16 |
### DeepSeek Models
| Model | Prefill | Sample | Train |
|---|---|---|---|
| DeepSeek-V3.1 | $1.13 | $2.81 | $3.38 |
### GPT-OSS Models
| Model | Prefill | Sample | Train |
|---|---|---|---|
| GPT-OSS-120B | $0.18 | $0.44 | $0.52 |
| GPT-OSS-20B | $0.12 | $0.30 | $0.36 |
### Moonshot Models
| Model | Prefill | Sample | Train |
|---|---|---|---|
| Kimi-K2-Thinking | $0.98 | $2.44 | $2.93 |
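For scripting against these rates, the Train column can be transcribed into a lookup table (a sketch; `TRAIN_PRICE_PER_MILLION` is an illustrative name, and rates should be re-checked against the pricing page before relying on them):

```python
# USD per million training tokens, transcribed from the tables above
TRAIN_PRICE_PER_MILLION = {
    "Qwen3-4B-Instruct-2507": 0.22,
    "Qwen3-8B": 0.40,
    "Qwen3-30B-A3B": 0.36,
    "Qwen3-VL-30B-A3B-Instruct": 0.53,
    "Qwen3-32B": 1.47,
    "Qwen3-235B-Instruct-2507": 2.04,
    "Qwen3-VL-235B-A22B-Instruct": 3.07,
    "Llama-3.2-1B": 0.09,
    "Llama-3.2-3B": 0.18,
    "Llama-3.1-8B": 0.40,
    "Llama-3.1-70B": 3.16,
    "DeepSeek-V3.1": 3.38,
    "GPT-OSS-120B": 0.52,
    "GPT-OSS-20B": 0.36,
    "Kimi-K2-Thinking": 2.93,
}
```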
## Model-to-Tokenizer Mapping
Use the correct HuggingFace tokenizer for accurate token counting:
| Model | HuggingFace Tokenizer |
|---|---|
| Qwen3-4B-Instruct-2507 | Qwen/Qwen3-4B |
| Qwen3-8B | Qwen/Qwen3-8B |
| Qwen3-30B-A3B | Qwen/Qwen3-30B-A3B |
| Qwen3-32B | Qwen/Qwen3-32B |
| Qwen3-235B-Instruct-2507 | Qwen/Qwen3-235B-A22B-Instruct |
| Qwen3-VL-* | Qwen/Qwen2.5-VL-7B-Instruct (shared VL tokenizer) |
| Llama-3.2-1B | meta-llama/Llama-3.2-1B-Instruct |
| Llama-3.2-3B | meta-llama/Llama-3.2-3B-Instruct |
| Llama-3.1-8B | meta-llama/Llama-3.1-8B-Instruct |
| Llama-3.1-70B | meta-llama/Llama-3.1-70B-Instruct |
| DeepSeek-V3.1 | deepseek-ai/DeepSeek-V3 |
| GPT-OSS-* | Qwen/Qwen3-8B (compatible tokenizer) |
| Kimi-K2-Thinking | moonshotai/Kimi-K2-Instruct |
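The same mapping can be expressed as a dictionary for programmatic lookup (a sketch; `TOKENIZER_FOR_MODEL` is an illustrative name, not part of the bundled script):

```python
# Model -> HuggingFace tokenizer repo, transcribed from the table above
TOKENIZER_FOR_MODEL = {
    "Qwen3-4B-Instruct-2507": "Qwen/Qwen3-4B",
    "Qwen3-8B": "Qwen/Qwen3-8B",
    "Qwen3-30B-A3B": "Qwen/Qwen3-30B-A3B",
    "Qwen3-32B": "Qwen/Qwen3-32B",
    "Qwen3-235B-Instruct-2507": "Qwen/Qwen3-235B-A22B-Instruct",
    "Qwen3-VL-30B-A3B-Instruct": "Qwen/Qwen2.5-VL-7B-Instruct",    # shared VL tokenizer
    "Qwen3-VL-235B-A22B-Instruct": "Qwen/Qwen2.5-VL-7B-Instruct",  # shared VL tokenizer
    "Llama-3.2-1B": "meta-llama/Llama-3.2-1B-Instruct",
    "Llama-3.2-3B": "meta-llama/Llama-3.2-3B-Instruct",
    "Llama-3.1-8B": "meta-llama/Llama-3.1-8B-Instruct",
    "Llama-3.1-70B": "meta-llama/Llama-3.1-70B-Instruct",
    "DeepSeek-V3.1": "deepseek-ai/DeepSeek-V3",
    "GPT-OSS-120B": "Qwen/Qwen3-8B",  # compatible tokenizer
    "GPT-OSS-20B": "Qwen/Qwen3-8B",   # compatible tokenizer
    "Kimi-K2-Thinking": "moonshotai/Kimi-K2-Instruct",
}
```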
## Tokenization
The bundled `scripts/calculate_cost.py` handles tokenization automatically. For custom use:
```python
from transformers import AutoTokenizer

# Load the correct tokenizer for your model
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B", trust_remote_code=True)

# Count tokens
token_count = len(tokenizer.encode("Your training text here"))
```
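For chat-format records, counting tokens on the rendered chat template is closer to what training actually sees, since the template inserts control tokens around each turn (a sketch reusing the tokenizer loaded above; assumes the tokenizer ships a chat template, as the Qwen and Llama instruct tokenizers do):

```python
messages = [
    {"role": "user", "content": "What is LoRA?"},
    {"role": "assistant", "content": "Low-Rank Adaptation is ..."},
]

# Render the conversation through the model's chat template, then count
# token ids; this includes the template's special/control tokens
ids = tokenizer.apply_chat_template(messages, tokenize=True)
print(len(ids))
```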
## Supported JSONL Formats
The script handles these training data formats:
**Chat format (recommended):**

```json
{"messages": [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}
```

**Text format:**

```json
{"text": "Your training text here"}
```

**Instruction format (Alpaca-style):**

```json
{"instruction": "...", "input": "...", "output": "..."}
```
## Quick Cost Examples
**Example 1: Qwen3-8B on 1M tokens, 3 epochs**

```
Dataset tokens:  1,000,000
Training tokens: 1,000,000 × 3 = 3,000,000
Cost:            3.0M × $0.40/M = $1.20
```

**Example 2: Llama-3.1-70B on 5M tokens, 2 epochs**

```
Dataset tokens:  5,000,000
Training tokens: 5,000,000 × 2 = 10,000,000
Cost:            10.0M × $3.16/M = $31.60
```

**Example 3: Qwen3-235B on 2M tokens, 4 epochs**

```
Dataset tokens:  2,000,000
Training tokens: 2,000,000 × 4 = 8,000,000
Cost:            8.0M × $2.04/M = $16.32
```
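The same numbers fall straight out of the cost formula (a quick sanity check in Python, using the rates from the pricing tables above):

```python
# (model, dataset tokens, epochs, train $/M) from the examples above
examples = [
    ("Qwen3-8B",                 1_000_000, 3, 0.40),
    ("Llama-3.1-70B",            5_000_000, 2, 3.16),
    ("Qwen3-235B-Instruct-2507", 2_000_000, 4, 2.04),
]

for model, tokens, epochs, rate in examples:
    print(f"{model}: ${tokens * epochs * rate / 1e6:.2f}")
# Qwen3-8B: $1.20
# Llama-3.1-70B: $31.60
# Qwen3-235B-Instruct-2507: $16.32
```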
## Important Notes
- **LoRA Fine-Tuning**: Tinker uses Low-Rank Adaptation (LoRA), not full fine-tuning
- **Free Credits**: New users receive $150 in free credits
- **Token Counting**: Always use the model's native tokenizer for accurate counts; different tokenizers produce different token counts for the same text
- **Vision Models**: VL models have higher costs due to image processing overhead
- **`trust_remote_code`**: Required for some tokenizers (Qwen, DeepSeek)