fermi-estimation
Fermi Estimation
Produce an answer that is numerically useful, transparent about uncertainty, and persuasive to a skeptical reader. Favor externally checkable facts over intuition, and turn intuition into labeled assumptions only after exhausting accessible evidence.
Core Workflow
- Define the target precisely.
- Gather accessible evidence before inventing assumptions.
- Decompose the target into a small number of drivers.
- Assign low/base/high values to each driver.
- Calculate the estimate and stress-test it.
- Present the conclusion, uncertainty, and biggest drivers.
If the user says to solve by Fermi estimation, do not stop for approval. Work through to a final answer with the best evidence you can access.
When Not To Use It
Do not use Fermi estimation when a current primary-source answer is directly available with a quick lookup.
Prefer exact lookup first when the answer can be retrieved quickly from:
- Official statistics or regulator datasets
- Public filings or company disclosures
- Current pricing, usage, or policy pages
- A user-provided source that directly answers the question
Use Fermi estimation as a fallback or cross-check when direct measurement is unavailable, stale, fragmented, or too slow.
Step 1: Define The Quantity
Lock down these items explicitly in the answer, even if you infer them:
- Geography
- Time window
- Unit of measure
- Inclusion and exclusion rules
If the request is ambiguous, choose the most decision-useful interpretation, state it, and continue.
If unresolved scope ambiguity is likely to change the estimate materially, either:
- present two scoped estimates, or
- ask one critical clarification.
Otherwise, proceed with the interpretation you stated.
Step 2: Evidence First
Use the strongest accessible evidence in this order:
- User-provided files, URLs, and constraints
- Official statistics, regulator data, public filings, company disclosures, product pricing pages
- Industry reports or reputable research summaries
- Stable background facts already known with high confidence
- Heuristic assumptions labeled as assumptions
Do not ask the user for permission to search or proceed. Use available tools and public sources proactively.
For source strategy and common decomposition patterns, read fermi-estimation/references/evidence-patterns.md.
Step 3: Build A Sparse Model
Prefer a model with 3-7 drivers. More factors usually create fake precision.
Good patterns:
- Population x penetration x frequency x price
- Locations x utilization x throughput
- Employees x time per task x tasks per period
- Revenue pool x reachable share x conversion
- Total demand = segment A + segment B + segment C
For each driver, record:
- Name
- Base value
- Low/high range
- Unit
- Why the value is plausible
- Best supporting source or assumption label
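The driver record and a multiplicative pattern such as population x penetration x frequency x price can be sketched as follows. This is a minimal illustration, not the bundled script: the `Driver` class, the `multiply` helper, and every number are hypothetical placeholders, and the sketch assumes all drivers are positive so that multiplying the lows gives the low total.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Driver:
    """One model driver with the fields listed above (hypothetical structure)."""
    name: str
    low: float
    base: float
    high: float
    unit: str
    rationale: str
    source: str  # a citation, or the literal label "assumption"

    def __post_init__(self):
        # Reject ranges that are not ordered low <= base <= high.
        if not (self.low <= self.base <= self.high):
            raise ValueError(f"{self.name}: expected low <= base <= high")


def multiply(drivers):
    """Return (low, base, high) for a purely multiplicative model of positive drivers."""
    return (
        prod(d.low for d in drivers),
        prod(d.base for d in drivers),
        prod(d.high for d in drivers),
    )


# Illustrative values only; none of these figures come from a real source.
drivers = [
    Driver("population", 900_000, 1_000_000, 1_100_000, "people", "placeholder", "assumption"),
    Driver("penetration", 0.10, 0.20, 0.30, "share of people", "placeholder", "assumption"),
    Driver("frequency", 6, 12, 24, "purchases/person/year", "placeholder", "assumption"),
    Driver("price", 4.0, 5.0, 6.0, "USD/purchase", "placeholder", "assumption"),
]

low, base, high = multiply(drivers)
print(f"low={low:,.0f}  base={base:,.0f}  high={high:,.0f} USD/year")
```

Multiplying every low together (and every high together) gives the widest deterministic bounds; it assumes all drivers hit their extremes at once, so treat those totals as outer limits rather than likely outcomes.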
Step 4: Quantify Uncertainty
Always provide a range unless the user explicitly wants a point estimate only.
- base: best estimate
- low: conservative but plausible
- high: optimistic but plausible
Use narrow ranges for sourced facts and wider ranges for inferred behavior. Never hide uncertainty behind a single precise number.
If the model is multiplicative, additive, or nested (for example sum-of-products), and you want consistent arithmetic, use fermi-estimation/scripts/factor_model.py.
Example:
python3 fermi-estimation/scripts/factor_model.py --input factors.json --format markdown
The script accepts either a JSON file or inline JSON. It returns low/base/high totals, validation errors for invalid inputs, rollups, scenario totals, one-at-a-time sensitivity, correlated-group stress tests, concrete sanity checks, and optional Monte Carlo summaries.
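To make "one-at-a-time sensitivity" concrete, here is a small standalone sketch of the general technique for a multiplicative model. It is not the implementation in factor_model.py, whose internals are not shown here; the `one_at_a_time` function, the input shape, and the numbers are all assumptions for illustration.

```python
from math import prod


def one_at_a_time(factors):
    """Swing each factor from low to high while holding the others at base.

    factors: {name: (low, base, high)} for a multiplicative model.
    Returns [(name, total_at_low, total_at_high)], largest swing first.
    """
    base_values = {name: values[1] for name, values in factors.items()}
    rows = []
    for name, (low, _base, high) in factors.items():
        total_at_low = prod({**base_values, name: low}.values())
        total_at_high = prod({**base_values, name: high}.values())
        rows.append((name, total_at_low, total_at_high))
    rows.sort(key=lambda row: abs(row[2] - row[1]), reverse=True)
    return rows


# Illustrative values only (locations x utilization x throughput).
factors = {
    "locations": (80, 100, 120),
    "utilization": (0.4, 0.5, 0.8),
    "throughput_per_day": (180, 200, 220),
}

ranking = one_at_a_time(factors)
for name, lo_total, hi_total in ranking:
    print(f"{name:20s} {lo_total:>10,.0f} .. {hi_total:>10,.0f}")
```

The factor at the top of the ranking is the one whose range most deserves better evidence.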
Optional scenario support:
- Add scenarios.conservative and scenarios.aggressive on a factor when you want scenario values that differ from literal low and high
- Add correlation_group on related factors when they should be stress-tested together
- Add correlation_direction as positive or negative when a driver moves opposite the rest of its group
- Add correlation_strength from 0 to 1 when the group move should be partial rather than full
- Or add correlation: {group, direction, strength} on a group to inherit those settings across many factors
- Add tags on factors and correlation.apply_to on a group when inherited correlation should affect only a subset of drivers
- Add top-level sanity_checks entries when you want the report to include explicit top-down, capacity, budget, or benchmark checks alongside the script's built-in checks
- Add monte_carlo: {enabled, samples, seed, correlated_groups} or pass --samples/--seed when you want simulated percentile output in addition to deterministic bounds
When monte_carlo.correlated_groups is enabled, factors in the same correlation group are sampled with shared group quantiles, adjusted by each factor's direction and strength.
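The idea of shared group quantiles can be illustrated with a short standalone sketch. It shows one plausible reading of the description above, not the script's actual algorithm. The dictionary keys, the piecewise-linear quantile map, and the rule for blending the shared and independent quantiles by strength are all assumptions.

```python
import random
from math import prod


def value_at(q, low, base, high):
    """Assumed quantile map: q=0 -> low, q=0.5 -> base, q=1 -> high."""
    if q <= 0.5:
        return low + (base - low) * (q / 0.5)
    return base + (high - base) * ((q - 0.5) / 0.5)


def simulate(factors, samples=10_000, seed=0):
    """Monte Carlo over a multiplicative model with optional correlation groups."""
    rng = random.Random(seed)
    groups = sorted({f["group"] for f in factors if f.get("group")})
    totals = []
    for _ in range(samples):
        shared = {g: rng.random() for g in groups}  # one quantile per group
        values = []
        for f in factors:
            q = rng.random()  # independent quantile for this factor
            if f.get("group"):
                g = shared[f["group"]]
                if f.get("direction", "positive") == "negative":
                    g = 1.0 - g  # move opposite to the rest of the group
                s = f.get("strength", 1.0)
                q = s * g + (1.0 - s) * q  # assumed blend: s=1 fully shared
            values.append(value_at(q, f["low"], f["base"], f["high"]))
        totals.append(prod(values))
    totals.sort()

    def pick(p):
        return totals[min(samples - 1, int(p * samples))]

    return {"p10": pick(0.10), "p50": pick(0.50), "p90": pick(0.90)}


# Illustrative values only: price moves against volume within one group.
factors = [
    {"name": "price", "low": 4, "base": 5, "high": 6,
     "group": "demand", "direction": "negative", "strength": 1.0},
    {"name": "volume", "low": 800, "base": 1000, "high": 1200,
     "group": "demand", "direction": "positive", "strength": 1.0},
    {"name": "share", "low": 0.1, "base": 0.2, "high": 0.3},
]

summary = simulate(factors, samples=2000, seed=42)
print(summary)
```

A linear blend of two uniform quantiles is not itself uniform for strengths strictly between 0 and 1, so this sketch slightly narrows partially correlated factors; a production implementation would likely handle that differently.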
Step 5: Stress-Test The Result
Before presenting the final answer, run at least two of these checks:
- Bottom-up vs top-down comparison, if both are possible
- Capacity or budget realism check
- Compare against a known benchmark, adjacent market, or public company metric
- Check unit consistency and time consistency
If checks disagree materially, explain the mismatch and either revise the model or present both bounds with the reason for divergence.
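A bottom-up versus top-down comparison can be reduced to a ratio test, sketched below. The `compare_estimates` function and the default threshold of 2x for "material" disagreement are assumptions chosen for illustration; pick a threshold that fits the decision at hand.

```python
def compare_estimates(bottom_up, top_down, tolerance_factor=2.0):
    """Compare two positive estimates of the same quantity in the same units.

    Returns the ratio of the larger to the smaller estimate and whether
    that ratio is within tolerance_factor (an assumed threshold).
    """
    if bottom_up <= 0 or top_down <= 0:
        raise ValueError("both estimates must be positive")
    ratio = max(bottom_up, top_down) / min(bottom_up, top_down)
    return {"ratio": ratio, "agrees": ratio <= tolerance_factor}


# Illustrative values only.
check = compare_estimates(bottom_up=12_000_000, top_down=30_000_000)
print(check)  # a ratio of 2.5 exceeds the assumed 2x tolerance
```

When the check fails, the ratio itself is worth reporting: it tells the reader how far apart the two approaches are before you explain why.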
Step 6: Write The Answer
Use this structure unless the user asks for another format:
- Bottom line
- Model summary
- Key assumptions and evidence
- Calculation table
- Sanity checks
- Sensitivity and confidence
For a concise template, read fermi-estimation/references/report-template.md.
Quality Bar
- Show assumptions explicitly; never bury them
- Distinguish sourced inputs from inferred inputs
- Prefer orders of magnitude that are memorable and decision-useful
- Round final answers to a believable precision level
- State the as-of date when current data matters
- If an input is weak, say it is weak and show its effect on the result
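Rounding to a believable precision usually means keeping one or two significant figures. The helper below is a minimal sketch; the name `round_sig` and the default of two significant figures are assumptions, not part of the skill's scripts.

```python
from math import floor, log10


def round_sig(x, sig=2):
    """Round x to `sig` significant figures (0 is returned unchanged)."""
    if x == 0:
        return 0
    digits = sig - 1 - floor(log10(abs(x)))
    return round(x, digits)


# Illustrative values only.
print(round_sig(12_345_678))   # 12000000
print(round_sig(0.04567))      # 0.046
print(round_sig(47_520_000.0, sig=1))
```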
Failure Modes To Avoid
- Using too many factors with weak justification
- Multiplying percentages and counts with mismatched populations
- Mixing monthly, annual, and daily units
- Adding segments with mismatched currency, geography, or time basis
- Treating one anecdote as a market-wide fact
- Presenting a point estimate without a plausible range
- Asking the user for unnecessary confirmation before concluding
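One way to avoid mixing daily, monthly, and annual units is to convert every rate to a single time basis before combining drivers. The sketch below assumes an annual basis; the `annualize` helper and the period table (365 days, 52 weeks) are simplifying assumptions.

```python
# Assumed conversion table; 52 weeks and 365 days are approximations.
PERIODS_PER_YEAR = {"day": 365, "week": 52, "month": 12, "quarter": 4, "year": 1}


def annualize(value, per):
    """Convert a per-period rate to a per-year rate."""
    try:
        return value * PERIODS_PER_YEAR[per]
    except KeyError:
        raise ValueError(f"unknown period: {per!r}") from None


# Illustrative values only.
visits_per_year = annualize(3, "week")    # 3 visits/week
spend_per_year = annualize(40, "month")   # 40 USD/month
print(visits_per_year, spend_per_year)
```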