Design of Experiments (DOE)

Overview

DOE systematically varies process factors to identify their effects on responses. Full factorial tests all combinations; fractional factorial tests a strategic subset. Identifies main effects and interactions. More efficient than one-factor-at-a-time (OFAT) which misses interactions. Uses ANOVA for analysis.

When to Use

Trigger conditions:

Identifying which process factors significantly affect quality/yield
Optimizing process settings for target performance
Screening many factors to find the vital few

When NOT to use:

When the process is not stable (stabilize with SPC first)
For observational data with no ability to manipulate factors

Algorithm

IRON LAW: One-Factor-At-A-Time (OFAT) MISSES Interactions
Changing one factor while holding others fixed cannot detect
interactions (where the effect of A depends on the level of B).
Full factorial or fractional factorial designs test ALL main effects
AND interactions in fewer runs than OFAT. A 2³ factorial (8 runs)
gives more information than 6 OFAT runs at lower cost.

Phase 1: Input Validation

Define: response variable(s), factors (2-7 practical), levels per factor (usually 2 for screening, 3 for optimization), constraints, noise factors. Gate: Factors and levels defined, practical to run all experimental conditions.

Phase 2: Core Algorithm

Screening (many factors): 2^(k-p) fractional factorial. Choose resolution III+ (main effects not confounded with each other).

Optimization (few factors): 2^k full factorial or central composite design (CCD) for response surface.

Generate design matrix (run order, factor level assignments)
Randomize run order (critical for validity)
Execute experiments, record responses
Analyze: ANOVA for factor significance, effect plots, interaction plots
If optimizing: fit response surface model, find optimal settings

Phase 3: Verification

Check: R² of model is adequate, residuals are normally distributed and random. Confirmation runs at predicted optimal settings match prediction. Gate: Model is significant, residuals OK, confirmation runs pass.

Phase 4: Output

Return significant factors, effects, and optimal settings.

Output Format

{
  "significant_factors": [{"factor": "temperature", "effect": 12.5, "p_value": 0.001}, {"factor": "pressure", "effect": -8.2, "p_value": 0.008}],
  "interactions": [{"factors": "temperature×time", "effect": 5.1, "p_value": 0.03}],
  "optimal": {"temperature": 180, "pressure": 50, "time": 30, "predicted_response": 95.2},
  "metadata": {"design": "2^3_full_factorial", "runs": 8, "replicates": 2, "r_squared": 0.94}
}

Examples

Sample I/O

Input: 3 factors (temperature, pressure, time), each at 2 levels, response = yield Expected: 2³ = 8 runs + replicates. ANOVA reveals temperature and temp×pressure interaction are significant.

Edge Cases

Input	Expected	Why
7+ factors	Fractional factorial	Full factorial too expensive (2⁷=128 runs)
Factors with constraints	Constrained design	Some factor combinations may be physically impossible
Non-linear response	CCD or Box-Behnken	2-level designs only fit linear models

Gotchas

Randomization is critical: Without randomization, time-varying factors (operator fatigue, ambient temperature) confound results. ALWAYS randomize run order.
Replication vs repetition: Replication (re-setup and re-run) estimates error. Repetition (multiple measurements from one run) does not. Include true replicates.
Alias structure: Fractional factorials confound some effects. Know which effects are aliased (confounded) before interpreting results.
Center points: Adding center points to a 2-level design detects curvature (non-linearity) at minimal cost. Always include 3-5 center points.
Practical significance vs statistical significance: A factor can be statistically significant (p<0.05) but practically unimportant (tiny effect). Focus on effect SIZE, not just p-values.

References

For fractional factorial design tables, see references/fractional-tables.md
For response surface methodology (RSM), see references/rsm.md

algo-mfg-doe