02 — Methodological Guide

Experimental Design for Industrial Processes

Statistical design of experiments (DOE) is the scientific foundation of reproducible industrial optimization. This guide defines principles, methodologies, and implementation frameworks to apply DOE in Colombian and Latin American production environments, with emphasis on resource efficiency and statistical validity in contexts with limited instrumentation capacity.

40%
Average variability reduction with systematic DOE
3–7×
Typical ROI of DOE projects in manufacturing
Quality target reachable with integrated DOE+DMAIC
01 — Scientific Foundations

Why DOE in Industry?

Statistical Design of Experiments (DOE) is a core scientific methodology to understand, control, and optimize complex production systems. Unlike one-factor-at-a-time empirical tuning (OFAT), DOE enables simultaneous exploration of multiple factors and interactions with proven statistical efficiency.

Montgomery (2017), in Design and Analysis of Experiments, the global reference text in this field, documents that full and fractional factorial designs can identify significant factors with 50% to 80% fewer runs than OFAT approaches, while preserving equivalent statistical power and detecting factor interactions that OFAT systematically misses.

In the Latin American industrial context, where experimental resources are limited and pilot testing costs are significant, DOE efficiency is not just a methodological advantage: it is an operational requirement. Antony (2014) estimates that organizations implementing systematic DOE reduce process-improvement cycles by 40-60%.
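As an illustrative sketch of this efficiency claim (not taken from the guide itself): a 2^k full factorial covers every combination of k two-level factors, and main effects fall out as simple differences of means. The toy response function below is hypothetical.

```python
# Generating a 2^k full factorial design and estimating main effects
# with plain NumPy. The linear toy response is for illustration only.
import itertools
import numpy as np

def full_factorial_2k(k):
    """All 2^k combinations of coded levels -1/+1, one run per row."""
    return np.array(list(itertools.product([-1, 1], repeat=k)))

def main_effects(X, y):
    """Main effect of each factor: mean response at +1 minus mean at -1."""
    return np.array([y[X[:, j] == 1].mean() - y[X[:, j] == -1].mean()
                     for j in range(X.shape[1])])

X = full_factorial_2k(3)                           # 8 runs for 3 factors
y = 10 + 4 * X[:, 0] - 2 * X[:, 1] + 0 * X[:, 2]   # hypothetical response
effects = main_effects(X, y)                       # one effect per factor
```

OFAT would need separate baseline-plus-one-change runs per factor and would still miss interactions; the factorial reuses every run for every effect estimate.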

2^k: factorial designs, 2 levels, k factors
RSM: Response Surface Methodology (Box-Behnken / CCD)
Taguchi: robust designs for manufacturing

[1] Montgomery DC. (2017). Design and Analysis of Experiments (9th ed.). John Wiley & Sons. ISBN: 978-1-119-11347-8.

[2] Antony J. (2014). Design of Experiments for Engineers and Scientists (2nd ed.). Elsevier. doi:10.1016/B978-0-08-099417-8.00001-9

[3] Box GEP, Hunter JS, Hunter WG. (2005). Statistics for Experimenters (2nd ed.). Wiley-Interscience.

02 — Methodological Framework

Taxonomy of Experimental Designs

Selecting the right experimental design depends on research objective, number of factors, available resources, and response variable type. OphirIAn uses a four-level methodological decision tree aligned with ISO 5725 standards for measurement-method accuracy and precision.

| Design | Objective | Factors | Minimum runs | Typical application |
|---|---|---|---|---|
| Screening (Plackett-Burman) | Identify vital factors | 5-20 | N+1 | Initial process exploration |
| Full factorial 2^k | Main effects + interactions | 2-7 | 2^k | Critical-parameter optimization |
| Fractional factorial 2^(k-p) | Efficiency with many factors | 5-15 | 2^(k-p) | Formulation and industrial recipes |
| RSM (CCD / Box-Behnken) | Response surface and optimum | 2-5 | Variable | Maximize/minimize quality KPI |
| Taguchi L-array | Robustness to noise | 3-8 | Orthogonal | Manufacturing quality control |
| D-optimal | Irregular regions and constraints | Variable | Computational | Constrained mixture processes |
For agroindustrial and light-manufacturing processes - the core of the Colombian MSME ecosystem - Box-Behnken and Central Composite Design (CCD) under Response Surface Methodology (RSM) provide the best balance between statistical power, number of runs, and objective-function modeling capacity.
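As a hedged sketch of what a Box-Behnken layout looks like in practice: its points are edge midpoints (two factors at ±1, the rest at 0) plus replicated center runs. Packages such as pyDOE2's `bbdesign` produce an equivalent matrix; the hand-rolled version below is illustrative, not OphirIAn's tooling.

```python
# Generating Box-Behnken design points for k coded factors: every pair of
# factors takes the four (+/-1, +/-1) combinations while all other factors
# sit at 0, then center runs are appended for pure-error estimation.
import itertools
import numpy as np

def box_behnken(k, center_runs=3):
    points = []
    for i, j in itertools.combinations(range(k), 2):
        for a, b in itertools.product([-1, 1], repeat=2):
            row = [0] * k
            row[i], row[j] = a, b
            points.append(row)
    points.extend([[0] * k] * center_runs)   # replicated center points
    return np.array(points, dtype=float)

design = box_behnken(3)   # 12 edge runs + 3 center runs = 15 runs
```

For three factors this gives 15 runs, versus 20 for a full central composite design with the same number of center points, which is one reason the run-count balance favors Box-Behnken in resource-constrained plants.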

[4] Myers RH, Montgomery DC, Anderson-Cook CM. (2016). Response Surface Methodology (4th ed.). Wiley. ISBN: 978-1-118-91601-8.

[5] Plackett RL, Burman JP. (1946). The design of optimum multifactorial experiments. Biometrika, 33(4), 305–325.

[6] ISO 5725-1:2023. Accuracy of measurement methods and results. Geneva: ISO.

03 — Implementation Protocol

DOE Pipeline in Industrial Environments

OphirIAn has developed a five-phase DOE implementation pipeline that combines scientific rigor with the real operational constraints of Colombian industrial plants: variable equipment, limited historical data, and operators without formal statistical training.

01
Problem and Variable Definition
Identify response variable (Y) through process analysis and VOC (Voice of the Customer). Map controllable factors, noise sources, and operational constraints. Validate the measurement system with R&R studies (Repeatability & Reproducibility) under MSA (Measurement System Analysis). Without a reliable measurement system, any DOE will produce misleading conclusions.
Deliverable: MSA Report + Variable Map
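A minimal, simplified sketch of the variance decomposition behind an R&R study, assuming a synthetic parts x operators x repeats dataset (the true variance components below are hypothetical): repeatability is estimated from within-cell scatter and reproducibility from operator-to-operator differences. A real MSA study would follow the AIAG ANOVA method rather than this shortcut.

```python
# Simplified Gauge R&R decomposition on synthetic data:
# data[p, o, r] = reading r of part p by operator o.
import numpy as np

rng = np.random.default_rng(1)
parts = rng.normal(0, 2.0, (10, 1, 1))    # part-to-part variation (sd 2.0)
ops = rng.normal(0, 0.5, (1, 3, 1))       # operator bias (sd 0.5)
noise = rng.normal(0, 0.2, (10, 3, 5))    # repeatability noise (sd 0.2)
data = 50 + parts + ops + noise           # 10 parts x 3 operators x 5 repeats

repeat_var = data.var(axis=2, ddof=1).mean()   # within-cell (repeatability)
op_means = data.mean(axis=(0, 2))              # per-operator averages
reprod_var = op_means.var(ddof=1)              # between-operator (reproducibility)
grr_sd = np.sqrt(repeat_var + reprod_var)      # combined measurement-system sd
```

Comparing `grr_sd` against the part-to-part spread is what decides whether the measurement system is fit for the DOE that follows.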
02
Screening and Pre-Experiment
For processes with more than 5 potential factors, a screening design (Plackett-Burman or Resolution III fractional factorial) is applied to identify the vital few (effect Pareto). This phase reduces the experimental space by 60-80% before formal optimization, preserving critical resources in MSME environments.
Design: Plackett-Burman / 2^(k-p) Res. III
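A hedged sketch of the smallest standard Plackett-Burman design: 8 runs screening up to 7 factors, built from the classical cyclic generator row (pyDOE2's `pbdesign` returns an equivalent matrix). Factor assignments are up to the experimenter; nothing here is OphirIAn-specific.

```python
# 8-run Plackett-Burman screening matrix: 7 cyclic shifts of the standard
# generator row plus a final all-minus run. Columns are pairwise
# orthogonal, so main effects are estimated without correlation.
import numpy as np

def plackett_burman_8():
    g = np.array([1, 1, 1, -1, 1, -1, -1])     # standard N=8 generator
    rows = [np.roll(g, r) for r in range(7)]   # 7 cyclic shifts
    rows.append(-np.ones(7, dtype=int))        # final all-minus run
    return np.array(rows)

X = plackett_burman_8()     # 8 runs x 7 coded factors
# Orthogonality: X.T @ X = 8 * I, the property the screening relies on.
```

Ranking the absolute effect estimates from these 8 runs is what produces the effect Pareto that separates the vital few factors from the trivial many.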
03
Design and Execution of the Main Experiment
Build the optimal design with identified significant factors. Fully randomize run order to control confounding. Add blocks for uncontrollable noise factors (shifts, lots, operators). Add replicates to estimate pure experimental error. Execute with real-time statistical process control (SPC).
Standard: ASTM E2281 / ISO 3534-3
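An illustrative sketch (not the guide's own tooling) of two of the execution safeguards above: splitting a 2^3 factorial across two blocks by confounding the ABC interaction with the block effect, and randomizing run order within each block. Block labels and the seed are hypothetical.

```python
# Blocking a 2^3 factorial over two shifts by confounding ABC, then
# randomizing run order inside each block.
import itertools
import numpy as np

runs = np.array(list(itertools.product([-1, 1], repeat=3)))  # 8 runs, 3 factors
abc = runs.prod(axis=1)            # sign of the ABC interaction per run

block_shift_A = runs[abc == +1]    # ABC confounded with the block effect:
block_shift_B = runs[abc == -1]    # shift-to-shift drift contaminates ABC only

rng = np.random.default_rng(7)     # fixed seed -> reproducible run sheet
plan_A = block_shift_A[rng.permutation(len(block_shift_A))]
plan_B = block_shift_B[rng.permutation(len(block_shift_B))]
```

The price of this split is that the ABC interaction becomes inseparable from the block effect, which is acceptable because three-factor interactions are rarely of practical interest.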
04
Statistical Analysis and Modeling
Use multifactor ANOVA with assumption checks (Shapiro-Wilk normality, Levene homogeneity, DW independence). Build polynomial regression models. Perform cross-validation and residual analysis. For nonlinear behavior: use robust regression (Huber) or Gaussian Process-based models when response space is complex.
Software: R/Python + Minitab/JMP
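A minimal pure-NumPy sketch, on hypothetical data, of the polynomial modeling step: fitting the full second-order model y = b0 + b1*x1 + b2*x2 + b12*x1*x2 + b11*x1^2 + b22*x2^2 by ordinary least squares. The residuals it leaves are the inputs to the assumption checks named above.

```python
# Second-order (RSM-style) polynomial fit via ordinary least squares.
import numpy as np

rng = np.random.default_rng(0)
x1, x2 = rng.uniform(-1, 1, 30), rng.uniform(-1, 1, 30)
y = 5 + 2*x1 - 3*x2 + 1.5*x1*x2 - 4*x1**2 + rng.normal(0, 0.1, 30)

# design matrix for the full quadratic model in coded units
Z = np.column_stack([np.ones_like(x1), x1, x2, x1*x2, x1**2, x2**2])
beta, *_ = np.linalg.lstsq(Z, y, rcond=None)

residuals = y - Z @ beta   # feed these to Shapiro-Wilk / Levene / DW checks
```

In practice the same fit would come from `statsmodels`, R's `lm`, or Minitab's response-surface module; the point is only that the model is linear in its coefficients, so OLS suffices.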
05
Optimization and Verification
Locate optimum using desirability functions (Derringer & Suich, 1980) for multi-response settings. Confirm experimentally with verification runs (minimum n=5). Run sensitivity analysis and 95% confidence intervals around predicted optimum. Standardize and statistically control the optimal operating point.
Deliverable: Operating Protocol + SPC Chart
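A hedged sketch of the Derringer-Suich desirability approach cited above: each response is mapped onto [0, 1] and the composite score is the geometric mean, so any unacceptable response (d = 0) zeroes the whole composite. The targets and responses below are hypothetical.

```python
# Derringer-Suich desirability for multi-response optimization.
import math

def d_maximize(y, low, target, w=1.0):
    """Larger-is-better desirability: 0 at/below `low`, 1 at/above `target`."""
    if y <= low:
        return 0.0
    if y >= target:
        return 1.0
    return ((y - low) / (target - low)) ** w

def overall_desirability(ds):
    """Geometric mean of the individual desirabilities."""
    return math.prod(ds) ** (1.0 / len(ds))

# two hypothetical responses: yield (%) and a quality index
D = overall_desirability([d_maximize(92, 80, 95),
                          d_maximize(0.8, 0.5, 1.0)])
```

Smaller-is-better and target-is-best variants follow the same pattern; the optimizer then searches factor settings that maximize D rather than any single response.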

[7] Derringer G, Suich R. (1980). Simultaneous optimization of several response variables. Journal of Quality Technology, 12(4), 214–219.

[8] ASTM E2281-15. (2015). Standard Practice for Process and Measurement Capability Indices. ASTM International.

[9] ISO 3534-3:2021. Statistics — Vocabulary and symbols — Part 3: Design of experiments. Geneva: ISO.

[10] Automotive Industry Action Group (AIAG). (2010). Measurement Systems Analysis (MSA) Reference Manual (4th ed.).

04 — ML Integration

DOE + Machine Learning: The Hybrid Paradigm

The current methodological frontier in industrial optimization combines classical DOE with machine-learning methods to overcome limitations in high-dimensional spaces, complex nonlinear relationships, and noisy process data. This hybrid paradigm, known as Model-Based Design of Experiments (MBDoE), is central to the OphirIAn approach.

Shahriari et al. (2016), in their Proceedings of the IEEE review of Bayesian optimization, and Snoek et al. (2012) showed that Bayesian optimization with Gaussian Processes outperforms classical DOE in continuous parameter spaces with five or more factors, reducing the evaluations required to reach global optima by up to 70% while simultaneously estimating predictive uncertainty.
Classical DOE Approach
Strengths: Confounding control, efficient main-effect estimation, solid statistical grounding, direct interpretability.

Limitations: Linearity assumptions, combinatorial explosion in high dimensions, discretized response curves.
ML / Bayesian Approach
Strengths: Nonlinearity handling, sequential updates (active learning), uncertainty estimation, high-dimensional capability.

Limitations: Requires more initial data, lower causal interpretability, overfitting risk without regularization.
Integrated OphirIAn Pipeline
Phase 1: DOE screening (Plackett-Burman) -> identify vital factors.
Phase 2: Centered factorial RSM -> build base polynomial model.
Phase 3: Gaussian Process Regression on residuals -> capture nonlinearities.
Phase 4: Bayesian optimization with acquisition function (EI / UCB) -> explore global optimum.
Phase 5: Experimental confirmation + SPC control -> transfer to production.
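Phases 3-4 above can be sketched in a self-contained way: a one-dimensional Gaussian Process posterior with an RBF kernel, and the Expected Improvement (EI) acquisition function that proposes the next run. Kernel hyperparameters, sample points, and the toy response are hypothetical; production code would use a library such as scikit-learn or BoTorch.

```python
# GP posterior (RBF kernel) + Expected Improvement acquisition.
import math
import numpy as np

def rbf(a, b, length=0.3):
    """Squared-exponential kernel between 1-D point sets a and b."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)

def gp_posterior(X, y, Xs, noise=1e-6):
    """Posterior mean and sd at candidate points Xs given data (X, y)."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks, Kss = rbf(X, Xs), rbf(Xs, Xs)
    mu = Ks.T @ np.linalg.solve(K, y)
    var = np.diag(Kss - Ks.T @ np.linalg.solve(K, Ks))
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    """EI for maximization: E[max(f - best, 0)] under the GP posterior."""
    z = (mu - best) / sigma
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + np.vectorize(math.erf)(z / math.sqrt(2)))
    return (mu - best) * cdf + sigma * pdf

X = np.array([0.1, 0.4, 0.9])      # factor settings already run (coded)
y = np.sin(3 * X)                  # toy observed responses
Xs = np.linspace(0, 1, 101)        # candidate settings
mu, sigma = gp_posterior(X, y, Xs)
ei = expected_improvement(mu, sigma, y.max())
next_run = Xs[np.argmax(ei)]       # setting proposed for the next experiment
```

EI trades exploitation (high posterior mean) against exploration (high posterior uncertainty), which is exactly the sequential, active-learning behavior the hybrid pipeline exploits after the polynomial base model is in place.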
Classical DOE is the map; machine learning is the real-time navigator.

[11] Shahriari B et al. (2016). Taking the Human Out of the Loop: A Review of Bayesian Optimization. Proc. IEEE, 104(1), 148–175. doi:10.1109/JPROC.2015.2494218

[12] Snoek J, Larochelle H, Adams RP. (2012). Practical Bayesian Optimization of Machine Learning Algorithms. NeurIPS 2012.

[13] Gregorutti B et al. (2017). Correlation and variable importance in random forests. Stat Comput, 27, 659–678.

[14] Forrester AIJ, Keane AJ. (2009). Recent advances in surrogate-based optimization. Prog Aerosp Sci, 45(1-3), 50–79.

[15] Garud SS et al. (2017). Design of computer experiments: A review. Comput Chem Eng, 106, 71–95.