Design and Analysis of Experiments with R

John Lawson
Chapman & Hall/CRC
Texts in Statistical Science
Reviewed by Robert W. Hayden

A course in the design of experiments would be a common part of a statistics major. Such a course can be designed to require only an introductory statistics course as a prerequisite, which means it would also be a good candidate for inclusion in a statistics minor. Course content would not be nearly as broad as the title suggests. One would find little reference to experiments discussed in a physics textbook or a history of science. Instead, such a course will focus on experiments in the softer sciences where the effect we are trying to measure may well be drowned out by random noise. You will not learn how to create designs from scratch but rather to recognize fewer than a dozen prefabricated designs that have been found useful in practice. All will use some form of Analysis of Variance (ANOVA) to draw conclusions. This means that you will normally have a single quantitative dependent variable while your independent variables will be categorical. Any quantitative independent variables will be treated as categorical by investigating only a small number of fixed levels. For example, fertilizer could be applied to plots in any amount, but you may only study 0, 100, 200, and 300 units per acre. Conclusions will compare average crop yields at the different levels but not try to determine a mathematical model for the relationships between the variables. Finally, the process of randomly assigning treatments to subjects can get very challenging in complex designs, so a design of experiments course would typically include many prefabricated random assignments.
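To make the fertilizer example concrete, here is a minimal sketch of what a completely randomized design involves: assigning the four fixed levels to plots at random, and computing the one-way ANOVA F statistic that compares average yields across levels. It is written in Python using only the standard library, not the R used in the book, and the function names `randomize_crd` and `one_way_f` are hypothetical names chosen for illustration. The number of replicates (3) is likewise an assumption, and no yield data are supplied because the review gives none.

```python
import random


def randomize_crd(levels, replicates, seed=None):
    """Return the treatment level assigned to each plot, in plot order.

    Each level appears exactly `replicates` times; the order is shuffled
    so that assignment of treatments to plots is random.
    """
    rng = random.Random(seed)
    assignment = [level for level in levels for _ in range(replicates)]
    rng.shuffle(assignment)
    return assignment


def one_way_f(groups):
    """One-way ANOVA F statistic.

    `groups` maps each treatment level to the list of responses
    (for example, crop yields) observed at that level.
    """
    all_values = [y for ys in groups.values() for y in ys]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    means = {level: sum(ys) / len(ys) for level, ys in groups.items()}
    # Variation of the level means around the grand mean.
    ss_between = sum(
        len(ys) * (means[level] - grand_mean) ** 2
        for level, ys in groups.items()
    )
    # Variation of individual responses around their own level mean.
    ss_within = sum(
        (y - means[level]) ** 2 for level, ys in groups.items() for y in ys
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Fertilizer levels from the example above; 3 replicates is an assumption.
levels = [0, 100, 200, 300]
plan = randomize_crd(levels, replicates=3, seed=1)
for plot, level in enumerate(plan, start=1):
    print(f"plot {plot:2d}: {level} units per acre")
```

Once yields have been recorded for each plot, grouping them by level and passing them to `one_way_f` gives the statistic that an ANOVA table would report. Note that the levels are treated purely as labels here, which is exactly the sense in which a quantitative variable is "treated as categorical."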

So how does the book at hand fit into that classic mold? The title gives one clue. The author uses the statistical programming language R to analyze all the designs discussed. While using some software has been standard for decades, this book goes a step farther than most such textbooks in doing away with all the pre-software material. Hence all the algebra and arithmetic involved in cranking out the numbers is gone. So too are prefabricated randomizations; those too are handled by R.

The principal arguments for using R as the computing environment are that it is free of cost, extremely powerful, and becoming a de facto standard in statistical computing. The principal disadvantage of R is that it is a programming language, not an application with a menu item for every design. In the case of this book, an additional point in favor of R is that the author constantly uses R to extend analyses in ways that would be difficult in any menu-driven software package. (An unintended consequence might be a reader trying things until getting a “significant” result, against which the author might have issued more warnings.)

In addition to using demanding software, this book has demanding prerequisites. The author suggests an introductory statistics course as well as courses in calculus, mathematical statistics, and linear statistical models, plus familiarity with the language of matrices and working with a command-line interface. That probably puts this text near the top in terms of level, though a reader interested primarily in applying the methods could probably get by with a statistics background that included ANOVA and enough courage to face a command line unarmed.

Less central differences between this book and others include a bias toward industrial applications rather than the usual bias toward agriculture or the social sciences. There are non-standard chapters on the uses of experimental design introduced to quality control by Taguchi, experiments to determine sources of variability (which also has applications in quality control), response surface studies often used for optimization, mixture designs used where the independent variables are components in a mixture (think ingredients in a product), and strategies for using successive experiments to gradually home in on desired results. The chapters on standard topics include a wide variety of applications, however, so this book should not be viewed as strictly an “experimental design for engineers” text, though it could serve that purpose.

Assuming the approach described above fits one’s needs, the author does an excellent job of implementing it. Explanations are clear and to the point. Analyses go into greater depth than in most textbooks and are more consistent with what is needed in practice. Along the way the author provides many excellent insights into the methods and their application, which is again all too rare. On the surface, the exercises appear to be at a much lower level than the text. Often the student is asked to run analyses similar to those in the text, which R makes easy. However, the student is also often asked to think carefully about designing a study or interpreting the results of a study, both non-trivial tasks. Exercises are often based on real studies, and these are drawn from a variety of disciplines. An interesting feature is flowchart-like diagrams that help the reader match a situation to a design. These are well done and useful, though your reviewer is not nearly as optimistic as the author that the book includes a design for every situation.

This is an excellent but demanding text. On the other hand, the more modest goal of simply helping students commit statistical mayhem with the aid of a computer is not always preferable. This book should be mandatory reading for anyone teaching a course in the statistical design of experiments. They may then decide whether it will work for them as a student text. Even if not, reading this text is likely to influence their course for the better.

After a few years in industry, Robert W. Hayden taught mathematics at colleges and universities for 32 years and statistics for 20 years. In 2005 he retired from full-time classroom work. He now teaches statistics online and does summer workshops for high school teachers of Advanced Placement Statistics. He contributed the chapter on evaluating introductory statistics textbooks to the MAA's Teaching Statistics.

Statistics and Data Collection
Beginnings of Statistically Planned Experiments
Definitions and Preliminaries
Purposes of Experimental Design
Types of Experimental Designs
Planning Experiments
Performing the Experiments
Use of R Software

Completely Randomized Designs with One Factor
Replication and Randomization
A Historical Example
Linear Model for Completely Randomized Design (CRD)
Verifying Assumptions of the Linear Model
Analysis Strategies When Assumptions Are Violated
Determining the Number of Replicates
Comparison of Treatments after the F-Test

Factorial Designs
Classical One at a Time versus Factorial Plans
Interpreting Interactions
Creating a Two-Factor Factorial Plan in R
Analysis of a Two-Factor Factorial in R
Factorial Designs with Multiple Factors—Completely Randomized Factorial Design (CRFD)
Two-Level Factorials
Verifying Assumptions of the Model

Randomized Block Designs
Creating a Randomized Complete Block (RCB) Design in R
Model for RCB
An Example of a RCB
Determining the Number of Blocks
Factorial Designs in Blocks
Generalized Complete Block Design
Two Block Factors Latin Square Design (LSD)

Designs to Study Variances
Random Sampling Experiments (RSE)
One-Factor Sampling Designs
Estimating Variance Components
Two-Factor Sampling Designs—Factorial RSE
Nested SE
Staggered Nested SE
Designs with Fixed and Random Factors
Graphical Methods to Check Model Assumptions

Fractional Factorial Designs
Introduction to Completely Randomized Fractional Factorial (CRFF)
Half Fractions of 2^k Designs
Quarter and Higher Fractions of 2^k Designs
Criteria for Choosing Generators for 2^(k-p) Designs
Augmenting Fractional Factorials
Plackett–Burman (PB) Screening Designs
Mixed-Level Fractional Factorials Orthogonal Array (OA)
Definitive Screening Designs

Incomplete and Confounded Block Designs
Balanced Incomplete Block (BIB) Designs
Analysis of Incomplete Block Designs
Partially Balanced Incomplete Block (PBIB) Designs—Balanced Treatment Incomplete Block (BTIB)
Row Column Designs
Confounded 2^k and 2^(k-p) Designs
Confounding 3 Level and p Level Factorial Designs
Blocking Mixed-Level Factorials and OAs
Partially Confounded Blocked Factorial (PCBF)

Split-Plot Designs
Split-Plot Experiments with CRD in Whole Plots (CRSP)
RCB in Whole Plots (RBSP)
Analysis Unreplicated 2^k Split-Plot Designs
2^(k-p) Fractional Factorials in Split Plots (FFSP)
Sample Size and Power Issues for Split-Plot Designs

Crossover and Repeated Measures Designs
Crossover Designs (COD)
Simple AB, BA Crossover Designs for Two Treatments
Crossover Designs for Multiple Treatments
Repeated Measures Designs
Univariate Analysis of Repeated Measures Design

Response Surface Designs
Fundamentals of Response Surface Methodology
Standard Designs for Second-Order Models
Creating Standard Response Surface Designs in R
Non-Standard Response Surface Designs
Fitting the Response Surface Model with R
Determining Optimum Operating Conditions
Blocked Response Surface (BRS) Designs 
Response Surface Split-Plot (RSSP) Designs

Mixture Experiments
Models and Designs for Mixture Experiments
Creating Mixture Designs in R
Analysis of Mixture Experiment
Constrained Mixture Experiments
Blocking Mixture Experiments
Mixture Experiments with Process Variables
Mixture Experiments in Split-Plot Arrangements

Robust Parameter Design Experiments
Noise Sources of Functional Variation
Product Array Parameter Design Experiments
Analysis of Product Array Experiments
Single Array Parameter Design Experiments
Joint Modeling of Mean and Dispersion Effects

Experimental Strategies for Increasing Knowledge
Sequential Experimentation
One-Step Screening and Optimization
An Example of Sequential Experimentation
Evolutionary Operation
Concluding Remarks

Appendix: Brief Introduction to R

Answers to Selected Exercises