This book deals with the application of statistical methods, similar to those taught in a first course, to the problem of assessing the reliability of scientific measurements. This is a problem of great interest to physical scientists, a few statisticians, and very few mathematicians. Hence this review will focus more on the perhaps unfamiliar issues discussed in the book than on the particular (and quite complex) solution offered.

Imagine, then, that a laboratory reports a value for the specific gravity of osmium. Perhaps they made 10 measurements and averaged them together. So far nothing has happened that would require anyone to take a statistics course. But suppose the lab also reports a “standard error” for that value, or a confidence interval. Now we are into that first statistics course. The conceptual model is that the 10 measurements are a random sample from the hypothetical population of all possible measurements. In truth, they are at best a convenience sample, and subject to the problems inherent in any convenience sample.

Of particular interest is the possibility that the numbers generated in a lab in Cleveland on 23 November may not be representative of the results obtained in another lab at another time. One problem may be a systematic bias: perhaps results from this lab in truth run about 1% low. Another problem is that measurements in this lab are more likely to resemble one another than to resemble results from other labs at other times. Putting these together, the reported value for osmium could be off in one direction, and the reported error estimate could be seriously optimistic (and also not account for the systematic error).
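To make the point concrete, here is a small simulation sketch. It is not taken from the book; the model (each lab has its own fixed offset shared by all of its readings, plus independent noise) and every number in it, including the nominal value of 22.59 and the two standard deviations, are assumptions chosen purely for illustration. It estimates how often a textbook 95% interval built from one lab's 10 readings actually contains the true value.

```python
import random
import statistics

# 0.975 quantile of Student's t with 9 degrees of freedom (n = 10 readings).
T_CRIT_DF9 = 2.262
N_READINGS = 10


def simulate_lab(true_value, lab_bias_sd, within_sd, rng):
    """Simulate one hypothetical lab and return (mean, standard error).

    All readings share one lab-specific offset (the systematic error)
    and each also carries independent within-lab noise.
    """
    lab_offset = rng.gauss(0.0, lab_bias_sd)
    readings = [
        true_value + lab_offset + rng.gauss(0.0, within_sd)
        for _ in range(N_READINGS)
    ]
    mean = statistics.mean(readings)
    # The usual first-course standard error: it sees only within-lab scatter.
    std_error = statistics.stdev(readings) / N_READINGS ** 0.5
    return mean, std_error


def coverage(true_value=22.59, lab_bias_sd=0.05, within_sd=0.05,
             labs=2000, seed=1):
    """Fraction of simulated labs whose nominal 95% interval covers true_value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(labs):
        mean, se = simulate_lab(true_value, lab_bias_sd, within_sd, rng)
        if abs(mean - true_value) <= T_CRIT_DF9 * se:
            hits += 1
    return hits / labs


if __name__ == "__main__":
    print("no lab-to-lab offset:  ", coverage(lab_bias_sd=0.0))
    print("with lab-to-lab offset:", coverage(lab_bias_sd=0.05))
```

Under these assumed settings, the intervals should cover the true value about as often as advertised when there is no lab-to-lab offset, and far less often once an offset comparable in size to the within-lab noise is introduced. The standard error is computed only from the scatter within one lab, so it cannot see an error that all 10 readings share.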

To deal with these issues, the author delves into another issue: whether probabilities should be limited to statements about limiting values of indefinite repetitions of events (the frequentist position), or whether they may also apply to degrees of belief (the Bayesian position). The author tries to find a middle ground between these positions, and this may be of interest to those already interested in that issue.

For most MAA members, this book is likely to be much like those optimistic confidence intervals: too narrow and too far from central interests. Potential readers should be aware that simpler approaches exist, and which approach is most satisfying is quite subjective. On the measurements issue, some labs report only the variability of their own instruments, leaving the ultimate question of accuracy unanswered. Unsatisfying, but perhaps realistic.

On subjective probabilities, a seminal paper by John Tukey (“Conclusions vs. Decisions”, *Technometrics*, Vol. 2, No. 4 (Nov., 1960), pp. 423–433) suggests that their use might depend on the goal. He distinguishes between situations in which data are gathered for the purpose of making an immediate decision and those in which the goal is to accumulate scientific evidence. In the former, subjective notions are expected to be part of the mix, while in the latter most would prefer to avoid them. This book tries to combine the two. Readers will have to decide for themselves how well the solution offered meets their needs.

We have here a serious book about a serious issue, which can be recommended to those with a very serious interest in its subject.

After a few years in industry, Robert W. Hayden (bob@statland.org) taught mathematics at colleges and universities for 32 years and statistics for 20 years. In 2005 he retired from full-time classroom work. He now teaches statistics online at statistics.com and does summer workshops for high school teachers of Advanced Placement Statistics. He contributed the chapter on evaluating introductory statistics textbooks to the MAA’s *Teaching Statistics*.