
Modern Statistics for Modern Biology

Susan Holmes and Wolfgang Huber
Cambridge University Press
[Reviewed by Robert Hayden]

This book is a roughly 9 by 12 by 1 inch paperback.  What most impresses one on picking it up is the weight.  Opening the pages reveals that every page is printed on glossy stock, and most pages include multiple images, many in color.  Biologists may note the sophisticated statistical graphics and purchase the book solely to learn how to add such impressive graphics to their next publication.

The cover blurb and front matter say little about the intended audience except that the book is "for biologists".  Author Holmes uses the book as a text for a course she offers at Stanford where the only listed prerequisite is "minimal familiarity with computers."  It seems unlikely that daily visits to Facebook would suffice.  For example, the book contains a passage explaining how the statistical computing environment R handles loops, assuming the reader is already familiar with loops in other programming languages.  Continuing to try to identify prerequisites from the text itself, it appears that calculus and a solid first course in statistics would be very useful -- perhaps the statistics course often called "calculus-based."  In several places the reader is directed to Rice's text on mathematical statistics for details so perhaps the prerequisite could be characterized as the ability to benefit from such a reference.

The reader will also need to be familiar with a great deal of modern biology.  In particular, this book deals with statistical techniques useful in areas such as genomics where biologists accumulate vast quantities of (usually) observational data.  For that reason, the methods given are often those of data science rather than the sort of statistics normally found in undergraduate statistics courses.  Thus it fills a significant gap in the training of biological researchers working in those areas of biology.  Many other biologists will not find it useful in their work, and teachers of statistics looking for interesting applications will find that most of the examples require a strong background in the areas of biology covered.  Putting all the pieces together, this would seem to be a valuable reference for biologists who do work in those areas.

The presentation is reminiscent of some of the SIAM case study volumes in that there is no systematic presentation of the statistics; instead, there is a succession of real problems analyzed as examples.  Coverage of the underlying mathematics and statistics is often rather sketchy, but many references are given to textbooks and research papers.  Graph theory and Markov chains are two topics addressed that are likely to be already familiar to mathematicians.  The vast datasets and complex analyses require the use of computers, and the authors have chosen to use R and the Bioconductor R packages.  They do not teach R, though, and mostly just reference the rather intimidating documentation.  The "Dummies" book on R by de Vries and Meys would appear to be a good match for those wishing a gentler approach.

It would seem that this book will appeal to a narrow audience, including a narrow segment of MAA members. For its audience, though, it should prove highly valuable.  Author Holmes was trained as a statistician.  Huber was trained as a physicist, but shifted to computational biology early in his career.  Both have numerous publications. Their book has a ten-page reference list and a four-page index.  There is also a website with all the data used in examples and all of the R code -- including the code for all the graphics created with R. That alone should be worth the price of the book to many biologists.

After a few years in industry, Robert W. Hayden taught mathematics at colleges and universities for 32 years and statistics for 20 years. In 2005 he retired from full-time classroom work.