Thomas P. Ryan
Publisher: John Wiley (2008)
Details: 642 pages, Hardcover
Edition: 2
Series: Wiley Series in Probability and Statistics
Price: $115.00
ISBN: 9780470081860
Category: Textbook
Topics: Regression Analysis, Statistics
[Reviewed by Martha K. Smith, on 04/03/2009]
I have taught regression analysis eleven times, and have never found a book I completely like [1]. At first I thought the book under review might be the one I had long hoped for, but I was disappointed. It has many strong points — but some weak ones as well, particularly for an undergraduate course.
Since this review is for a mathematics audience rather than a statistical one, I’ll start with my standard explanation-for-mathematicians of the difference between pure mathematics and (frequentist [2]) statistics. Here goes:
In pure mathematics, we are concerned with statements of the form “A implies B”. Our main goal is to prove them. We sometimes also want to discover them, and, of course, we use statements that are already proved to prove new ones. In statistics, many statements of this form are also relevant, but they are not as big a part of the process of doing statistics as they are of doing pure mathematics. The A parts of those statements that are useful in statistics typically list model assumptions — things like, “errors are normally distributed with the same variance for each group”. Doing statistics well requires asking (and answering, usually not definitively, but as best we can with the information available) questions of the following sorts:
Unfortunately, statistics is often not done well. Many users of statistics ignore the above questions, typically because they are unaware of their importance. The book under review squarely addresses these important questions from the start. In this respect, it is better than any other regression textbook I have seen. In addition, the author summarizes which regression techniques are available in the various statistical software packages, provides a good collection of exercises illustrating points that are often misunderstood, and gives copious references to details not included in the book. For these reasons alone, I strongly recommend the book as a reference for anyone teaching or using regression.
So what are its weaknesses? Why would I not use it as a textbook? A major weakness of the book as a textbook is that, as it progresses through the chapters that the author suggests for an undergraduate course, it sounds more like a guide to the literature than an introductory textbook. This is appropriate for its title, but not for an undergraduate textbook (and not even for a textbook for a master’s level course).
I am also concerned that the author frequently says “obvious” or “obviously.” This is not a petty complaint, but a serious concern for an undergraduate textbook. First, “obvious” is a subjective term; what may seem obvious to the author may not seem obvious to the student. So using “obvious” is not good human relations when teaching. Even more serious is that students all too often think something is obvious when it warrants extensive justification, or even when it is not true. Indeed, the author often appropriately uses phrases such as “does not have the meaning that would seem to be self-evident” (p. 3) when such situations might arise. However, my experience is that when the instructor or textbook author uses “obvious,” students tend to pick up the word and use it when something is false or needs substantial explanation.
Thus my recommendation is: If you teach regression, be sure to buy this book, and read at least the first two chapters carefully. Use what you learn from the book to improve, when needed, on whatever textbook you use, and to motivate yourself to emphasize to your students that there is more about regression than can be covered in an undergraduate course. Use the software summaries to help you decide what software to use for your class. Keep the book handy to refer to when you encounter the inevitable limitations of your software, have other questions, or need a good additional example or exercise.
Even if you only teach simple regression as part of an overview class, being familiar with the material in the first two chapters of this book should help you be aware of the cautions that are required in using and interpreting even simple regression. You will also find the book helpful when colleagues from other departments come to you with questions (as will undoubtedly happen if you are a mathematician teaching regression, since if there were enough statisticians at your school, they would be teaching the course).
If students are likely to do independent research projects in statistics at your school, or if your school has a graduate program in any field that uses regression, be sure your library has a copy, too.
Although I don’t recommend the book as an undergraduate textbook, I think the author has done an admirable job of pointing out most of the intricacies of regression, providing a guide to regression methods going beyond the standard ones, reviewing available software (and providing regression macros for Minitab on the ftp site accompanying the book), and providing a wide variety of instructive exercises and examples.
Notes:
[1] I finally settled on the third textbook I tried, Cook and Weisberg’s Applied Regression Including Computing and Graphics, Wiley, 1999. However, I have reworked a lot of the exposition and added some other things appropriate to the particular course I teach, resulting over the years in lecture notes that are available at http://www.ma.utexas.edu/users/mks/384Gfa08/384G08syl.html. (The course I have taught is primarily a master’s level course, but an undergraduate course with slightly different assignments and exams meets with it.) The software (called arc) developed to accompany Cook and Weisberg’s text has many features that make it well suited to teaching. Unfortunately, the software is no longer supported, so the Macintosh versions, and in some cases the Unix versions, are no longer readily usable. The Windows version is still viable, however.
[2] Bayesian statistics would involve a slightly more complicated explanation; this review concerns only frequentist statistics.
Martha Smith is a soon-to-be-retired Professor of Mathematics at the University of Texas at Austin. Visit her home page at http://www.ma.utexas.edu/users/mks/ for contact information and links to a variety of things. In a few months, there should be a link to a new website on Common Misteaks in Statistics.