This text is aimed at upper-level undergraduate students who:

- need to learn how to use statistics and want to understand what the methods actually do and not just follow a recipe;
- do not want to become mathematical statisticians, and so do not need all the gory details of every corner case, epsilon, and counterexample; and
- want to actually *do* statistics with data, using R.

The author refers to the book’s approach as being that of “probability for statistics” rather than “probability, then statistics”, the latter being typical of many undergrad stats texts. For the target audience, I really believe this is the right approach. As an example, Chapter 2 on Probability and Random Variables has all the standard material, but also a section on Hypothesis Tests and p-Values. There is no sense waiting until half the book is done to get to this, as would be done in the “probability, then stats” approach of most books.

The book commendably uses R in an integrated manner throughout, and uses packages from the tidyverse in a simple way. A student wanting to go further in the field will need more on this front, for example Grolemund and Wickham's *R for Data Science*. There is a facility for loading code blocks from the text into R via a “snippet” function, as in **snippet("data01")**. This is nice, but some code blocks assume that prior blocks have already been executed and will not work as expected if this is not true. Hence you may run into situations where you are block chasing. It would be better if each code block were an independent entity.

In section 1.2.1 the book has a very nice explanation of the standard formula interface in R as a template to solve a problem, via **goal( formula , data = mydata, ... )** (viewable at the AMS materials for the book on page 10). I think that I will be pointing many R beginners to this simple one-page explanation, and there are many other such pages throughout the text. The author has made a huge effort to make the ideas throughout this text as clear as possible to the student.
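
To give a feel for the template, here is a minimal sketch of my own, not code from the book. It assumes only base R and the built-in `mtcars` data set; the three functions standing in for "goal" (`lm`, `aggregate`, `boxplot`) are my choices for illustration, and the book's own examples may well use different functions and data.

```r
# Illustrative only: base R functions that follow the
# goal(formula, data = mydata, ...) pattern. mtcars ships with R.

# Goal: fit a linear model of fuel economy (mpg) on weight (wt).
fit <- lm(mpg ~ wt, data = mtcars)

# Goal: compute the mean of mpg within each cylinder group.
group_means <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# Goal: draw side-by-side boxplots of mpg by cylinder count.
boxplot(mpg ~ cyl, data = mtcars)

print(coef(fit))
print(group_means)
```

In each call the formula says which variables are involved and the `data` argument says where to find them; only the goal changes.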

The book makes heavy use of **ggformula**, which is an R package that provides a simplified formula interface to **ggplot2** (the gold standard of plotting in R). This is perhaps a good decision for R beginners, but to go further students will want to consult other resources (again, Grolemund and Wickham is an excellent place to start).

There are *lots* of exercises at varying levels of difficulty, and appendices with solutions to some, an introduction to R, some basic mathematical preliminaries, and linear algebra.

This is an excellent text for the target audience. At over 800 pages, it comes with a bonus: students using it will increase their muscle mass by carrying it around, as well as their knowledge of statistics by working through it.

Peter Rabinovitch is a Senior Performance Engineer at Akamai who has been doing data science since long before “data science” was a thing.