
Introductory Statistics

Sheldon M. Ross
Publisher: Academic Press
Publication Date: 2017
Number of Pages: 828
Format: Hardcover
Edition: 4
Price: 125.00
ISBN: 9780128043172
Category: Textbook
[Reviewed by Peter Rabinovitch, on 08/4/2017]

I am a huge fan of Sheldon Ross’s books on probability models and stochastic processes, so I was very interested in reviewing his Introductory Statistics. It is of the “trust me” genre, i.e., there are no proofs and few derivations. No calculus is used, and there are no simulations to attempt to convince the reader of the validity of the methods (i.e., nothing like what is in Introduction to Statistical Thinking (With R, Without Calculus) by Benjamin Yakir; the link takes you to a free PDF of that book).
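
For readers wondering what such a simulation might look like, here is a minimal sketch in Python. It is not taken from either book; the exponential population, the sample size, the seed, and the function names are all assumptions chosen for illustration. It draws many samples from a skewed population and checks how often the sample mean lands within 1.96 standard errors of the true mean, which the central limit theorem suggests should happen roughly 95% of the time.

```python
import random
import statistics


def sample_means(n, trials, rng):
    """Return `trials` sample means, each computed from n draws
    of a skewed Exponential(1) population (an illustrative choice)."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(trials)
    ]


def fraction_within(means, mu, se, k=1.96):
    """Fraction of sample means lying within k standard errors of mu."""
    return sum(abs(m - mu) <= k * se for m in means) / len(means)


rng = random.Random(2017)  # fixed seed so the run is repeatable
n, trials = 50, 5000       # hypothetical settings, not from the book
means = sample_means(n, trials, rng)

# Exponential(1) has mean 1 and standard deviation 1,
# so the standard error of the sample mean is 1 / sqrt(n).
coverage = fraction_within(means, mu=1.0, se=1.0 / n ** 0.5)
print(f"share of sample means within 1.96 standard errors: {coverage:.3f}")
```

Changing `n` or swapping in a different population lets a reader experiment with when the normal approximation is reasonable, without any calculus.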

Introductory Statistics is designed for a first service course for non-science majors and has all the standard topics you would expect and a few extras, but unfortunately it stops short. For example, it very clearly explains what an ANOVA is and how to do it, but goes no further. The student will know how to test whether several means are different, but not what to do next if in fact they are. I guess tough decisions had to be made in order to keep the book to its already large size of about 800 pages.

I question some of the choices made in the production (as opposed to writing) of the text. There are (the much-maligned) pie charts and (horrific) 3D bar charts throughout, and they are not being shown as examples of what not to do. There are sections in small font that are in some cases the most important parts (such as the central limit theorem), but then there are historical asides in the same format. In the few very simple R code samples, symbols that are not easily reproduced on the keyboard are used. For example, how many people, especially in the target audience, would know that to get \(\mu\) on the keyboard you need to use alt-230? And \(\surd\) is used in the code too, which R does not accept. The output of the R commands is formatted differently than it actually looks when you run it. None of this is a big deal for somebody already comfortable with the material, but it can cause a lot of time to be wasted for the beginners who will be using this book, as will the various typos.

The book is supposed to come with a CD of material that was not supplied with the review copy. I hope that the datasets used in the book are on the disc so that students do not have to type everything in.

There are a gazillion exercises, many pages of appendices and statistical tables at the back of the book, along with answers to odd-numbered exercises, except for those of the last chapter, which brings me to another negative point. The last chapter is entitled “Machine Learning and Big Data”, but there is nothing about big data in the chapter, so calling it that seems to be a marketing decision.

There are some interesting topics included that are not in most introductory stats texts, such as the Gini index, bandit problems, and quality control. But there is no discussion of Bayesian methods, nor of the issues arising from NHST (null hypothesis significance testing), nor multiple testing issues.
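
To make the multiple testing concern concrete, here is a small sketch in Python under the simplifying assumptions that the tests are independent and that every null hypothesis is true. The function names are hypothetical, and the Bonferroni correction is shown only as one common remedy, not as anything drawn from the book.

```python
def familywise_error_rate(alpha, m):
    """Probability of at least one false positive among m independent
    tests, each run at level alpha, when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_threshold(alpha, m):
    """Per-test significance level that keeps the family-wise
    error rate at or below alpha (Bonferroni correction)."""
    return alpha / m


alpha = 0.05
for m in (1, 5, 20):
    fwer = familywise_error_rate(alpha, m)
    per_test = bonferroni_threshold(alpha, m)
    print(f"m={m:2d}  family-wise error rate={fwer:.3f}  "
          f"Bonferroni per-test level={per_test:.4f}")
```

With 20 tests at the 5% level, the chance of at least one spurious "significant" result is well over one half, which is why a student who follows an ANOVA with many pairwise comparisons needs some guidance on this point.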

In summary, Ross’s writing (as always) is very clear. But there are too many issues with this text to recommend it.


Peter Rabinovitch is a Senior Performance Engineer at Akamai who has been doing data science since long before “data science” was a thing.

Chapter 1: Introduction to Statistics

 

  • Abstract
  • 1.1. Introduction
  • 1.2. The Nature of Statistics
  • 1.3. Populations and Samples
  • 1.4. A Brief History of Statistics
  • Key Terms
  • The Changing Definition of Statistics
  • Review Problems

Chapter 2: Describing Data Sets

  • Abstract
  • 2.1. Introduction
  • 2.2. Frequency Tables and Graphs
  • Problems
  • 2.3. Grouped Data and Histograms
  • Problems
  • 2.4. Stem-and-Leaf Plots
  • Problems
  • 2.5. Sets of Paired Data
  • Problems
  • 2.6. Some Historical Comments
  • Key Terms
  • Summary
  • Review Problems

Chapter 3: Using Statistics to Summarize Data Sets

  • Abstract
  • 3.1. Introduction
  • 3.2. Sample Mean
  • Problems
  • 3.3. Sample Median
  • Problems
  • Problems
  • 3.4. Sample Mode
  • Problems
  • 3.5. Sample Variance and Sample Standard Deviation
  • Problems
  • 3.6. Normal Data Sets and the Empirical Rule
  • Problems
  • 3.7. Sample Correlation Coefficient
  • Problems
  • 3.8. The Lorenz Curve and Gini Index
  • Problems
  • 3.9. Using R
  • Key Terms
  • Summary
  • Review Problems

Chapter 4: Probability

  • Abstract
  • 4.1. Introduction
  • 4.2. Sample Space and Events of an Experiment
  • Problems
  • 4.3. Properties of Probability
  • Problems
  • 4.4. Experiments Having Equally Likely Outcomes
  • Problems
  • 4.5. Conditional Probability and Independence
  • Problems
  • 4.6. Bayes' Theorem
  • Problems
  • 4.7. Counting Principles
  • Problems
  • Key Terms
  • Summary
  • Review Problems

Chapter 5: Discrete Random Variables

  • Abstract
  • 5.1. Introduction
  • 5.2. Random Variables
  • Problems
  • 5.3. Expected Value
  • Problems
  • 5.4. Variance of Random Variables
  • Problems
  • 5.5. Jointly Distributed Random Variables
  • Problems
  • 5.6. Binomial Random Variables
  • Problems
  • 5.7. Hypergeometric Random Variables
  • Problems
  • 5.8. Poisson Random Variables
  • Problems
  • 5.9. Using R to calculate Binomial and Poisson Probabilities
  • Key Terms
  • Summary
  • Review Problems

Chapter 6: Normal Random Variables

  • Abstract
  • 6.1. Introduction
  • 6.2. Continuous Random Variables
  • Problems
  • 6.3. Normal Random Variables
  • Problems
  • 6.4. Probabilities Associated with a Standard Normal Random Variable
  • Problems
  • 6.5. Finding Normal Probabilities: Conversion to the Standard Normal
  • 6.6. Additive Property of Normal Random Variables
  • Problems
  • 6.7. Percentiles of Normal Random Variables
  • Problems
  • 6.8. Calculating Normal Probabilities with R
  • Key Terms
  • Summary
  • Review Problems

Chapter 7: Distributions of Sampling Statistics

  • Abstract
  • 7.1. A Preview
  • 7.2. Introduction
  • 7.3. Sample Mean
  • Problems
  • 7.4. Central Limit Theorem
  • Problems
  • 7.5. Sampling Proportions from a Finite Population
  • Problems
  • 7.6. Distribution of the Sample Variance of a Normal Population
  • Problems
  • Key Terms
  • Summary
  • Review Problems

Chapter 8: Estimation

  • Abstract
  • 8.1. Introduction
  • 8.2. Point Estimator of a Population Mean
  • Problems
  • 8.3. Point Estimator of a Population Proportion
  • Problems
  • Problems
  • 8.4. Estimating a Population Variance
  • Problems
  • 8.5. Interval Estimators of the Mean of a Normal Population with Known Population Variance
  • Problems
  • 8.6. Interval Estimators of the Mean of a Normal Population with Unknown Population Variance
  • Problems
  • 8.7. Interval Estimators of a Population Proportion
  • Problems
  • 8.8. Use of R
  • Key Terms
  • Summary
  • Review Problems

Chapter 9: Testing Statistical Hypotheses

  • Abstract
  • 9.1. Introduction
  • 9.2. Hypothesis Tests and Significance Levels
  • Problems
  • 9.3. Tests Concerning the Mean of a Normal Population: Case of Known Variance
  • Problems
  • Problems
  • 9.4. The t Test for the Mean of a Normal Population: Case of Unknown Variance
  • Problems
  • 9.5. Hypothesis Tests Concerning Population Proportions
  • Problems
  • 9.6. Use of R in Running a One Sample t-test
  • Key Terms
  • Summary
  • Review Problems and Proposed Case Studies

Chapter 10: Hypothesis Tests Concerning Two Populations

  • Abstract
  • 10.1. Introduction
  • 10.2. Testing Equality of Means of Two Normal Populations: Case of Known Variances
  • Problems
  • 10.3. Testing Equality of Means: Unknown Variances and Large Sample Sizes
  • Problems
  • 10.4. Testing Equality of Means: Small-Sample Tests when the Unknown Population Variances Are Equal
  • Problems
  • 10.5. Paired-Sample t Test
  • Problems
  • 10.6. Testing Equality of Population Proportions
  • Problems
  • 10.7. Use of R in Running a Two Sample t-Test
  • Key Terms
  • Summary
  • Review Problems

Chapter 11: Analysis of Variance

  • Abstract
  • 11.1. Introduction
  • 11.2. One-Factor Analysis of Variance
  • Problems
  • 11.3. Two-Factor Analysis of Variance: Introduction and Parameter Estimation
  • Problems
  • 11.4. Two-Factor Analysis of Variance: Testing Hypotheses
  • Problems
  • 11.5. Final Comments
  • Key Terms
  • Summary
  • Review Problems

Chapter 12: Linear Regression

  • Abstract
  • 12.1. Introduction
  • 12.2. Simple Linear Regression Model
  • Problems
  • 12.3. Estimating the Regression Parameters
  • Problems
  • 12.4. Error Random Variable
  • Problems
  • 12.5. Testing the Hypothesis that β=0
  • Problems
  • 12.6. Regression to the Mean
  • Problems
  • 12.7. Prediction Intervals for Future Responses
  • Problems
  • 12.8. Coefficient of Determination
  • Problems
  • 12.9. Sample Correlation Coefficient
  • Problems
  • 12.10. Analysis of Residuals: Assessing the Model
  • Problems
  • 12.11. Multiple Linear Regression Model
  • Problems
  • 12.12. Logistic Regression
  • 12.13. Use of R in Regression
  • Key Terms
  • Summary
  • Review Problems

Chapter 13: Chi-Squared Goodness-of-Fit Tests

  • Abstract
  • 13.1. Introduction
  • 13.2. Chi-Squared Goodness-of-Fit Tests
  • Problems
  • 13.3. Testing for Independence in Populations Classified According to Two Characteristics
  • Problems
  • 13.4. Testing for Independence in Contingency Tables with Fixed Marginal Totals
  • Problems
  • 13.5. Use of R
  • Key Terms
  • Summary
  • Review Problems

Chapter 14: Nonparametric Hypotheses Tests

  • Abstract
  • 14.1. Introduction
  • 14.2. Sign Test
  • Problems
  • 14.3. Signed-Rank Test
  • Problems
  • 14.4. Rank-Sum Test for Comparing Two Populations
  • Problems
  • 14.5. Runs Test for Randomness
  • Problems
  • 14.6. Testing the Equality of Multiple Probability Distributions
  • Problems
  • 14.7. Permutation Tests
  • Problems
  • Key Terms
  • Summary
  • Review Problems

Chapter 15: Quality Control

  • Abstract
  • 15.1. Introduction
  • 15.2. The \(\bar{X}\) Control Chart for Detecting a Shift in the Mean
  • Problems
  • Problems
  • 15.3. Control Charts for Fraction Defective
  • Problems
  • 15.4. Exponentially Weighted Moving-Average Control Charts
  • Problems
  • 15.5. Cumulative-Sum Control Charts
  • Problems
  • Key Terms
  • Summary
  • Review Problems

Chapter 16: Machine Learning and Big Data

  • Abstract
  • 16.1. Introduction
  • 16.2. Late Flight Probabilities
  • 16.3. The Naive Bayes Approach
  • Problems
  • 16.4. Distance Based Estimators: The k-Nearest Neighbors Rule
  • Problems
  • 16.5. Assessing the Approaches
  • Problems
  • 16.6. Choosing the Best Probability: A Bandit Problem
  • Problems

Appendix A: A Data Set

Appendix B: Mathematical Preliminaries

  • B.1. Summation
  • B.2. Absolute Value
  • B.3. Set Notation

Appendix C: How to Choose a Random Sample

Appendix D: Tables

Appendix E: Programs

Answers to Odd-Numbered Problems

  • Chapter 1 Problems
  • Chapter 2 Review
  • Chapter 3 Review
  • Chapter 4 Review
  • Chapter 5 Review
  • Chapter 6 Review
  • Chapter 7 Review
  • Chapter 8 Review
  • Chapter 9 Review
  • Chapter 10 Review
  • Chapter 11 Review
  • Chapter 12 Review
  • Chapter 13 Review
  • Chapter 14 Review
  • Chapter 15 Review

Index
