This is an introductory text for a course on the fundamentals of probability and information theory. It was originally designed for an undergraduate course on mathematical methods in information theory under the assumption that students taking such a course often have a shaky background in traditional mathematics.

The approach the author takes is unusual, especially given his target audience: he presents probability as a measure on Boolean algebras. While the author views σ-algebras as too sophisticated for this level, he evidently believes that the benefits of a measure-theoretic approach (without σ-additivity) outweigh the cost of an additional level of abstraction.

The first few chapters begin gently with combinatorics and some basic set theory. The concept of probability and the basic properties of a probability measure first appear in Chapter 4. The next chapter then takes up discrete random variables and the notions of expectation, variance, covariance, and correlation. Here the student first sees discrete probability distributions: binomial, Poisson, geometric, negative binomial and hypergeometric. Overall, the introduction to probability is completed in about fifty pages.

The key concepts of information and entropy and their relation to communications are the subjects of the next two chapters. The information content of an event and its entropy are defined. Then the author goes on to define and describe joint and conditional entropies. With these concepts in place he begins to apply these ideas to communication channels. A very simple model — the binary symmetric channel — is used as a vehicle for motivating and illustrating the basic concepts.
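For readers who want a concrete feel for these quantities, here is a minimal sketch (not taken from the book) using the standard Shannon definitions in bits. The function names and the parameters `a` (probability of sending a 1) and `eps` (crossover probability) are illustrative assumptions, not the author's notation.

```python
import math

def information(p):
    """Information content (surprisal), in bits, of an event with probability p > 0."""
    return -math.log2(p)

def entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution.

    Terms with zero probability are skipped, following the usual
    convention that 0 * log 0 is taken to be 0.
    """
    return sum(-p * math.log2(p) for p in probs if p > 0)

def bsc_mutual_information(a, eps):
    """Mutual information I(X;Y) for a binary symmetric channel.

    a   : probability that the input symbol is 1 (hypothetical name)
    eps : probability that the channel flips a symbol (hypothetical name)
    """
    q = a * (1 - eps) + (1 - a) * eps      # P(Y = 1)
    h_output = entropy([q, 1 - q])         # H(Y)
    h_noise = entropy([eps, 1 - eps])      # H(Y | X), the same for either input
    return h_output - h_noise
```

With a uniform input (`a = 0.5`), a noiseless channel (`eps = 0`) conveys one full bit per symbol, while a channel that flips symbols half the time (`eps = 0.5`) conveys nothing.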

Next the author discusses continuous distributions and extends the notions of information and entropy accordingly. He also introduces random vectors, marginal distributions and mutual information. The second edition of this book has a new chapter on Markov processes and their entropy. This brings time into what has heretofore been a static picture and gives the student a first taste of stochastic processes.

The book still bears signs of having begun as lecture notes; some of the material looks a bit like an incompletely filled-in outline. The early chapter on combinatorics is odd because it is thin on detail and short on examples. Then, at the end of that chapter, the author introduces the gamma function pretty much out of nowhere. Many of the sections need more examples, and in particular more concrete examples, especially given the book's stated target audience. The best sections are those that deal with information and entropy as applied to communications; here the pieces come together nicely.

Prerequisites include reasonable skill with algebraic manipulation and a standard first-year calculus course. There are a few places in later chapters where more is needed (double integrals, partial derivatives for a section on Lagrange multipliers, a bit of matrix algebra); the author does provide some background material in appendices.

Bill Satzer is a senior intellectual property scientist at 3M Company, having previously been a lab manager at 3M for composites and electromagnetic materials. His training is in dynamical systems and particularly celestial mechanics; his current interests are broadly in applied mathematics and the teaching of mathematics.