Devlin's Angle

February 2000

The legacy of the Reverend Bayes

How do you use inconclusive evidence to assess the probability that a certain event will occur? One method that has become increasingly popular in recent years depends on a mathematical theorem proved by an 18th Century English Presbyterian minister by the name of Thomas Bayes. Curiously, Bayes' theorem languished largely ignored and unused for over two centuries before statisticians, lawyers, medical researchers, software developers, and others started to use it in earnest during the 1990s.

What makes this relatively new technique of "Bayesian inference" particularly intriguing is that it uses an honest-to-goodness mathematical formula (Bayes' Theorem) in order to improve -- on the basis of evidence -- the best (human) estimate that a particular event will take place. In the words of some statisticians, it's "mathematics on top of common sense." You start with an initial estimate of the probability that the event will occur and an estimate of the reliability of the evidence. The method then tells you how to combine those two figures -- in a precise, mathematical way -- to give a new estimate of the event's probability in the light of the evidence. In some highly constrained situations, both initial estimates may be entirely accurate, and in such cases Bayes' method will give you the correct answer. In a more typical real-life situation, you don't have exact figures, but as long as the initial estimates are reasonably good, then the method will give you a better estimate of the probability that the event of interest will occur. Thus, in the hands of an expert in the domain under consideration, someone who is able to assess all the available evidence reliably, Bayes' method can be a powerful tool.

For example, suppose that you undergo a medical test for a relatively rare cancer. Your doctor tells you that, according to surveys by medical statisticians, the cancer has an incidence of 1% among the general population. Thus, before you take the test, and in the absence of any other evidence, your best estimate of your likelihood of having the cancer is 1 in 100, i.e. a probability of 0.01. Then you take the test. Extensive trials have shown that the reliability of the test is 79%. More precisely, although the test does not fail to detect the cancer when it is present, it gives a positive result in 21% of the cases where no cancer is present -- what is known as a "false positive." When you are tested, the test produces a positive diagnosis. The question is: Given the result of the test, what is the probability that you have the cancer?

Most people assume that if the test has a reliability rate of nearly 80%, and they test positive, then the likelihood that they have the cancer is about 80% (i.e., the probability is approximately 0.8). But they are way off. Given the scenario just described, the likelihood that they have the cancer is a mere 4.6% (i.e., the probability is 0.046). Still a worrying possibility, but hardly that scary 80%. The problem is, that (scary) 80% reliability figure for the test has to be balanced against the (more reassuring) low (1%) incidence rate of the cancer in the general population. Using Bayes' method ensures you make proper use of all the evidence to hand.

In general, Bayes' method shows you how to calculate the probability of a certain event E (in the above example, having the cancer), based on evidence (e.g. the result of the medical test), when you know (or can estimate):

(1) the probability of E in the absence of any evidence;

(2) the evidence for E;

(3) the reliability of the evidence (i.e., the probability that the evidence is correct).

In the cancer example, the probability in (1) is 0.01, the evidence in (2) is that the test came out positive, and the probability in (3) has to be computed from the 79% figure given. All three pieces of information are highly relevant, and to evaluate the probability that you have the cancer you have to combine them in the right manner. Bayes' method tells you how to do this. Here's how.

To keep the arithmetic simple, let's assume a total population of 10,000 people. Since all we are ultimately concerned about is percentages, this simplification will not affect the final answer. Let's assume in addition that the various probabilities are reflected exactly in the actual numbers. Thus, of the total population of 10,000, 100 will have the cancer, 9,900 will not.

Bayes' method is about improving an initial estimate after you have obtained new evidence. In the absence of the test, all you could say about the likelihood of you having the cancer is that there is a 1% chance that you do. Then you take the test, and it shows positive. How do you revise the probability that you have the cancer?

Well, there are 100 individuals in the population who do have the cancer, and for all of them the test will correctly give a positive prediction, thereby identifying 100 individuals as having the cancer.

Turning to the 9,900 cancer-free individuals, for 21% of them the test will incorrectly give a positive result, thereby identifying 9900 x 0.21 = 2079 individuals as having the cancer.

Thus, in all, the test identifies a total of 100 + 2079 = 2179 individuals as having the cancer. Having tested positive, you are among that group. (This is precisely what the test evidence tells you.) The question is, are you in the subgroup that really does have the cancer or is your test result a false positive?

Of the 2179 identified by the test, 100 really do have the cancer. Thus, the probability of you being among that group is 100/2179, or approximately 0.046. In other words, there is a 4.6% chance that you have the cancer.
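The head count above can be replayed in a few lines of Python. This is only a sketch of the arithmetic in the text; the variable names are illustrative choices and do not come from the original column.

```python
# Head-count version of the cancer-test example, using the figures in the text.
population = 10_000
incidence_pct = 1          # 1% of the population has the cancer
false_positive_pct = 21    # 21% of cancer-free people still test positive

with_cancer = population * incidence_pct // 100        # 100 people
without_cancer = population - with_cancer              # 9,900 people

# The test never misses a real case, so everyone with the cancer tests positive.
true_positives = with_cancer
false_positives = without_cancer * false_positive_pct // 100
all_positives = true_positives + false_positives

# Share of the positive group that really has the cancer.
probability = true_positives / all_positives
print(f"{true_positives} of {all_positives} positives have the cancer: {probability:.3f}")
```

Run as written, this prints "100 of 2179 positives have the cancer: 0.046".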

The above computation shows why it is important to take account of the overall incidence of the cancer in the population -- what is sometimes referred to as the base rate or the prior probability. In a population of 10,000, with a cancer having an incidence of 1%, a test with a reliability of 79% (i.e., 21% false positives) will produce 2,079 false positives. This far outweighs the number of actual cancer cases, which is 100. As a result, when your test result comes back positive, the chances are overwhelming that you are in the false positive group.
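To see how strongly the base rate drives the answer, here is a small sketch that repeats the same counting logic as proportions for several base rates. Only the 1% incidence and the 21% false positive rate come from the example; the 10% and 50% base rates, and the function name posterior, are hypothetical and included purely for comparison.

```python
def posterior(prior, false_positive_rate, detection_rate=1.0):
    """Fraction of positive results that are genuine cases."""
    true_pos = prior * detection_rate                 # real cases that test positive
    false_pos = (1.0 - prior) * false_positive_rate   # healthy people who test positive
    return true_pos / (true_pos + false_pos)

# 0.01 is the incidence from the text; 0.10 and 0.50 are hypothetical base rates.
for prior in (0.01, 0.10, 0.50):
    print(f"base rate {prior:.2f} -> probability given a positive test: "
          f"{posterior(prior, 0.21):.3f}")
```

With the same test, the probability after a positive result is about 0.046 at a 1% base rate, but would be about 0.346 at 10% and about 0.826 at 50%. The test has not changed; only the prior has.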

To avoid having to go through the same kind of reasoning every time, Bayes codified the method into a single formula -- Bayes' theorem. Let P(H) be the numerical probability that the hypothesis H is correct in the absence of any evidence -- the prior probability. In the above example, H is the hypothesis that you have the cancer and P(H) is 0.01 (1%). You then take the test and obtain a positive outcome; this is the evidence E. Let P(H|E) be the probability that H is correct given the evidence E. This is the revised estimate you want to calculate. Let P(E|H) be the probability that E would be found if indeed H occurred. In the example, the test always detects cancer when it is present, so (unusually) P(E|H) = 1 in this case. To compute the new estimate, you first have to calculate P(H-wrong), the probability that H does not occur, which is 0.99 in our example. And you have to calculate P(E|H-wrong), the probability that the evidence E would be found (i.e., the test comes out positive) even though H did not occur (i.e., you do not have the cancer), which is 0.21 in the example. Bayes' theorem says that:

P(H|E) = P(H) x P(E|H)/[P(H) x P(E|H) + P(H-wrong) x P(E|H-wrong)]

Using the formula for our example:

P(H|E) = 0.01 x 1/[0.01 x 1 + 0.99 x 0.21] = 0.046
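As a check on the arithmetic, the formula can be written as a short Python function. This is a minimal sketch: the function name bayes and its parameter names are illustrative, and exact fractions are used only so that the result can be compared directly with the 100/2179 obtained by counting.

```python
from fractions import Fraction

def bayes(p_h, p_e_given_h, p_e_given_not_h):
    """Return P(H|E) from P(H), P(E|H) and P(E|H-wrong)."""
    p_not_h = 1 - p_h                       # P(H-wrong)
    numerator = p_h * p_e_given_h           # P(H) x P(E|H)
    return numerator / (numerator + p_not_h * p_e_given_not_h)

# Figures from the cancer example: P(H) = 0.01, P(E|H) = 1, P(E|H-wrong) = 0.21.
exact = bayes(Fraction(1, 100), Fraction(1), Fraction(21, 100))
print(exact, f"{float(exact):.3f}")
```

The exact result is the fraction 100/2179, the same ratio found by counting heads in the population of 10,000, which rounds to 0.046.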

A quantity such as P(H|E) is known as a conditional probability -- the conditional probability of H occurring, given the evidence E. Unscrupulous lawyers have been known to take advantage of the lack of mathematical sophistication among judges and juries by deliberately confusing the two conditional probabilities P(G|E), the probability that the defendant is guilty given the evidence, and P(E|G), the conditional probability that the evidence would be found assuming the defendant were guilty. Deliberate misuse of probabilities has been known to occur in cases where scientific evidence such as DNA testing is involved, for instance paternity suits and rape and murder cases. In such cases, prosecuting attorneys may provide the court with a figure for P(E), the probability that the evidence could be found among the general population, whereas the figure of relevance in deciding guilt is P(G|E). As Bayes' formula shows, the two values can be very different, with P(G|E) generally much lower than P(E). Unless there is other evidence that puts the defendant into the group of possible suspects, such use of P(E) is highly suspect, and indeed should perhaps be prohibited. The reason is that, as with the cancer test example, it ignores the initial low prior probability that a person chosen at random is guilty of the crime in question.

Instructing the court in the proper use of Bayesian inference was the winning strategy used by American long-distance runner Mary Slaney's lawyers when they succeeded in having her 1996 performance ban overturned. Slaney failed a routine test for performance-enhancing steroids at the 1996 Olympic Games, resulting in the US athletic authorities banning her from future competitions. Her lawyers demonstrated that the test did not take proper account of the prior probability and thus made a tacit initial assumption of guilt.

In addition to its use -- or misuse -- in court cases, Bayesian inference methods lie behind a number of new products on the market. One example is the paperclip advisor that pops up on the screen of users of Microsoft Office: the system monitors the user's actions and uses Bayesian inference to predict likely future actions and provide appropriate advice. For another example, chemists can take advantage of a software system that uses Bayesian methods to improve the resolution of nuclear magnetic resonance (NMR) spectrum data. Chemists use such data to work out the molecular structure of substances they wish to analyze. The system uses Bayes' formula to combine the new data from the NMR device with existing NMR data, a procedure that can improve the resolution of the data by several orders of magnitude.

Other recent uses of Bayesian inference include the evaluation of new drugs and medical treatments, the analysis of human DNA to identify particular genes, and the analysis of police arrest data to see whether any officers have been targeting one particular ethnic group.

Devlin's Angle is updated at the beginning of each month.
Keith Devlin is Dean of Science at Saint Mary's College of California, in Moraga, California, and a Senior Researcher at Stanford University. His latest book InfoSense: Turning Information Into Knowledge, which shows how a mathematical approach to information can help us to understand information flow and manage it more efficiently, was published by W. H. Freeman last August.