Devlin's Angle

July-August 2004

The Two Envelopes Paradox

I received a letter recently asking me to "rule" on a debate two people were having about the notorious two envelopes paradox. Since my efforts to convince people of the correct resolution to the Monty Hall Problem inevitably generate a small avalanche of letters claiming I am completely wrong, I have in the past hesitated to tackle the much, much trickier envelopes puzzle. But the time has come, I think, to throw caution to the wind, and enter the fray.

Here, for those unfamiliar with the problem, is what it says.

You are taking part in a game show. The host offers you two envelopes, each containing a check. You may choose one, keeping the money it contains. She tells you that one envelope contains exactly twice as much as the other, but does not tell you which is which.

Since you have no way of knowing which envelope contains the larger sum, you pick one at random. The host asks you to open the envelope. You do so and take out a check for $40,000.

Here is where things get interesting, especially for contestants who know some mathematics.

The host now says you have a chance to change your mind and choose the other envelope. If you don't know anything about probability theory, particularly expectations, you probably say to yourself, the odds are fifty-fifty that you have chosen the larger sum, so you may as well stick with your first choice. (And you'd be right. But I'll come back to this in a moment.)

On the other hand, if you know a bit (though not too much) about probability theory, you may well try to compute the expected gain due to swapping. The chances are you would argue as follows. The other envelope contains either $20,000 or $80,000, each with probability 0.5. Hence the expected gain of swapping is

[0.5 x 20,000] + [0.5 x 80,000] - 40,000 = 10,000
That's an expected gain of $10,000. So you swap.

But wait a minute. There's nothing special about the actual monetary amounts here, provided one envelope contains twice as much as the other. Suppose you opened one envelope and found $M. Then you would calculate your expected gain from swapping to be

[0.5 x M/2] + [0.5 x 2M] - M = M/4
and since M/4 is greater than zero you would swap. Right?

Okay, let's take this line of reasoning a bit further. If it doesn't matter what M is, then you don't actually need to open the envelope at all. Whatever is in the envelope you would choose to swap. Still with me?

Well, if you don't open the envelope, then you might as well choose the other envelope in the first place. And having swapped envelopes, you can repeat the same calculation again and again, swapping envelopes back and forth ad infinitum. There is no limit to the cumulative expected gain you can obtain. But this is absurd.
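A quick sanity check already suggests that something is off. The sketch below assumes a hypothetical prior in which the host picks the lower amount from three equally likely values (the amounts and the names `prior` and `expected_payoff` are illustrative, not part of the puzzle) and computes, exactly, the average payoff of always keeping versus always swapping.

```python
from fractions import Fraction

# Hypothetical prior for the lower amount L: three equally likely values.
# These numbers are an assumption made purely for illustration.
prior = {10_000: Fraction(1, 3), 20_000: Fraction(1, 3), 40_000: Fraction(1, 3)}


def expected_payoff(prior, swap):
    """Exact average payoff when the contestant always swaps (or never does)."""
    total = Fraction(0)
    for low, p_low in prior.items():
        high = 2 * low
        # The first pick is the lower or the higher envelope, each with probability 1/2.
        for picked, other in ((low, high), (high, low)):
            final = other if swap else picked
            total += p_low * Fraction(1, 2) * final
    return total


keep = expected_payoff(prior, swap=False)
always_swap = expected_payoff(prior, swap=True)
print(keep, always_swap)  # both strategies average the same amount
```

Under this assumed prior the two strategies have identical average payoffs, so blind swapping cannot be worth M/4 every time.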

And there's the paradox. What is wrong with the computation of the expected gain from swapping?

The answer is everything. The above computation is meaningless - which is why it leads so easily to a nonsensical outcome. If you want to apply probability theory, you are free to do so, but you need to do it correctly. And that means working with actual probabilities, taking care to distinguish between prior and posterior probabilities. Let's take a closer look.

As with the Monty Hall Problem, if you really want to analyze the situation, you have to start by looking at the way the scenario was set up.

Let L denote the lower dollar value of the two checks. The other check thus has value 2L. Let P(L) be the prior probability distribution for the choice the host makes for the lower value in the envelopes. (This will affect the entire game. Of course, we don't know anything about this distribution. But we can see how it affects the outcome of the game. Read on.)

When you make your choice (C) during the game, you choose either the envelope containing the lower value (C=lower) or the one that contains the higher (C=higher). As the amounts are hidden from you, you choose entirely at random, with equal probabilities for the two options, so

P(C=lower) = P(C=higher) = 0.5
During the game, the value (V) of the content of the chosen envelope is revealed to be a certain value M. Given this information, what is the posterior probability that the chosen envelope contains the higher or lower value? That is, what is P(C|V=M), the probability that you chose the envelope containing the lower/higher value, given you now know what V is? This is the probability you need in order to compute any expected gain. The correct expected gain calculation is:
(2M)P(C=lower|V=M)+(M/2)P(C=higher|V=M) - M
The paradox above arose because you assumed that
P(C=lower|V=M) = P(C=higher|V=M) = 0.5
Let's see why this cannot be the case. (In what follows, remember that L, V, C are variables and M is a numerical constant.)

By Bayes' Theorem:

P(C|V=M) = P(V=M|C)P(C)/P(V=M) . . . (1)
Taking the first of the two cases, where you choose the lower value (V=L), we have
P(V=M|C=lower) = P(L=M)
The second of the two cases, where the chosen envelope contains double the lower value, is
P(V=M|C=higher) = P(L=M/2)
Substituting each of these two identities into (1) gives
P(C=lower|V=M) = P(C=lower)P(L=M)/P(V=M) . . . (2)
P(C=higher|V=M) = P(C=higher)P(L=M/2)/P(V=M) . . . (3)
From (2), P(C=lower|V=M) is the same as P(C=lower) only if P(L=M) is the same as P(V=M). If it is not, then in the calculation of the expected gain you have used the wrong probability. Specifically, you have used the prior probability, P(C=lower)=0.5, of choosing the lower value, rather than the posterior probability P(C=lower|V). The same argument works for P(C=higher|V), starting from (3).

Under what circumstances could we have P(L=M) = P(V=M)?

Since P(V=M) must normalize the distribution, we have

P(V=M) = P(C=lower)P(L=M) + P(C=higher)P(L=M/2)
that is,
P(V=M) = 0.5 P(L=M) + 0.5 P(L=M/2) . . . (4)
From (4), to have P(L=M) = P(V=M) we would need P(L=M/2) = P(L=M) for all values of M. This is an infinite uniform "distribution". But no such distribution exists (it cannot be normalised). Hence it is impossible to have any prior distribution for which P(L=M) = P(L=M/2) is satisfied for all M. As a result there is no posterior distribution for which, given any V, we could have P(C=lower|V) = 0.5.
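To see the effect concretely, equations (2) and (4) can be evaluated for a small finite prior. The sketch below assumes a hypothetical prior in which L is 10, 20, 40 or 80 dollars with equal probability; the function name `posterior_lower` is illustrative.

```python
from fractions import Fraction

# Hypothetical prior P(L): the lower amount is 10, 20, 40 or 80, equally likely.
prior = {10: Fraction(1, 4), 20: Fraction(1, 4), 40: Fraction(1, 4), 80: Fraction(1, 4)}


def posterior_lower(prior, m):
    """P(C=lower | V=m), computed from equations (2) and (4)."""
    p_l_m = prior.get(m, Fraction(0))                    # P(L = m)
    p_l_half = prior.get(Fraction(m, 2), Fraction(0))    # P(L = m/2)
    p_v = Fraction(1, 2) * p_l_m + Fraction(1, 2) * p_l_half  # equation (4)
    if p_v == 0:
        raise ValueError(f"an envelope can never contain {m} under this prior")
    return Fraction(1, 2) * p_l_m / p_v                  # equation (2)


for m in (10, 20, 40, 80, 160):
    print(m, posterior_lower(prior, m))
```

Under this assumed prior the posterior is 1/2 only in the middle of the range. An observed 10 must be the lower amount and an observed 160 must be the higher one, so the 50/50 assumption fails exactly where the prior runs out.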

The result you will get in the game depends on the prior distribution for the amounts in the envelopes. For example, if the prior distribution P(L) were uniform between 0 and $30,000, and you found $40,000 in the envelope, then that $40,000 could only be the higher amount, so you would lose $20,000 by swapping. If instead the prior were uniform between $30,000 and $100,000, the $40,000 could only be the lower amount, and you would gain $40,000 by swapping.
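These two cases can be checked numerically. The sketch below replaces the continuous uniform priors with discrete ones on multiples of $10,000 (a simplifying assumption, as are the names `uniform_prior` and `expected_gain`) and evaluates the expected gain using the posterior probabilities rather than 0.5.

```python
from fractions import Fraction


def uniform_prior(values):
    """Discrete uniform prior P(L) over the given amounts (an illustrative assumption)."""
    values = list(values)
    return {v: Fraction(1, len(values)) for v in values}


def expected_gain(prior, m):
    """Expected gain from swapping after seeing m, using posterior probabilities."""
    p_l_m = prior.get(m, Fraction(0))                    # P(L = m)
    p_l_half = prior.get(Fraction(m, 2), Fraction(0))    # P(L = m/2)
    total = p_l_m + p_l_half
    if total == 0:
        raise ValueError(f"{m} cannot occur under this prior")
    p_lower = p_l_m / total      # P(C=lower | V=m); the factors of 0.5 cancel
    p_higher = p_l_half / total  # P(C=higher | V=m)
    return 2 * m * p_lower + Fraction(m, 2) * p_higher - m


small = uniform_prior(range(10_000, 30_001, 10_000))    # L in 10,000 .. 30,000
large = uniform_prior(range(30_000, 100_001, 10_000))   # L in 30,000 .. 100,000

print(expected_gain(small, 40_000))  # negative: swapping loses
print(expected_gain(large, 40_000))  # positive: swapping gains
```

With these assumed priors the same observed $40,000 leads to opposite decisions, which is the sense in which the prior affects the entire game.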

To summarize: the paradox arises because you use the prior probabilities to calculate the expected gain rather than the posterior probabilities. As we have seen, it is not possible to choose a prior distribution which results in a posterior distribution for which the original argument holds; there simply are no circumstances in which it would be valid to always use probabilities of 0.5.

The solution given follows closely that by Amos Storkey of the University of Edinburgh:

Devlin's Angle is updated at the beginning of each month.
Mathematician Keith Devlin is the Executive Director of the Center for the Study of Language and Information at Stanford University and The Math Guy on NPR's Weekend Edition. Devlin's most recent book is Sets, Functions, and Logic: an Introduction to Abstract Mathematics (Third Edition), published by Chapman and Hall in 2003.