April 2000

# The Law of Small Errors

As any author of a book or a long article will attest, no matter how carefully you proofread your work, the moment the published version lands in your hands and you self-admiringly flick through the pages, you'll find an error. Within minutes of picking the work up, your elation at seeing the results of your labors for the first time will be dashed, when you spot that glaring error that not only managed to creep in at some stage, but somehow lay unnoticed through several careful readings. Lay unnoticed until now, that is. Only when it's too late to correct does it leap out of the page. "Gotcha!" it cries gleefully.

It's the Law of Small Errors: As the number of words in a manuscript tends to infinity, the probability of a significant error making it into the final version approaches 1. As with any self-respecting infinite sequence in an undergraduate calculus class, "approaching infinity" here means "a tractable bunch" -- the 100,000 words of the average book is definitely big enough.

I've written 23 books and several dozen long research papers in my career, and the same thing has happened every time. So I suppose I should have been prepared for it to happen again, when the first copy of my new book The Maths Gene arrived on my desk earlier this week. (This is the British edition. The American translation -- The Math Gene (no "s") -- won't come out until August. Publishing is like that. Don't ask why. As the theatrical producer kept remarking in the movie Shakespeare in Love, it's a mystery.) But, experienced as I was, that first glaring slip still caught me off guard. As I was idly flicking through the book for the very first time, the following passage leapt out at me and grabbed me by the throat:

A classic example is the birthday problem, which asks how many people you need to have at a party so that there is a better-than-even chance that two of them will share the same birthday. Most people think the answer is 183, the smallest whole number larger than 365/2. In fact, you need just 23. The answer 183 is the correct answer to a very different question: How many people do you need to have at a party so that there is a better-than-even chance that one of them will share your birthday? If there is no restriction on which two people will share a birthday, it makes an enormous difference.

Now the birthday problem is one of my favorite mathematical examples. I must have written and talked about it dozens of times. I could probably give a short lecture on it in my sleep. So how come that above passage slipped into print? For it's quite wrong. Not the 23 part. Surprising though that number is, you really do need just 23 people at the party to have a better than 0.5 probability that two people will share the same birthday. But the number of people you need to have present for there to be a better-than-evens chance of someone sharing your birthday is not 183, but the much larger 254. (Yes, really, 254, including yourself.)

How did such a howler find its way into the text and go unnoticed? Before I try to answer that, here are the relevant computations to establish those two answers of 23 and 254.

First, the coincidence of two birthdays. It turns out to be easier to compute the probability that no two people at the party have the same birthday, and then subtract the answer from 1 to obtain the probability that two people will share a birthday. For simplicity, let's ignore leap years. Thus, there are 365 possible birthdays to consider.

Imagine the people entering the room one-by-one. When the second person enters the room, there are 364 possible days for her to have a birthday that differs from the first person. So the probability that she will have a different birthday from the first person is 364/365. When the third person enters, there are 363 possibilities of him having a birthday different from both of the first two, so the probability that all three will have different birthdays is 364/365 x 363/365. When the fourth person enters, the probability of all four having different birthdays is 364/365 x 363/365 x 362/365. Continuing in this way, when 23 people are in the room, the probability of all of them having different birthdays is

364/365 x 363/365 x 362/365 x . . . x 343/365.

This works out to be about 0.493. (It is when you have 23 people that the above product first drops below 0.5.) Thus, the probability that at least two of the 23 have the same birthday is 1 - 0.493 = 0.507, better than even.
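The running product is easy to check numerically. The following Python sketch is not part of the original column; the function names `prob_all_different` and `smallest_party` are illustrative, and it makes the same simplifying assumption of 365 equally likely birthdays.

```python
def prob_all_different(people: int, days: int = 365) -> float:
    """Probability that `people` guests all have different birthdays."""
    prob = 1.0
    # Each new arrival must avoid the birthdays already taken:
    # 364/365 for the second guest, 363/365 for the third, and so on.
    for taken in range(1, people):
        prob *= (days - taken) / days
    return prob


def smallest_party(days: int = 365) -> int:
    """Smallest party with a better-than-even chance of a shared birthday."""
    people = 1
    while 1.0 - prob_all_different(people, days) <= 0.5:
        people += 1
    return people


if __name__ == "__main__":
    print(smallest_party())
```

Run as a script, this prints 23: with 22 guests the product is still above 0.5, and the 23rd guest tips it below.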

Now for the problem of the birthdays different from yours. Pick any person at the party. The probability of that person having a birthday different from you is 364/365. (Again, I'm ignoring leap years, for simplicity.) Thus, if there are n people at the party besides yourself, the probability that they all have a different birthday from you is (364/365)^n. (Since we don't have to worry whether their birthdays coincide or not, we don't have to count down 364, 363, 362, etc. as we did last time.) The first value of n for which the number (364/365)^n falls below 0.5 is n = 253. Add yourself, and you get the 254 people at the party. And that's all there is to it.
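The same kind of check works here. This Python sketch, again an illustration rather than anything from the original column, finds the smallest number of other guests for which the chance that nobody shares your birthday drops below one half. The function names are hypothetical, and the logarithm shortcut is simply the inequality solved for n.

```python
import math


def prob_nobody_matches(others: int, days: int = 365) -> float:
    """Probability that none of `others` guests shares your birthday."""
    return ((days - 1) / days) ** others


def smallest_group_matching_you(days: int = 365) -> int:
    """Smallest number of other guests giving a better-than-even chance of a match."""
    others = 1
    while prob_nobody_matches(others, days) >= 0.5:
        others += 1
    return others


def closed_form(days: int = 365) -> int:
    """Same answer from solving ((days - 1) / days) ** n < 0.5 for n."""
    # Both logarithms are negative, so the ratio is positive; rounding up
    # gives the first whole n past the crossing point.
    return math.ceil(math.log(0.5) / math.log((days - 1) / days))


if __name__ == "__main__":
    others = smallest_group_matching_you()
    print(others, others + 1)  # other guests, then the party including you
```

Both routes give 253 other guests, which makes 254 people at the party once you count yourself.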

That was all pretty easy. A far more difficult question is: How did that error find its way into my book? The answer is, I don't really know. A manuscript for a trade book (i.e., a book which the publisher thinks will have general appeal) goes through several stages of editing before it goes to press. In addition to the author's own editing, the commissioning editor for the publisher usually goes through it carefully, generally making changes on each page to improve the wording, and then the copy editor goes through it looking for any remaining grammatical or stylistic errors. Everyone involved tries to shorten and simplify the text to make it as accessible as possible, cutting out any redundant prose. Having written several trade books now, and gone through this process each time, I try to preempt as many editing changes as possible by doing my own editing before sending the manuscript off to the publisher, particularly as production schedules usually mean that the author has only a few days to check over the changes suggested by the two editors. In the case of my book Mathematics: The New Golden Age, I did have to hold up publication for a week or so when I discovered that a particularly keen copy editor, unfamiliar with mathematics, had replaced all my carefully crafted statements of mathematical results with more colloquial forms that, while I agree they read much more smoothly, were logically incorrect. In the case of The Maths Gene, however, I suspect that the original error with the birthday paradox was mine, as I cut down what had started as a rather lengthy discussion of the problem to just a couple of sentences. (I was faced with reducing a 120,000 word manuscript to the 100,000 words I had contracted for with the publisher. To be honest, rough as this process might seem at times, it beats coal mining as a way to make a living.)
Looking back now, I see that an earlier version contained not only discussions of the numbers 23 and 254, but also the crucial passage:

The answer 183 is the correct answer to a very different question: How many different birthdays do you need to have represented at a party so that there is a better-than-even chance that one of them will be your birthday?

This is correct. Somehow, in the heat of the editing, by the time I got to page 271 (the very last page of the main text), either I was too liberal with my red pen or else I marked up the cuts too scrappily for the typesetter to make sense of. In any event, the error snuck in. The really annoying thing is that I passed over this error on the two or three subsequent times I checked over the manuscript. The Law of Small Errors had struck again. The only redeeming factor on this occasion is that the US edition of the book is not due out until August, so I can correct the error before it has an opportunity to confuse or mislead American readers.

Let me leave you with another probability question. This one is for anyone who sets out to write a book. What do you think is the probability that the published version will contain at least one error? I know what the answer is in my case. It's unity.

Devlin's Angle is updated at the beginning of each month.
Keith Devlin (devlin@stmarys-ca.edu) is Dean of Science at Saint Mary's College of California, in Moraga, California, and a Senior Researcher at Stanford University. His latest book, The Maths Gene: Why Everybody Has It But Most People Don't Use It (complete with error), is published in the UK this month by Weidenfeld and Nicolson. The American edition, The Math Gene: How Mathematical Ability Evolved and Why Numbers Are Like Gossip, will be published by Basic Books in August.