Truth is, I find myself sucked in as much as anyone else. I could simply ignore the polls but I don't. The reason is that they actually do tell us something. Not how the election will turn out, of course - no one can do that; rather, they tell us how our fellow citizens say (at the time of the poll) they intend to vote. To the degree that declared intentions indicate subsequent action, and barring unusual circumstances, the latter can be inferred from the former. If this were not the case, surely none of us would pay the opinion polls much, if any, attention.
The fact is, whether we like it or not, opinion polling is now a major part of our life, and has been for many decades. We take for granted the fact that by asking a tiny fraction of the population - perhaps as few as 1,000 Americans - we can obtain a fairly reliable indication of how an entire state will vote. Yet if you stop and think about it for a moment, that is a remarkable fact.
Even more remarkable, to my mind, is the very notion that we can make any prediction about a future event such as an election, or how a roll of two dice will come out, or what might happen to the stocks in our retirement fund. What's that you say? What is remarkable about that? After all, you say, no one is claiming that we can ever know for sure what tomorrow will bring. Rather what polling - and other predictive techniques - do is put numerical values on the various likelihoods of future events. What's the big deal about that? Surely, anyone with even the most basic mathematical training will accept that you can assign probabilities to future events. Right?
True - today. But that's a fairly recent state of affairs. Aristotle - who certainly was no slouch when it came to math - believed, and wrote, that one realm where mathematics could not be applied was the future. The future was unpredictable to man, known only to the gods.
And so everyone believed until 1654, when the great French mathematician Pierre de Fermat solved the problem of the Unfinished Game, a topic I touched on briefly in last month's column.
Pacioli was unable to solve this problem. So too were a number of other mathematicians (and gamblers) who tried, including Girolamo Cardano, Niccolo Tartaglia, and Lorenzo Forstani. The consensus was that the problem could not be solved.
Then, early in 1654, a gambler by the name of Antoine Gombaud, more often referred to in modern history books by his French nobleman's title of the Chevalier de Mere, put the problem to his friend, the mathematician Blaise Pascal. Pascal produced a complicated argument that can be made to work, but he was not happy with it, so at a friend's urging he wrote to Fermat about it. Fermat quickly found a simple solution.
There are two rounds left unplayed, argued Fermat. In each round, either player can win, so there are in all four different ways the game could continue to its five-round completion. The player who has won one round to the other's two must win both those final rounds in order to win the contest; in the other three possible endings, the player who is ahead after three rounds will win. Therefore, said Fermat, the player who is ahead when the game is abandoned should take 3/4 of the pot, with the other player taking 1/4.
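For readers who like to see the counting done mechanically, here is a minimal sketch of Fermat's argument in Python. It is my illustration, not anything Fermat wrote: the function name pot_shares and the letters "L" (leader wins a round) and "T" (trailer wins a round) are made up for the purpose, and the code assumes, as the argument itself does, that each of the four ways of finishing is equally likely.

```python
from fractions import Fraction
from itertools import product

# State of the abandoned game, as in Fermat's argument: the leader has
# won 2 rounds, the trailer 1, and 2 of the 5 rounds remain unplayed.
LEADER_WINS, TRAILER_WINS, ROUNDS_LEFT = 2, 1, 2


def pot_shares(leader_wins, trailer_wins, rounds_left):
    """Split the pot by counting every way the remaining rounds could go.

    Assumes each possible future is equally likely (fair rounds).
    """
    # Every possible future, e.g. ("L", "T") means the leader wins the
    # next round and the trailer wins the one after that.
    futures = list(product("LT", repeat=rounds_left))
    leader_futures = 0
    for future in futures:
        leader_total = leader_wins + future.count("L")
        trailer_total = trailer_wins + future.count("T")
        if leader_total > trailer_total:
            leader_futures += 1
    leader_share = Fraction(leader_futures, len(futures))
    return leader_share, 1 - leader_share


leader_share, trailer_share = pot_shares(LEADER_WINS, TRAILER_WINS, ROUNDS_LEFT)
print(leader_share, trailer_share)  # 3/4 1/4
```

The loop does nothing more than list the four possible futures and count the three in which the player who is ahead ends up the winner.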
To anyone who sees this solution today, it seems simple enough. (The solution assumes the tournament is thought of as a "best-of-five" rounds, as opposed to a "first-to-three". You need a slightly more complicated argument in the latter case, but the answer is the same, a 3 to 1 division of the pot.) But no one before Fermat saw it, including Cardano who did work out all of the basic rules we use today to combine probabilities. Moreover, when he did see Fermat's solution, Pascal could not accept it, and nor could various of his colleagues he showed it to. What was their problem?
Since the computation is trivial, indeed no different from the calculation of the odds in any game of chance (and actually much simpler than many), the only thing that could be holding everyone back was the fact that what Fermat was counting were "possible futures." Something that two thousand years of received wisdom said was not possible.
Once word got out about Fermat's breakthrough, however - presumably through the highly mobile network of gambling European noblemen - it did not take long for others to jump into the "future prediction" act. Within a single lifespan, modern future prediction and risk management were in place.
The first known opinion poll was in 1824, when the Harrisburg Pennsylvanian newspaper conducted a local poll that showed, incorrectly it turned out, that Andrew Jackson was leading John Quincy Adams in the presidential race. (Jackson became president next time round.)
The first national poll was in 1916, when the Literary Digest predicted - correctly - that Woodrow Wilson would be elected. Their approach was to mail out millions of postcards and count the returns. This is now recognized as a woefully unreliable method, but the magazine managed to correctly predict the following four presidential elections this way before getting it badly wrong in 1936, when it erroneously predicted that Alf Landon would beat Franklin D. Roosevelt.
The difficult part - or rather, one difficult part - of conducting a reliable poll is making sure that the sample is random. The math requires this. Using a non-random sample was a large part of the reason why the 1936 Digest poll came unstuck. That same year, George Gallup conducted a much smaller poll based on a properly representative sample and got the right answer.
Another famous case when the pollsters got it wrong was in 1948, when major polls, including Gallup, indicated that Thomas Dewey would defeat Harry S. Truman in the presidential election in a landslide victory. As we know, Truman came out on top, and there is a famous photograph of a smiling Truman holding up a first-edition Chicago newspaper that had a big headline saying "Dewey Defeats Truman."
The problem that time was that the pollsters relied on telephone interviews, and in those days only wealthier people had phones, and so the sample was heavily biased toward Dewey supporters.
Most people are surprised by how small a random sample can be and yet still yield a reliable result. If you do the math, you find that (provided the sample polled is truly random) 1,000 people will give you a prediction accurate to within a 3% margin of error. You could get the error down to 1% if you polled 10,000, but the more people you poll the more expensive it gets, of course. 1,000 is a number typically used these days.
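If you want to check those figures yourself, here is a rough sketch using the standard textbook formula for the margin of error of a simple random sample. The column does not say which confidence level lies behind the 3% figure, so the 95% level (z = 1.96) and the worst-case 50/50 split used below are my assumptions, and the function name margin_of_error is just a label for this illustration.

```python
import math


def margin_of_error(sample_size, z=1.96, proportion=0.5):
    """Approximate margin of error for a simple random sample.

    Assumptions (not stated in the text): a 95% confidence level
    (z = 1.96) and the worst case of an evenly split electorate.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


for n in (1_000, 10_000):
    print(n, f"{margin_of_error(n):.1%}")
# 1000 3.1%
# 10000 1.0%
```

Notice that the error shrinks only with the square root of the sample size, which is why cutting it from 3% to 1% takes roughly ten times as many interviews.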
In recent elections, with phones no longer restricted to the more wealthy, phone interviews seem to have done pretty well. But they do leave out people whose only phone is a mobile phone, and as more and more young people get to voting age, that could become a significant factor.
The kinds of polls you see on news organization web sites that ask people to vote on an issue are extremely unreliable, because the population sampled is self-selected. Polling results are reliable only when the sample is chosen in a truly random fashion.
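To see how badly self-selection can distort things, here is a small simulation. Everything in it is invented for illustration: an electorate of 100,000 split exactly 50/50, and response rates under which one candidate's supporters are three times as likely to bother voting in the web poll. It is a sketch of the effect, not a model of any real poll.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical electorate: exactly half support candidate A (True).
population = [i % 2 == 0 for i in range(100_000)]
true_share = sum(population) / len(population)

# Random sample: every voter is equally likely to be asked.
random_sample = rng.sample(population, 1_000)
random_estimate = sum(random_sample) / len(random_sample)

# Self-selected web poll: invented response rates, with A's supporters
# three times as likely to take part as everyone else.
RESPONSE_RATE = {True: 0.30, False: 0.10}
web_votes = [vote for vote in population if rng.random() < RESPONSE_RATE[vote]]
web_estimate = sum(web_votes) / len(web_votes)

print(f"true {true_share:.1%}  random {random_estimate:.1%}  web {web_estimate:.1%}")
```

Under these made-up numbers the random sample of 1,000 should land within a few points of the true 50%, while the web poll, despite collecting many thousands of votes, reports something close to three quarters for candidate A. More responses do not help when the wrong people are responding.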
Finally, while on the topic of applying mathematics to predicting elections, I can't pass up the opportunity to point you to what is surely the most cerebral political ad video in this year's campaign. It's titled "The Theorem". I point you to it with no intended or implied endorsement, etc. etc.
Mathematician Keith Devlin is the Executive Director of the Human-Sciences and Technologies Advanced Research Institute (H-STAR) at Stanford University and The Math Guy on NPR's Weekend Edition.