# Ivars Peterson's MathTrek

December 18, 2006

### Rankings, Tournaments, and Playoffs

So many teams and so little room at the top. Which team becomes the national champion in U.S. college football rests on rankings, which reflect the opinions of poll participants (and, nowadays, also computer ratings).

This year, Ohio State plays for the national title in the championship bowl game. And its opponent will be Florida rather than Michigan because the "experts," in their voting, judged that Florida would be the more worthy opponent.

This outcome hasn't pleased everyone, and, as happens nearly every year, many have criticized the vagaries of the ranking system for allowing apparently flawed or unfair outcomes.

Similar problems arise in other sports that use elaborate rating schemes to rank teams or players, whether the question is who deserves a national or year-end championship or how competitors ought to be seeded for a tournament.

In a paper published in a recent issue of SIAM Review, Paul K. Newton and Kamran Aslam of the University of Southern California argue against the widespread belief that it is possible, with just the right tweaking, to come up with a ranking system that yields reasonable results and eliminates logical inconsistencies—and, hence, settles all arguments, leaving everyone satisfied.

"The philosophy behind these systems is that there should be a player or team that 'deserves' to be recognized as 'the best,' and if only the correct method were found, such a team could be unambiguously chosen," Newton and Aslam write.

But it's impossible to make such a guarantee. The argument hinges on an application of a theorem proved by economist Kenneth Arrow. He showed that, when three or more candidates are involved, no method for constructing social preferences (rankings) from arbitrary individual ones (votes) can satisfy a certain set of reasonable assumptions all at once.

One such assumption is that, when team A is ranked higher than team B, and team B is ranked higher than team C, then team A is ranked higher than team C. This seems like a reasonable requirement.

But voting schemes can readily undermine this desirable result, even when just three teams or players are involved.

Newton and Aslam cite the example of the selection of the top men's tennis player in 2002. That year, Pete Sampras won the U.S. Open, Andre Agassi won the Australian Open, and Lleyton Hewitt won Wimbledon. Suppose that three judges voted for player of the year as shown in the table below.

| Judge | Sampras | Agassi | Hewitt |
| --- | --- | --- | --- |
| American | 1 | 2 | 3 |
| British | 2 | 3 | 1 |
| Australian | 3 | 1 | 2 |

Note that two out of the three judges ranked Sampras ahead of Agassi, two out of three put Agassi ahead of Hewitt, and two of three put Hewitt ahead of Sampras!
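The cycle can be checked mechanically. The following is a minimal sketch that tallies the pairwise majorities from the table of hypothetical votes above; the function names `majority_prefers` and `find_cycle` are illustrative and do not come from Newton and Aslam's paper.

```python
from itertools import permutations

# Rankings from the table of judges: 1 = first choice, 3 = last choice.
ballots = {
    "American":   {"Sampras": 1, "Agassi": 2, "Hewitt": 3},
    "British":    {"Sampras": 2, "Agassi": 3, "Hewitt": 1},
    "Australian": {"Sampras": 3, "Agassi": 1, "Hewitt": 2},
}
players = ["Sampras", "Agassi", "Hewitt"]


def majority_prefers(a, b, ballots):
    """Return True if more than half of the judges rank a ahead of b."""
    votes_for_a = sum(1 for ranks in ballots.values() if ranks[a] < ranks[b])
    return votes_for_a > len(ballots) / 2


def find_cycle(players, ballots):
    """Return the first triple (a, b, c) where a beats b, b beats c, and
    c beats a by majority vote, or None if no such triple exists."""
    for a, b, c in permutations(players, 3):
        if (majority_prefers(a, b, ballots)
                and majority_prefers(b, c, ballots)
                and majority_prefers(c, a, ballots)):
            return (a, b, c)
    return None


# Every ordered pair in which the first player wins the majority vote.
majority_wins = [
    (a, b) for a, b in permutations(players, 2)
    if majority_prefers(a, b, ballots)
]

cycle = find_cycle(players, ballots)
```

With these ballots, `cycle` comes back as a triple rather than `None`, so the majority preference is not transitive and no consistent overall ranking of the three players can be read off the votes.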

"Such outcome-based methods based on voting, as the one used to crown the NCAA national football champion, very often produce logical inconsistencies that are the basis for arguments that cannot be settled rationally," Newton and Aslam contend.

They add, "The amount of energy and effort spent on arguing over rankings of all types (particularly in college football, where few games are played compared to the total number of teams involved, but also regarding the notorious U.S. News and World Report Annual Ranking of Colleges) is an indication of the pervasiveness of Arrow's theorem."

Newton and Aslam favor a different, probabilistic approach for ranking teams or players, which makes predictions about outcomes in head-to-head competitions. To rank tennis players, for instance, the idea is to run thousands of simulated tournaments with players randomly ordered in fictitious draws before a real tournament begins and to use the accumulated statistical winning distributions as the basis for seeding the actual tournament. The player most likely to win the tournament based on the simulations would become the top seed in the real draw, and so on.
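To make the idea concrete, here is a rough sketch of simulation-based seeding. It is not the model from the paper: it assumes each player has a single made-up "strength" number and that the chance of winning a match is that player's share of the two strengths combined. The player names, the strength values, and the functions `play_match`, `simulate_tournament`, and `seed_by_simulation` are all hypothetical.

```python
import random
from collections import Counter


def play_match(a, b, strength, rng):
    """Pick a winner; a wins with probability s_a / (s_a + s_b) (an assumption)."""
    p_a = strength[a] / (strength[a] + strength[b])
    return a if rng.random() < p_a else b


def simulate_tournament(players, strength, rng):
    """Play one single-elimination tournament with a randomly ordered draw."""
    n = len(players)
    if n < 2 or n & (n - 1):
        raise ValueError("number of players must be a power of two")
    draw = list(players)
    rng.shuffle(draw)  # fictitious random draw
    while len(draw) > 1:
        # Adjacent players in the draw meet; winners advance to the next round.
        draw = [play_match(draw[i], draw[i + 1], strength, rng)
                for i in range(0, len(draw), 2)]
    return draw[0]


def seed_by_simulation(strength, n_tournaments=10000, seed=0):
    """Order players by how often they win the simulated tournaments."""
    rng = random.Random(seed)
    players = list(strength)
    titles = Counter({p: 0 for p in players})
    for _ in range(n_tournaments):
        titles[simulate_tournament(players, strength, rng)] += 1
    seeding = sorted(players, key=lambda p: titles[p], reverse=True)
    return seeding, titles


# Hypothetical eight-player field with invented strengths.
example_strength = {
    "P1": 9.0, "P2": 7.5, "P3": 6.0, "P4": 5.0,
    "P5": 4.0, "P6": 3.0, "P7": 2.0, "P8": 1.0,
}
example_seeding, example_titles = seed_by_simulation(example_strength, 5000)
```

The first entry of `example_seeding` would be the top seed, the second entry the second seed, and so on. Because the seeding comes from win frequencies rather than from votes, an upset in any single simulated match simply shows up as part of the distribution.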

The researchers describe how to do such Monte Carlo simulations for tennis. They argue that, with enough computer power, such simulations could be run for college football as a way to rank the teams.

Although Newton and Aslam don't specifically call for a playoff system, it's easy to see that the top eight teams, as determined by such simulations at the end of the season, could then vie for the national championship in a set of playoff games.

"Moving toward systems that are probabilistic and predictive would finesse the inherent inconsistencies guaranteed by Arrow's impossibility theorem," Newton and Aslam conclude, "and would better reflect the reality that a victory of a lower-ranked player over a high-ranked one in a single match is not necessarily an inconsistent outcome."

References:

Klarreich, E. 2002. Election selection. Science News 162(Nov. 2):280-282. Available at http://www.sciencenews.org/articles/20021102/bob8.asp.

Newton, P.K., and K. Aslam. 2006. Monte Carlo tennis. SIAM Review 48(No. 4):722-742. Abstract available at http://dx.doi.org/10.1137/050640278.

Peterson, I. 2005. Ranking college football teams. MAA Online (Nov. 14).

______. 2005. Winning at tennis. MAA Online (June 13).

______. 2004. College football, rankings, and wandering monkeys. MAA Online (Sept. 8).

______. 2003. Election reversals. MAA Online (Oct. 20).

______. 1998. Who's really no. 1? MAA Online (Dec. 14).

______. 1998. How to fix an election. MAA Online (Nov. 2).

______. 1997. Getting slammed in tennis. MAA Online (Dec. 1).

Stefani, R.T. 1997. Survey of the major world sports rating systems. Journal of Applied Statistics 24(December):635-646. Abstract available at http://dx.doi.org/10.1080/02664769723387.