Ivars Peterson's MathTrek

December 14, 1998

Who's Really No. 1?

It happens every fall. Fierce arguments erupt over which U.S. college football team is the best in the nation. As the season progresses, this frenzy of head scratching and navel gazing mounts until the climactic bowl games at the end of the year (more or less) settle the issue.

This year, the controversy featured a mathematical element, in the form of a complicated formula for determining the two teams that would play for the national championship. Based on various poll results and other factors, the formula represented an effort to avoid situations such as last year's split decision, when Michigan topped one of the two main national polls and Nebraska topped the other.

Those two polls have long been objects of criticism. The Associated Press set of rankings represents the views of a select group of sportswriters and commentators; the USA Today/ESPN poll represents the opinions of college football coaches. Allegations of bias, cronyism, and petty politicking have often tainted the results.

Computer-based, formula-driven methods have the appeal of a clean mathematical answer to a messy human situation. Indeed, the idea of using mathematics to come up with an objective set of rankings isn't new.

Similar thoughts had occurred to mathematician James P. Keener of the University of Utah in Salt Lake City after the 1984 college football season. That was the year when sportswriters and coaches voted Utah rival Brigham Young University (BYU) the national title on the strength of its undefeated season. The victories, however, had come against generally undistinguished opponents.

Perturbed by the result, Keener set out to see whether a mathematical scheme, which automatically takes into account the strength of a team's opponents, would provide a more satisfactory answer.

Keener turned to a ranking scheme based on a relatively obscure mathematical result known as the Perron-Frobenius theorem. The theorem furnishes a recipe for evading an apparent chicken-and-egg situation--the difficulty of determining strength ratings without knowing how teams rank and ranking teams without knowing their relative strengths.
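One way to picture how the theorem breaks the circle: collect the credit each team earned against each opponent in a table of nonnegative numbers, then look for ratings in which every team's rating is proportional to the credit it earned, weighted by the ratings of the teams it earned that credit against. When the schedule links all the teams together, the Perron-Frobenius theorem guarantees that such a set of positive ratings exists and is unique up to scaling. The sketch below finds it by repeated averaging (power iteration). The function name, the input layout, and the three-team results are made up for illustration; this is not Keener's actual code or data.

```python
def perron_ranking(credit, iterations=200):
    """Approximate the positive eigenvector of a nonnegative square matrix.

    credit[i][j] is the credit team i earned against team j (a hypothetical
    input layout). Returns ratings normalized to sum to 1.
    """
    n = len(credit)
    ratings = [1.0 / n] * n  # start by treating every team as equal
    for _ in range(iterations):
        # A team's new rating is the credit it earned, weighted by the
        # current rating of each opponent it earned that credit against.
        updated = [
            sum(credit[i][j] * ratings[j] for j in range(n)) for i in range(n)
        ]
        total = sum(updated)
        ratings = [value / total for value in updated]
    return ratings


# Made-up results for three teams: entry [i][j] is 1 if team i beat team j.
# Team 0 beat teams 1 and 2, team 1 beat team 2, and team 2 beat team 0.
credit = [
    [0, 1, 1],
    [0, 0, 1],
    [1, 0, 0],
]
ratings = perron_ranking(credit)
order = sorted(range(len(ratings)), key=lambda i: ratings[i], reverse=True)
```

In this toy example, team 2 ends up rated above team 1 even though both have one win, because team 2's lone victory came against the strongest team. That is the sense in which the method accounts for strength of opposition automatically.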

In the simplest possible scheme, you can assign a single point for a win, half a point for a tie, and zero for a loss, and calculate rankings on this basis. College football, however, poses a problem because it involves a large number of teams, which individually play relatively few games. Moreover, all the credit goes just to the winner, whether the score is close or lopsided.
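A minimal sketch of that simplest scheme shows the second drawback directly. The tuple layout for a game and the three results are invented for the example.

```python
from collections import defaultdict


def simple_points(games):
    """Tally 1 point per win, 0.5 per tie, and 0 per loss.

    games is a list of (team_a, score_a, team_b, score_b) tuples,
    a hypothetical input format chosen for this sketch.
    """
    points = defaultdict(float)
    for team_a, score_a, team_b, score_b in games:
        if score_a > score_b:
            credit_a, credit_b = 1.0, 0.0
        elif score_a < score_b:
            credit_a, credit_b = 0.0, 1.0
        else:
            credit_a, credit_b = 0.5, 0.5
        points[team_a] += credit_a
        points[team_b] += credit_b
    return dict(points)


# Made-up results: a one-point squeaker, a blowout, and a tie.
games = [
    ("A", 21, "B", 20),
    ("A", 56, "C", 0),
    ("B", 14, "C", 14),
]
points = simple_points(games)
```

Team A collects the same single point for its 21-20 win as for its 56-0 rout, so the tally carries no information about how decisive each game was.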

Keener chose to allocate the value per game between the two competing teams on the basis of the game score, and he explored various ways of making this distribution. Each method he looked at showed certain biases, rewarding or penalizing close defensive contests, wild shoot-outs, blowouts, and other types of game outcomes in different ways.
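To see how such biases creep in, consider two simple ways of splitting one game's worth of credit according to the score. Both formulas below are illustrative assumptions, offered as plausible candidates rather than as the specific distributions Keener settled on, and the function names are hypothetical.

```python
def proportional_share(points_for, points_against):
    """Share of one game's credit in proportion to points scored."""
    total = points_for + points_against
    if total == 0:
        return 0.5  # a scoreless tie splits the credit evenly
    return points_for / total


def smoothed_share(points_for, points_against):
    """Same idea, with one extra point added to each side so that a
    low-scoring shutout is not treated as total domination."""
    return (points_for + 1) / (points_for + points_against + 2)


# A 3-0 defensive struggle versus a 56-0 rout (made-up scores).
close_shutout = (proportional_share(3, 0), smoothed_share(3, 0))
blowout = (proportional_share(56, 0), smoothed_share(56, 0))
```

The proportional split hands the winner all of the credit in both games, so it rewards a 3-0 defensive contest exactly as much as a 56-0 rout. The smoothed split gives the 3-0 winner four-fifths of the credit and the 56-0 winner nearly all of it, which tempers shutouts but in turn changes how shoot-outs and close games compare.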

Interestingly, when Keener took a mathematical look at BYU's 1984 undefeated season, the team finished out of the top 10. Indeed, no objective ranking scheme of all the ones that Keener tried made BYU number one or even number two.

Nowadays, a number of newspapers, including the New York Times and Seattle Times, publish team rankings and ratings based on proprietary mathematical formulas. These ratings are intended to provide a numerical measure of a team's relative strength, and they have some value in predicting by how much one team would beat another if they played against each other at a neutral site.

The fine print accompanying these newspaper charts notes that the analyses are based on each team's scores, with a weighted emphasis on who won, by what margin, and against what quality of opposition. The New York Times computer model additionally collapses runaway scores, takes note of home-field advantage, and counts recent games more heavily than earlier games. The rankings compiled by Jeff Sagarin incorporate similar adjustments.
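The formulas themselves are proprietary, so the following is only a guess at what adjustments of this kind might look like. The cap on the margin, the size of the home-field edge, and the weekly decay factor are all hypothetical placeholders, not values taken from any published ranking.

```python
def adjusted_margin(home_score, away_score, weeks_ago,
                    cap=21, home_edge=3.0, decay=0.9):
    """Return (margin, weight) for one game from the home team's viewpoint.

    cap, home_edge, and decay are hypothetical values chosen for this sketch.
    """
    # Discount the advantage of playing at home.
    margin = home_score - away_score - home_edge
    # Collapse runaway scores: margins beyond the cap earn no extra credit.
    margin = max(-cap, min(cap, margin))
    # Count recent games more heavily than earlier ones.
    weight = decay ** weeks_ago
    return margin, weight


# A made-up 52-7 home win played two weeks ago.
margin, weight = adjusted_margin(52, 7, weeks_ago=2)
```

Here the 45-point victory shrinks to 42 after the home-field discount and is then capped at 21, and the game counts about four-fifths as much as one played this week. Each of the three settings is a human judgment call, which is one route by which bias enters a supposedly objective scheme.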

These computer-generated rankings inevitably show biases, reflecting human judgment about which criteria are the best predictors of future success. Anomalies persist.

In spite of these seemingly unavoidable complications, mathematical, computer-based schemes do remove some--though not all--of the subjectivity that plagues the national college polls. Once the method is selected, human emotion no longer comes into play from one week to the next in determining who's up or down in the rankings.

Despite the seeming objectivity of the computer-generated rankings, however, many people can't quite bring themselves to trust them. Fans find apparently counterintuitive results hard to swallow. Besides, the computer polls don't even agree among themselves.

In the end, when it comes down to determining the merit of various methods for ranking teams, the value of any particular system resides in the mind of the beholder. There is no unique, optimal way to devise an objective ranking scheme.

Moreover, there's always the temptation to fiddle with the mathematics to obtain a result more in line with human thinking, in effect enshrining preconceived notions of how things should turn out. Every scheme has particular strengths and weaknesses. Invariably, when a method is tweaked to get rid of one undesirable feature, another counterintuitive result pops up.

The new formula devised by officials of the Bowl Championship Series (BCS) to pick the two contenders for number one represents an awkward compromise--a witches' brew of the two national polls, three computer rankings, and various adjustments. It would be interesting to run computer simulations to see what sorts of anomalies could arise when the system is used.

In its inaugural year, the BCS formula did its designated job in selecting undefeated Tennessee and once-defeated Florida State for the championship game--though Ohio State fans might disagree. What would have happened if three top-rated teams, instead of just one, had actually ended their seasons undefeated isn't clear.

In this business, there's always something to complain about. That's part of the fun.

Copyright 1998 by Ivars Peterson

References:

1998. A Fiesta formula. Washington Post (Dec. 7).

1998. Bowl Championship Series standings formula revealed.

Drape, J. 1998. Ultimate bowl arrives; so does call for a playoff. New York Times (Dec. 7). (Available at http://www.nytimes.com/library/sports/ncaafootball/120798fbc-rankings.html.)

Gildea, W. 1998. Ranking the teams is not as easy as π. Washington Post (Dec. 5).

Harville, D. 1977. The use of linear-model methodology to rate high school or college football teams. Journal of the American Statistical Association 72(June):278.

Keener, J.P. 1993. The Perron-Frobenius theorem and the ranking of football teams. SIAM Review 35(March):80.

Keller, J.P. 1994. A characterization of the Poisson distribution and the probability of winning a game. American Statistician 48(November):294.

Lopresti, M. 1998. Forget Alamo: K-State deserved better. USA Today (Dec. 8). (Available at http://www.usatoday.com/sports/football/sfc/sfcfs80.htm.)

Minton, R. 1992. A mathematical rating system. UMAP Journal 13(No. 4):313.

Pells, E. 1998. Old school football left behind with new bowl formula.

Peterson, I. 1993. Who's really #1? Science News 144(Dec. 18&25):412.

Stern, H. 1991. On the probability of winning a football game. American Statistician 45(August):179.

Vecsey, G. 1998. Little room is left for whining. New York Times (Dec. 7). (Available at http://www.nytimes.com/library/sports/ncaafootball/120798fbc-vecsey-column.html.)

Information about the New York Times computer rankings can be found at http://www.nytimes.com/library/sports/ncaafootball/fbc-computer-rankings.html.


Comments are welcome. Please send messages to Ivars Peterson at ipeterson@maa.org.