Years ago, when I was a student at MechMat (Department of Mechanics and Mathematics) at the Moscow State University, my brother-in-law used to pester me with questions like who, in my opinion, was a better mathematician, Andrey Nikolaevich Kolmogorov or Pavel Sergeevich Alexandrov. Although I would usually have an idea who they might be, I never saw a point in pondering such questions, let alone answering them. I could not fathom why anybody would be interested in putting a worthless tag on a couple of great mathematicians. Times change: if my brother-in-law were alive today, he would be able to easily satisfy his curiosity without bothering anybody personally.
Steven Skiena and Charles Ward — the two authors of the book Who’s Bigger? — have set up a web site whoisbigger.com that has a tool (whose rationale and workings are described in the book) where the user can enter a person’s name and receive a few calculated ranking attributes for that person. E.g., this is what one would have if asking about “Pavel Alexandrov” and then “Andrey Kolmogorov”:
This tells us that, according to Skiena and Ward, Kolmogorov beats Alexandrov (the smaller the number the higher the ranking) on all the indicators they employ. All emotions aside (which is a convenience), this is probably right. Alexandrov made major contributions to topology; Kolmogorov is a founder of modern probability, with fundamental contributions to complexity, algorithmic information theory, turbulence, logic, and also topology.
Cool, unemotional numbers. Where do they come from?
For their main source of data, the authors chose the English version of Wikipedia on October 11, 2010. Seen as a network of linked pages, Wikipedia is a fertile ground for applying the PageRank algorithm, the foundation of the Google search engine. The PageRank algorithm was applied to Wikipedia as a whole and, the second time, only to the pages devoted to people. A combination of the two ranks was defined as gravitas — the concept that, according to Skiena and Ward, captures the degree of name recognition based on the subject’s accomplishments. They designated a combination of the page length, the number of edits, and the number of hits for the subject’s page as celebrity (indicator). The two combine into the fame coefficient.
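To make the idea of ranking linked pages concrete, here is a minimal sketch of PageRank computed by power iteration. It is not Skiena and Ward's code: the toy link graph, the damping factor of 0.85, the iteration count, and the function name `pagerank` are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank on a dict {page: [pages it links to]}.

    The damping factor and iteration count are illustrative defaults,
    not values taken from the book.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page first receives the same "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                for t in pages:
                    new_rank[t] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical toy link graph; these are not actual Wikipedia links.
toy_links = {
    "Kolmogorov": ["Probability", "Alexandrov"],
    "Alexandrov": ["Topology", "Kolmogorov"],
    "Probability": ["Kolmogorov"],
    "Topology": ["Alexandrov", "Kolmogorov"],
}

ranks = pagerank(toy_links)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page:12s} {score:.3f}")
```

In this toy graph the page that collects the most incoming links ends up with the highest score, which is the intuition behind using link structure as a proxy for recognition. Running the same procedure on a graph restricted to pages about people would correspond, loosely, to the second of the two rankings described above.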
Realizing that fame is fickle, subject to temporal fluctuations and bounds, Skiena and Ward tweak their numbers to balance the significant advantage that present-day figures have over the departed ones. This is done with the help of the Google Ngram project. By October 2010, Google had digitized 15 million books and made them available for research; in particular, they make it possible to observe historical trends. Google’s Ngram graphs report annual frequencies of short input phrases that occur in their scanned book library. Here’s one example that shows transportation trends over a two-hundred-year period:
The graphs for “plane” and “aircraft”, although clearly correlated after 1910, differ over earlier years due to the word “plane” having multiple meanings, a fact that points to the difficulties inherent in linguistic searches.
The normalized indicators, tempered with information culled by the Ngram engine, are collected into a single value, the Significance, which is the ultimate basis for Skiena and Ward’s ranking. (They found that the sixth indicator, News Frequency, garnered from browsing more than 500 news sources, had little effect on their calculations.)
The authors make a convincing case for the chosen nomenclature, obviously selected by association. Of course, as in mathematics, the terms adopted from plain language do not strictly reflect their daily usage, which in most cases is rather ambiguous anyway.
Skiena and Ward compare their ranking with a multitude of others, find correlations and in many cases see improvements. I absolutely agree with their verdict in Chapter 3 that the 250 historic figures highlighted in the fifth-grade Social Studies textbook could have been chosen more judiciously by using this ranking system than they were by the team of experts from Pearson.
The book consists of three parts. In the first seven chapters, the authors explain the construction of their algorithms and deploy them “to study several ‘Big Picture’ questions in history and popular culture.” The next eight chapters “present rankings of people within specific spheres of influence, for example, world leaders, athletes, artists, or outlaws.” Here’s a typical page from the second part of the book:
A ranking of the 10 people who most influenced Protestant Christianity. The first column gives their Significance, as defined by the authors; the C/G column depicts the relative values of the Celebrity and Gravitas indicators; that is followed by a short (45 characters at most) blurb presenting the person. In the text, there is a short introduction and more extensive details for the top people in the list. As the authors put it, “the 1000 names we drop in the book make it a crash course in historical Who’s Who.” This also makes it a great coffee table book and an excellent conversation starter.
The third part consists of three Appendices that add more technical information, describe the data sources in greater detail, and provide a biographical dictionary for the top-ranking individuals.
Skiena and Ward draw a line at what they could possibly be able to achieve (p. 11):
We are not historians, and this is not a history book. History is the study of past events, with the emphasis on what happened and why. We have nothing to say in this book that will help historians reconstruct past events, or understand the driving forces that made them unfold.
And further (p. 41):
The value of our significance ranking is in quantifying phenomena, providing evidence that, despite all its flaws and biases, correlates well with a lot of external references and standards.
Their declared goals appear modest (p. 7):
We don’t expect you will agree with everyone chosen for the top 100, or exactly where they are placed. But we trust you will agree that most selections are reasonable: a mix of famous people including major pillars of Western civilization.
They are also well aware of possible objections to their line of thinking (p. 40):
First, we have not defined what we mean by significance: it is just some quantity that falls out of a formula. Second, the English-language Wikipedia is inherently culturally biased: favoring American figures above those of the rest of the world. The Wikipedia authors did not leave their prejudices at the door, so any results concerning minorities and gender reflect attitudes as well as accomplishments. Finally, meme strength is in no way synonymous with achievement or importance.
To sum up, I admit to being prejudiced against the book (truly, how do you compare the significance of, say, Winston Churchill and Picasso, and what for?). I also found some rankings I disagree with: for example, Rabbi Yehudah haNasi, who is universally credited with editing the Mishna, the foundation of the Talmud and modern rabbinic Judaism, does not appear in the list (p. 256) of the most significant Jewish leaders. On the other hand, I tremendously enjoyed the short blurb that introduced King David: Biblical King of Israel (Jerusalem), which beautifully epitomizes a long stretch of Jewish history. So, without passing judgment on its academic merits (I think it’s a work in progress worth pursuing), I confess to simply liking the book. I still do not care about the great order of things; nonetheless, I very much appreciate the huge amount of fascinating detail that the book makes available at one’s fingertips, and the orderly manner in which it does that.
Alex Bogomolny received his MS in Mathematics from Moscow State University, located on the (renamed) Sparrow Hills, and his PhD in Applied Mathematics from Hebrew University, with a (rebuilt) campus on Mount Scopus (Jerusalem, Israel), where in between he worked as a night guard. During all that time he worked as a mathematician and a programmer, and he presently maintains a popular Interactive Mathematics Miscellany and Puzzles web site.