You are here

A Course on the Web Graph

Anthony Bonato
American Mathematical Society
Publication Date: 
Number of Pages: 
Graduate Studies in Mathematics 89
[Reviewed by
William J. Satzer
, on

This book developed from lecture notes for an Atlantic Association for Research in the Mathematical Sciences summer school graduate course called Massive Networks and Internet Mathematics. The web graph W, a model of the internet, has vertices representing web pages and edges representing the links between pages. It is massive and evolving, sparse and self-organizing, with both small world and power law properties. In addition, it is curiously ephemeral since vertices and edges appear and disappear over time.

Over the last fifteen years the web graph has been the object of intense experimental and theoretical work. In broad terms, the consensus of that work is that the web graph is a highly interesting object in itself, that there are rigorous mathematical models of its properties, and that one can exploit the graph structure of the web to search it for information.

A Course on the Web Graph is directed at a broad technical audience. It is appropriate for graduate students in mathematics, computer science, engineering or physics. Prerequisites are some elementary graph theory, linear algebra and probability. The text would also be of value to professional mathematicians, scientists, and engineers interested in the web graph and related graph theory. Although there is some general discussion of the internet and its technology, the focus is clearly on the mathematics of the web graph.

The first five of the book’s seven chapters could form the core of a one semester course. The first chapter reviews background material and establishes notation for basic graph theory and discrete probability. Chapter 2 introduces the web graph and describes its basic properties. The author suggests that one way to look at the web is to imagine it as a bow-tie structure. The knot of the bow-tie consists of the strongly-connected component, with one side of the bow-tie composed of pages that link to the core, and the opposite side pages that have links from the core. (It would be a strange looking bow-tie, since recent estimates place more than two-thirds of all web pages in the core.) A more recent view of the web is to see it as an infinite graph; this perspective is motivated by the increasing number of dynamically changing web pages.

Chapter 3 introduces the idea of a random graph and describes properties of the classical space of random graphs G(n, p) with n vertices and probability p that any two vertices are connected. Since random graph theory provides the basis for much of internet mathematics, the author devotes lots of attention to techniques for analysis and characterization. The next chapter looks at stochastic models for the web graph. Here the author introduces preferential attachment models and provides a detailed rigorous analysis of one such model. Chapter 5 is devoted to searching. The author discusses web ranking algorithms including Google’s PageRank and a few of the leading competitive algorithms. All the necessary machinery — including Markov chains and the Perron-Frobenius theorem — is included.

The final two chapters take up some advanced subjects. Chapter 6 deals with infinite graphs, including infinite random graphs. (Throughout the rest of the book only finite graphs are considered). This chapter is more abstract in content than the preceding ones; infinite graph theory has a distinctly different feel than finite graph theory. (It is a little strange to see the Axiom of Choice invoked in the context of the internet.) In addition, the applications of infinite graph theory to the internet are relatively recent and not yet mature. The final chapter briefly considers three additional advanced topics: spectral methods in graph theory, modeling viruses on the web, and dominating sets on the web graph.

The text is largely self-contained and would be suitable for self-study. This is a solid effort, well organized and clearly written. The level of presentation is consistent with a first year graduate course. There are a modest number of exercises (about a hundred overall). The bibliography is particularly strong.

Bill Satzer ([email protected]) is a senior intellectual property scientist at 3M Company, having previously been a lab manager at 3M for composites and electromagnetic materials. His training is in dynamical systems and particularly celestial mechanics; his current interests are broadly in applied mathematics and the teaching of mathematics.

The table of contents is not available.