Fundamental biology is hard, and in many ways it seems to be getting even harder. Following the discovery of the double helix structure of DNA in 1953, a “central dogma” developed in molecular biology asserting that each gene was associated with a single protein and thereby had a single function. Things have turned out to be both richer and far more complex. Gene regulation may prove to be at least as important as the genes themselves in determining, for example, the differences between species. Gene regulation in chimps and humans appears to be significantly different, although our basic gene repertoire is nearly identical.
Biologists’ view of evolution has also changed considerably over the last several decades, motivated at least in part by these new discoveries in molecular biology. The original tidy evolutionary tree envisioned by Darwin is now seen as a gross simplification. A network is often a better model because a single species may have multiple ancestor species corresponding to its different parts. For example, a species may have one ancestor for its nuclear genome and another for its mitochondria.
Evolution also acts on multiple scales, from DNA sequences and proteins up to populations and species. This book, Reconstructing Evolution: New Mathematical and Computational Advances, focuses on phylogenetics. Simply put, phylogenetics is the study of evolutionary relationships among groups of organisms such as species or populations. To “reconstruct evolution” is to identify phylogenetic trees or networks that might help explain the history of evolutionary processes. Since evolution occurred only once and virtually all the events are in the far distant past, we have few direct observations and little in the way of experimental results. Fortunately, mathematical models give us one good way to approach the study of our phylogenetic past.
This book originated from the Mathematics of Evolution & Phylogenetics meeting at the Institut Henri Poincaré in 2005. It is divided into ten chapters that the editors group into five main topics: evolution within populations, models of sequence evolution, biodiversity (speciation and extinction), construction of phylogenetic trees from partial information, and network models. The mathematics employed here covers a broad range, but stochastic processes, especially Markov processes, play a big role. Markov chain Monte Carlo methods are used extensively for importance sampling of difficult distributions. Combinatorics and graph theory also get a lot of play. Algebraic geometry and computational algebra are surprisingly relevant to comparisons of segments of DNA sequences. The key here is some sophisticated pattern matching guided by statistics and aided by computational algebra.
Unquestionably, this is a text directed toward experts. The reader will need some background in biology to make sense of any of the papers here. None of the chapters is primarily expository. At least a passing acquaintance with the basic concepts of molecular biology (and usually more) is necessary. This can be frustrating for someone trying to learn about the field. I’m not aware of any text in the area of phylogenetics appropriate for a mathematician wanting to learn the biology. One place to start for an interested reader without a biology background is a text such as The World of the Cell by W.M. Becker et al. A book that focuses on dynamical processes in evolution is Evolutionary Dynamics by Martin Nowak. It is accessible at the advanced undergraduate level and demands far less background in biology. It does, however, focus on the game-theoretic aspects of evolution and says little about phylogenetics.