G. Fertin, A. Labarre, I. Rusu, É. Tannier, and S. Vialette
Publisher: MIT Press (2009)
Details: 288 pages, Hardcover
Price: $40.00
ISBN: 9780262062824
Category: Monograph
Topics: Combinatorial Optimization, Combinatorics, Mathematical Biology
[Reviewed by John D. Cook, on 07/14/2009]
One of the first questions one may have about Combinatorics of Genome Rearrangements by Guillaume Fertin et al is how to classify the book. To what extent is it a mathematics book or a biology book? What background is necessary to read it? Combinatorics is concerned with problems motivated by biology and uses vocabulary derived from biology, but otherwise is not about biology per se. The content of Combinatorics is primarily mathematics and computer science — combinatorial analysis and the efficiency of algorithms. The first chapter contains a quick introduction to the biology necessary to read the book and the mathematical prerequisites are minimal.
Although the book is largely self-contained, it can be difficult to follow. The presentation is condensed; the book reads more like an outline or a reference than a textbook. Combinatorics is long on definitions but short on examples.
One of the primary goals of applying combinatorial reasoning to genetics is to quantify the degree of similarity between two genetic sequences in a biologically meaningful way. One way to compare two sequences is to place the sequences side by side and simply count the number of locations in which the sequences differ. For example, consider the following two sequences.
ACATTAGTCAGTCGCGCTCGCTGC
ACACGCTCGCGCTGACTGATTTGC
The sequences differ in 15 positions, as indicated by underlined characters. Said another way, fifteen changes would be necessary to transform one sequence into the other. But when viewed from a different perspective, the sequences only differ in one way: the middle segments is reversed.
ACGTAGTCAGTCGCGCTCGTGC
ACTGCTCGCGCTGACTGATTGC
Because genetic sequences are sometimes reversed during reproduction, the difference between the two sequences could plausibly be explained by a single reversal. We'd like to define a distance between sequences that measures the number of biologically plausible "moves" necessary to transform one into another. Basing distances on a set of biologically plausible moves gives genetic combinatorics a character distinct from combinatorics in general.
Biologically-inspired metrics may be difficult to compute; depending on the set of "moves" being modeled, it may be difficult to determine how many moves are required. This is why the book is concerned with the efficiency of algorithms. The book discusses numerous algorithms whose analysis is a matter of current research. Other algorithms in the book have been studied thoroughly but are known to be "hard" in a well-defined sense (i.e. NP-complete).
Combinatorics is inspired by applications to genetics but is not limited to such applications. For example, strings are not limited to the four characters representing the four possible base pairs of DNA. Theorems are stated and proved more generally. One could imagine results from combinatorial genetics being applicable to analogous situations outside of biology. Mathematical genetics has drawn extensively from mathematics and computer science; it is interesting to see the genetics returning the favor by contributing back to these other disciplines.
John D. Cook is a research statistician at M. D. Anderson Cancer Center and blogs at The Endeavour.