
On the Trail to Shakespeare's Identity

January 25, 2010


In a recent article in Science, physicists claim to have found a "systematic text-length dependence of the power-law index γ of a single book."

In other words, the researchers say it may be possible to differentiate among the writing styles of Herman Melville, D. H. Lawrence, and Thomas Hardy. Each writer, they claim, has a distinctive curve relating the number of different words in a writing sample to the total number of words in that sample.

The estimated γ values, they say, are consistent with a monotonic decrease from 2 to 1 with increasing text length. Having examined a connection to an extended Heaps' law, they proposed that the infinite-book limit is given by γ=1 instead of the value γ=2 expected if Zipf's law is universally applicable.
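For reference, the quantities in this paragraph can be written out in standard notation. This is only a sketch: the symbols k, r, s, M, N, and α are assumptions introduced here and do not appear in the article.

```latex
% Word-frequency distribution: share of distinct words that occur k times
P(k) \propto k^{-\gamma}

% Zipf's law in rank form: frequency of the r-th most common word
f(r) \propto r^{-s}, \qquad \gamma = 1 + \frac{1}{s}
% so the classic Zipf exponent s = 1 corresponds to \gamma = 2

% Heaps' law: number of distinct words N in a text of M total words
N(M) \propto M^{\alpha}, \qquad 0 < \alpha < 1
```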

In addition, after exploring the idea that the systematic text-length dependence can be described by a meta book concept, they uncovered an abstract representation reflecting the word-frequency structure within a text. According to their conception, then, the word-frequency distribution in a text has the same characteristics as a text of the same length "extracted from an imaginary complete infinite corpus" by the selfsame author.

Examining short samples, the researchers found that the number of different words increased almost as fast as the total; it increased more gradually as the sample became larger. The curves started out steeply and then leveled out. Melville, who used the largest vocabulary, had the steepest curve. Hardy added new words at a slower rate, followed by Lawrence.
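The curve described above, distinct words plotted against total words, is simple to compute. The following is a minimal sketch and not the researchers' own code: the function names `tokenize` and `vocabulary_growth`, the crude tokenization rule, and the sample sentence are all assumptions made for illustration.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())

def vocabulary_growth(words, step=1):
    """Return (total_words, distinct_words) pairs, sampled every `step` words.

    The final word is always included so the curve ends at the full text length.
    """
    seen = set()
    curve = []
    for i, word in enumerate(words, start=1):
        seen.add(word)
        if i % step == 0 or i == len(words):
            curve.append((i, len(seen)))
    return curve

# Hypothetical sample text, used only to show the shape of the output.
sample = "the whale and the sea and the ship and the whale"
for total, distinct in vocabulary_growth(tokenize(sample), step=2):
    print(total, distinct)
```

Running the same function on long samples from different authors and plotting the pairs would give curves of the kind the article describes, steep at first and flatter as the text grows.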

If the method works, "that would be interesting because it's such a simple statistic," observed mathematician Daniel Rockmore (Dartmouth College).

Sebastian Bernhardsson, Luis Enrique Correa da Rocha, and Petter Minnhagen (Umeå University) explained their method in "The Meta Book and Size-Dependent Properties of Written Language," in New Journal of Physics (12/10/09).

Their findings may finally help confirm or refute the rumors about whether William Shakespeare actually wrote all the works he is credited with.

Source: Science (January 1, 2010).

Image of Shakespeare's signature courtesy of Wikipedia.
