Data-Driven Statistical Models

June 4, 2007

Using mathematical concepts from inverse scattering and modern statistics, researchers have developed a methodology in which data automatically guide the development of an appropriate statistical model. Applied to seismic data, for example, the resulting model can provide information about the shape and temperature of Earth's core-mantle boundary.

"We let the data 'speak,' and automatically generate an appropriate model," says statistician Ping Ma of the University of Illinois at Urbana-Champaign. Ma and his colleagues describe their methodology and its application to the earth sciences in the March 30 issue of Science and in the Journal of Geophysical Research.

To investigate features deep within the planet, Ma and his colleagues apply a numerical technique known as inverse scattering, which uses the characteristics of scattered waves to reconstruct the structures that scattered them. To do so, the researchers develop a generalized Radon transform of, say, seismic network data, which maps the sensor recordings to multiple images of the same target structure.

The researchers then use "mixed-effects" statistical models to enhance the Radon transform images, taking advantage of redundancy in the data. The technique automatically converts vast amounts of network data into statistical estimates and quantitative predictions.
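
As a rough illustration of how redundancy across images can be exploited, the sketch below fits a random-intercept model, one of the simplest mixed-effects models, to several noisy one-dimensional images of the same structure. It is not the researchers' method. The function name fit_random_intercept, the one-dimensional profile, the additive per-image offset, and the moment-based variance estimates are all assumptions made for this example.

```python
import numpy as np


def fit_random_intercept(images):
    """Fit y[k, j] = profile[j] + offset[k] + noise[k, j] (hypothetical toy model).

    images: 2-D array, one row per image of the same target, one column per
    position along the target. Needs at least 2 rows and 2 columns.
    Returns (profile, offsets, var_offset, var_noise).
    """
    y = np.asarray(images, dtype=float)
    n_images, n_points = y.shape

    grand_mean = y.mean()
    image_mean = y.mean(axis=1)   # one value per image
    point_mean = y.mean(axis=0)   # one value per position

    # Residuals after removing both the per-image and per-position means.
    resid = y - image_mean[:, None] - point_mean[None, :] + grand_mean

    # Method-of-moments (ANOVA) estimates of the two variance components.
    ms_noise = (resid ** 2).sum() / ((n_images - 1) * (n_points - 1))
    ms_image = n_points * ((image_mean - grand_mean) ** 2).sum() / (n_images - 1)
    var_noise = ms_noise
    var_offset = max(0.0, (ms_image - ms_noise) / n_points)

    # Shared structure (fixed effect): in this balanced layout, the average
    # over the redundant images.
    profile = point_mean

    # Per-image offsets (random effects): shrunken toward zero, more strongly
    # when the noise is large relative to the offset variance.
    denom = var_offset + var_noise / n_points
    shrink = var_offset / denom if denom > 0 else 0.0
    offsets = shrink * (image_mean - grand_mean)

    return profile, offsets, var_offset, var_noise


if __name__ == "__main__":
    # Simulated data only; none of these numbers come from the study.
    rng = np.random.default_rng(0)
    truth = np.sin(np.linspace(0.0, np.pi, 50))
    true_offsets = rng.normal(0.0, 0.5, size=20)
    noise = rng.normal(0.0, 0.2, size=(20, 50))
    images = truth[None, :] + true_offsets[:, None] + noise

    profile, offsets, var_offset, var_noise = fit_random_intercept(images)
    print("max profile error:", np.abs(profile - truth).max())
    print("estimated offset variance:", var_offset)
    print("estimated noise variance:", var_noise)
```

In this toy setting, averaging over the redundant images reduces the noise in the shared profile, while the variance estimates indicate how much of the scatter comes from image-to-image differences rather than from noise within each image.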

This data-driven statistical methodology is also applicable in computational biology for, say, discovering unique patterns of gene expression in fruit flies and other organisms.

Source: University of Illinois at Urbana-Champaign, May 23, 2007
