Thursday, October 23, 2014
Abstract: The 21st century will be considered the century of data. Virtually all scientific, educational, governmental, societal, and commercial enterprises are collecting data, leading to an explosion not only in the amount of data but also in its diversity, complexity, and velocity. The opportunities for extracting information and knowledge from these data are enormous; however, such extraction at scale faces serious challenges. In this talk I will outline several of the key challenges, including the reproducibility and verifiability of statistical results, reliance on big data findings in public discourse and decision-making, and privacy considerations. I will then motivate solutions grounded in emerging computational tools, policy, the practice of science, and statistical methods.
Biography: Victoria Stodden is an associate professor in the Graduate School of Library and Information Science at the University of Illinois at Urbana-Champaign, and is affiliated with Columbia University. Stodden has served on the National Academies of Science committee on "Responsible Science: Ensuring the Integrity of the Research Process." She has developed software platforms (ResearchCompendia.org, RunMyCode.org, SparseLab.stanford.edu), written many articles and recommendations (e.g., the "Reproducible Research Standard"), and testified before the U.S. House Science, Space, and Technology Committee on scientific transparency and integrity. In 2014, she published two co-edited volumes, Implementing Reproducible Research (see https://osf.io/s9tya/wiki/