ASA at 175 - Policy - Science at the crossroads? Data sharing, reproducibility, and related issues
By Ronald Wasserstein - April 18, 2014
During this 175th anniversary year of the ASA, issues with a major statistical component have been at the forefront of policy discussions. Among the questions being asked:
- Who does data belong to, especially when it is collected using federal funds?
- What are the most effective ways of sharing data so that science is advanced while protecting the intellectual property of those who collected it and any personal data it might contain?
- How do we make sure research results are widely available, especially research that was supported by federal funds?
- How do we make it possible to check the reproducibility of research results? (By “reproducibility” we mean that analyzing the same data the same way gets the same results.)
- How can we make sure the work is replicable? (If the same experiment is conducted another time, do we get the same result?)
- How generalizable is the result? (If we conduct a similar experiment, for example with a different population, do we get a similar result?)
These issues relate to fundamental principles of science, and while they can be simply stated, they are complex and interconnected. The answers vary by discipline. Indeed, the meanings of the words themselves in several of these questions vary by discipline.
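To make the three terms concrete, here is a minimal sketch in Python. It is an illustration only, not a method from the ASA or the committee: the function names (`draw_sample`, `analyze`), the simulated "experiment," the population means, and the seeds are all hypothetical choices made for this example.

```python
import random
import statistics


def draw_sample(population_mean, n, seed):
    """Stand-in for running an experiment: draw n noisy measurements.

    Hypothetical setup: measurements are normal with standard deviation 1.
    """
    rng = random.Random(seed)
    return [rng.gauss(population_mean, 1.0) for _ in range(n)]


def analyze(data):
    """Stand-in for the analysis: here, simply the sample mean."""
    return statistics.fmean(data)


# Reproducibility: the same data analyzed the same way gives the same result.
data = draw_sample(population_mean=10.0, n=200, seed=1)
reproduced = analyze(data) == analyze(list(data))

# Replicability: the same experiment conducted again (new data drawn from
# the same population) should give a result close to the original.
rerun = draw_sample(population_mean=10.0, n=200, seed=2)
replication_gap = abs(analyze(data) - analyze(rerun))

# Generalizability: a similar experiment on a different population
# (assumed here to have a slightly different mean) may or may not agree.
other = draw_sample(population_mean=10.5, n=200, seed=3)
generalization_gap = abs(analyze(data) - analyze(other))

print("reproduced exactly:", reproduced)
print("replication gap:", replication_gap)
print("generalization gap:", generalization_gap)
```

In this toy setting, reproducibility is an exact check, while replicability and generalizability are matters of degree: how small a gap counts as "the same result" is a judgment that, as noted above, varies by discipline.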
The scientific concerns have been well chronicled. A 2005 paper by John Ioannidis (“Why Most Published Research Findings Are False,” DOI: 10.1371/journal.pmed.0020124) placed the problems squarely in the public eye. Nature has written extensively about reproducibility and has made these articles freely available in a collection called “Challenges in Irreproducible Research.” (See also “Reporting standards to enhance article reproducibility” in the Nature Methods blog.) Earlier this year, Science published an editorial (“Reproducibility,” DOI: 10.1126/science.1250475) that explicitly mentions working with the American Statistical Association to help improve the scrutiny of manuscripts involving statistical analysis.
The issues surrounding data sharing are well chronicled as well. See, for example, an article in Nature last year (“US science to be open to all,” DOI: 10.1038/494414a), and ASA Director of Science Policy Steve Pierson’s April 1, 2013 blog post.
This brings us to the reason for today’s blog. To address many of these matters from a statistical perspective, the ASA has formed an ad hoc Committee on Data Sharing and Reproducibility. As a first activity, the committee has connected with the ASA’s Committee on Publications to explore ways to promote the sharing of data in articles published in the ASA’s journals. Also, as noted above, we are connecting with Science magazine to engage statisticians in the review of articles submitted to the magazine. In addition, I met yesterday with the leadership of many of the organizations involved in COSSA (the Consortium of Social Science Associations) and with Robert Kaplan, currently the director of the Office of Behavioral and Social Sciences Research at the National Institutes of Health, to talk about the roles professional societies can play in solving these critical problems.
Statisticians see much to be concerned about in these areas of reproducibility, replicability, and generalizability. Are there ways you can contribute to addressing these matters? If so, connect with us by contacting Steve Pierson (firstname.lastname@example.org).
In 2014, the American Statistical Association is celebrating its 175th anniversary. Over the course of this year, this blog will highlight aspects of that celebration, and look broadly at the ASA and its activities. Please contact ASA Executive Director Ron Wasserstein (email@example.com) if you would like to post an entry to this blog.