Any scientist interested in the analysis of genetic data knows the value of online repositories for sequence data (such as GenBank). Researchers publishing analyses of sequence data are usually required by publishers to make those data publicly available in such a database. Not only does this allow other researchers to validate one's work, it also allows for more general research that may use those sequences. In fact, a growing number of research programs revolve solely around mining the data stored in such repositories.
Michael Whitlock and colleagues, in the current issue of the American Naturalist, discuss the importance of data archiving in ecology and evolutionary biology. New web-based archiving services allow the storage of general datasets. The authors argue that science has lost a tremendous amount of data, and that there is no reason for such loss in an era when online data storage is so cheap and accessible.
Within the next year, several journals (American Naturalist, Molecular Ecology, etc.) will require the sharing of general data sets, similar to the genetic data sets stored at GenBank. For example, the American Naturalist's policy will read:
This journal requires, as a condition for publication, that data supporting the results in the paper should be archived in an appropriate public archive, such as GenBank, TreeBASE, Dryad, or the Knowledge Network for Biocomplexity. Data are important products of the scientific enterprise, and they should be preserved and usable for decades in the future. Authors may elect to have the data publicly available at time of publication, or, if the technology of the archive allows, may opt to embargo access to the data for a period up to a year after publication. Exceptions may be granted at the discretion of the editor, especially for sensitive information such as human subject data or the location of endangered species.
This is a fantastic idea. The open sharing of data will not only promote scientific integrity, it will also allow for expansive meta-analyses and synthesis of diverse data sets. And in the end, most work published in these journals is funded by the taxpayers of various countries, and they should be allowed access to the data that they are paying for.
Whitlock, M., McPeek, M., Rausher, M., Rieseberg, L., & Moore, A. (2010). Data Archiving. The American Naturalist, 175(2), 145-146. DOI: 10.1086/650340
