This series of scientist-written essays explores the benefits and challenges of data-sharing and open-source technologies in neuroscience.
The field of neuroscience has witnessed a sea change in its attitude toward open science over the past 10 years. Thanks to mandates from journals and funders, the establishment of large-scale public repositories, and broader shifts in academic culture, it is now routine for many researchers to deposit data for use by anyone, anywhere. This practice has numerous benefits, including secondary analyses of data, the discovery of errors, and the development of hands-on pedagogical materials. But current data-sharing practices often fall short of making those data genuinely reusable.
As a user of deposited data, I frequently find myself poring over repositories that are extremely challenging to navigate. Many lack sufficient metadata, and it takes a significant amount of sleuthing, usually including several email exchanges with the authors, to decipher how the data are formatted. The organization of the data is also often idiosyncratic to a particular repository, which makes it difficult to apply standardized software tools. As a result, most of my time is sunk into preprocessing rather than analysis.
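To make the contrast concrete, the sketch below shows what loading deposited data can look like when a repository adopts a community standard such as Neurodata Without Borders (NWB). It is only an illustration: the file name is hypothetical, and it assumes the pynwb library is installed. The point is that a standardized format carries its own metadata, so the same few lines work on any lab's deposit.

```python
# A minimal sketch, not a prescription: reading a dataset shared in the
# NWB (Neurodata Without Borders) format. The file name is hypothetical;
# requires pynwb (pip install pynwb).
from pynwb import NWBHDF5IO

with NWBHDF5IO("session_001.nwb", mode="r") as io:
    nwbfile = io.read()

    # Session-level metadata travels inside the file itself,
    # so no emailing the authors to decode the layout.
    print(nwbfile.session_description)
    print(nwbfile.subject)

    # Acquired data objects are discoverable by name rather than
    # by guessing a lab-specific folder convention.
    for name, obj in nwbfile.acquisition.items():
        print(name, type(obj).__name__)
```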
There are several explanations for this messy state of affairs. Trainees who produce the data have not been tutored in this aspect of their work, nor does academic culture value it.