Saturday, July 05, 2014

Journal Open Data Guidelines: plenty of room for clarifications

J. Gray, Wikipedia. CCZero.
Several journals are playing with statements about Open Data, and, for example, F1000Research and require Open Data. Just as publishers are judged on their implementation of Open Access, we should critically analyze journals that claim to be Open Data journals. Well, I have not seen such claims, but some journals have promising statements, like:
BioMed Central
    Data associated with the article are available under the terms of the CCZero.
However, this claim is vague, or, at least, too vague for a paper I am currently reviewing. The fuzziness lies in the word "associated". What defines associated data? How does this relate to reproducibility? If the purpose of Open Data is that the results of the paper can be reproduced, does that mean all data? And what happens if some of the data is from a previous paper? Or from a proprietary database? Is a paper that relies on data from a proprietary database for key steps in its argumentation acceptable to a journal that demands Open associated data? What if the authors do not have control over the license? Or is the requirement limited to new data? But what defines new data here? That is a really hard question in an era where data has very limited provenance (versioning, author attribution, etc.).