Sunday, July 10, 2022

new: "FAIR assessment tools: evaluating use and performance"

In a year where "data available upon request" is a good intention at best (doi:10.1016/j.jclinepi.2022.05.019), the need for FAIR and Open data is greater than ever. It is hard to argue that the openness with SARS-CoV-2 and COVID19 data has not helped at least somewhat, though we likely do not fully understand how yet. Releasing daily statistics of infections and RNA concentrations for sure helped us decide when to close and open up our societies. But how much do our COVID19 Disease Maps (doi:10.15252/MSB.202110387) help?

Meanwhile, the FAIR principles paper (doi:10.1038/SDATA.2016.18) has been enormously successful, but deciding whether it had an impact is less clear. Repositories ("F") already existed, but did they get better? I was actually thinking of writing "significantly better", but without a means to decide if something is better, deciding how much better is even harder. FAIR expects common transport protocols like HTTP, but those existed long before the concept of FAIR too. And I have been working on interoperability since before the start of FAIR. Also, if Excel spreadsheets still cause massive problems (doi:10.1371/journal.pcbi.1008984) despite the notion they could be FAIR (doi:10.3390/NANO10101908), then we see that no significant progress has been made beyond the people that were already doing it.

Then what about the reuse ("R")? This is where the Open is important. Well-designed data is nice, but if you are not allowed to reuse it, then what's the point? But reuse has many fascinating aspects and touches on how we do science. We know really well which data is "good enough" and what data is not. This is not always founded in objective rules, but sometimes it is.

But all these details of FAIR aside, the main objective is that data becomes even more interoperable and more reused than before. So when the RIVM/Gov4Nano team (Nynke Krans, Martine Bakker, and Joris Quick) approached us about a study of tools to help make data more FAIR (doi:10.1016/j.impact.2022.100402), it got my interest. Also because it fits the vision of our research group of integrative systems biology. So, happy that Ammar was able to contribute from our side about how to get data more interoperable to allow us to integrate it.

Table of Contents graphics of the article by Krans et al. 

The analysis of multiple FAIR assessment tools with two data sets gives interesting results with impact on how the field needs to move forward. First, the variety is extensive and tools cannot easily be replaced by another: each tool focuses on specific things. That said, they all help bridge the gap between data and life sciences expertise. And since discussing FAIR is the first step of making something more FAIR, this is quite welcome.

One caveat of several tools, however, is that they are not clear about why a certain score is not reached. That is, they do not provide concrete actions for improving the FAIR score. On the one hand this is not surprising: these tools help you make data FAIR, but scholars should still learn what FAIR means for their research. After all, we also learned how to manage our data and how to write the paper lab notebook. It's part of our job description. However, FAIR scales up the concept of the lab notebook. It is no longer only about your own use (i.e. your experiment) but also about the use of others. So, getting guidance about how others may want to reuse your data is needed. Think of it as formative assessment.
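To make the idea of formative feedback a bit more concrete, here is a minimal sketch in Python of what "a score plus concrete actions" could look like. It is not how any of the tools in the study work: the metadata field names (identifier, access_url, vocabulary, license), the four toy checks, and the example DOI and URL are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    principle: str  # "F", "A", "I", or "R"
    passed: bool
    action: str     # concrete suggestion, shown only when the check fails


def assess(metadata: dict) -> list:
    """Run four toy checks on a metadata record (hypothetical field names)."""
    access_url = str(metadata.get("access_url", ""))
    return [
        CheckResult("F", bool(metadata.get("identifier")),
                    "Assign a persistent identifier, such as a DOI."),
        CheckResult("A", access_url.startswith(("http://", "https://")),
                    "Provide an HTTP(S) URL where the data can be retrieved."),
        CheckResult("I", bool(metadata.get("vocabulary")),
                    "State which ontology or vocabulary the terms come from."),
        CheckResult("R", bool(metadata.get("license")),
                    "Add an explicit license so others know they may reuse the data."),
    ]


def report(results: list) -> str:
    """Return the score followed by one action line per failed check."""
    passed = sum(r.passed for r in results)
    lines = [f"Score: {passed}/{len(results)}"]
    for r in results:
        if not r.passed:
            lines.append(f"[{r.principle}] {r.action}")
    return "\n".join(lines)


# Made-up record: it has an identifier and a URL, but no vocabulary or license.
record = {
    "identifier": "doi:10.1234/example",
    "access_url": "https://example.org/data.csv",
}
print(report(assess(record)))
```

For this record the report gives a score of 2 out of 4 and lists what to do for the "I" and "R" checks. The point is only that every failed check comes with a next step, which is the kind of guidance that several tools currently leave out.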

The research on these FAIR assessment tools pointed out the diversity (see also the FAIR principles follow-up paper, doi:10.1162/DINT_R_00024), but with that it also found that choosing the right tool is not always trivial. Online versus offline is observational and may be up to preference (or local needs), but for whom a tool is developed is another matter. One highlight writes: "Tool developers should clarify the use case, e.g. beginner or advanced assessments". In retrospect, this could have been reworded slightly: "Tool developers should clarify what prior knowledge is needed to use the tool." But those are probably just two sides of the same coin.
