Saturday, February 09, 2019

Comparing Research Journals Quality #1: FAIRness of journal articles

What a traditional research article looks like. Nice layout, hard to reuse the knowledge from. Image: CC BY-SA 4.0.
After Plan S was proposed, there finally was a community-wide discussion on the future of publishing. Not everyone is clearly saying whether they want open access or not, but it is a start. Plan S aims to reform the current model. (Interestingly, the argument that not many journals are currently "compliant" is sort of the point of the Plan.) One thing it does not want to reform is the quality of the good journals (at least, I have not seen that as one of the principles). There are many aspects to the quality of a research journal. There are also many things that disguise themselves as aspects of quality but are not. This series discusses the quality of a journal. We skip the trivial aspects, like peer review, for now, because I honestly do not believe that the cOAlition S funders want worse peer review.

We start with FAIRness (doi:10.1038/sdata.2016.18). This falls, if you like, under the category of added value. FAIRness does not change the validity of the conclusions of an article; it just improves the rigor of the knowledge dissemination. To me, a quality journal is one that takes knowledge dissemination seriously. All journals have a heritage of being printed on paper, and most journals have been very slow in adopting innovative approaches. So, let's put down some requirements for the journal of 2020.

First, about the article itself:

About findable

  • uses identifiers (DOI) at least at article level, but possibly also for figures and supplementary information
  • provides data of an article (including citations)
  • data is actively distributed (PubMed, Scopus, OpenCitations, etc)
  • maximizes findability, probably by supporting more than one open standard
About accessible
  • data can be accessed using open standards (HTTP, etc)
  • data is archived (possibly replicated by others, like libraries)
About interoperable
  • data is using open standards (RDF, XML, etc)
  • data uses open ontologies (many open standards exist, see this preprint)
  • uses linked data approaches (e.g. for citations)
About reusable
  • data is as complete as possible
  • data is available under an Open Science compliant license
  • data uses modern, widely used community standards
Pretty straightforward. For author, title, journal name, year, etc., most journals apply this. Of course, bigger publishers that invested in these aspects many moons ago can comply much more easily, because they already did.
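To make the "identifiers" and "open standards (HTTP)" points a bit more concrete, here is a minimal sketch of how a machine could ask for article-level data starting from nothing but a DOI, using HTTP content negotiation. The function name `metadata_request` is made up for this illustration, and I am assuming that the registration agency behind the DOI supports the CSL JSON media type. The request is only built here, not sent.

```python
from urllib.parse import quote
from urllib.request import Request

# CSL JSON media type; assumed to be supported by the DOI's registration agency.
CSL_JSON = "application/vnd.citationstyles.csl+json"


def metadata_request(doi: str) -> Request:
    """Build (but do not send) an HTTP request asking doi.org for metadata.

    Hypothetical helper for illustration only.
    """
    # Keep the slash between DOI prefix and suffix, escape anything unusual.
    url = "https://doi.org/" + quote(doi, safe="/")
    # The Accept header is what turns "give me the landing page"
    # into "give me machine-readable metadata".
    return Request(url, headers={"Accept": CSL_JSON})


req = metadata_request("10.1038/sdata.2016.18")
print(req.full_url)
print(req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would actually send it; that needs network access and a resolver that honors the header, so it is left out of the sketch.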

Second, what about the content of the article? There we start seeing huge differences.

About findable
  • important concepts in the article are easily identified (e.g. with markup)
  • important concepts use (compact) identifiers
Here, the important concepts are entities like cities, genes, metabolites, species, etc, etc. But also reference data sets, software, cited articles, etc. Some journals only use keywords, some journals have policies about use of identifiers for genes and proteins. Using identifiers for data and software is rare, sadly.
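As a rough sketch of why compact identifiers help findability: once concepts are written as prefix:accession, even a simple script can pick them out of the text and turn them into resolvable links. The prefix list below is just an illustrative assumption (a real tool would take it from a registry), the helper names are made up, and I am assuming the identifiers.org resolver pattern for the URLs.

```python
import re

# Illustrative prefixes only; a real tool would load these from a registry.
KNOWN_PREFIXES = ("doi", "taxonomy")

# A compact identifier is written as prefix:accession.
PATTERN = re.compile(r"\b(%s):([^\s()]+)" % "|".join(KNOWN_PREFIXES))


def find_compact_identifiers(text: str) -> list[str]:
    """Return the compact identifiers found in a piece of article text."""
    found = []
    for prefix, accession in PATTERN.findall(text):
        # Drop sentence punctuation that got glued to the accession.
        found.append(f"{prefix}:{accession.rstrip('.,;')}")
    return found


def resolver_url(compact_id: str) -> str:
    """Turn a compact identifier into a link (assumed identifiers.org pattern)."""
    return "https://identifiers.org/" + compact_id


sentence = "We start with FAIRness (doi:10.1038/sdata.2016.18)."
ids = find_compact_identifiers(sentence)
print(ids)
print([resolver_url(i) for i in ids])
```

With keywords only, the same script would have nothing reliable to hold on to; that is the whole difference.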

About accessible
  • articles can be retrieved by concept identifiers (via open, free standards)
  • article-concept identifier links are archived
  • table and figure data is annotated with concept identifiers
  • table and figure data can be accessed in an automated way
Here we see a clear problem. Publishers have been actively fighting this for years, even today. Text miners and projects like Europe PMC are stepping in, but are severely hampered by copyright law and by publishers not wishing to make exceptions.

About interoperable
  • concepts are described using common standards (many available)
  • table and figure data is available as something like CSV, RDF
Currently, the only serious standards used by the majority of (STM?) journals are MeSH terms for keywords and perhaps CrossRef XML for citations. Tables and figures are more than just graphical representations. Some journals are experimenting with this.
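To illustrate what "table data as something like CSV" with concept identifiers could look like, here is a small sketch. The table, the column names and the (approximate) values are made up for illustration; the point is only that each row carries an identifier next to the human-readable label, so the data can be picked up by identifier instead of by reading a picture of a table.

```python
import csv
import io

# Hypothetical article table; column names and approximate values are
# illustrative. Each row carries a concept identifier next to the label.
ROWS = [
    {"compound": "water", "compound_id": "CHEBI:15377", "boiling_point_celsius": "100"},
    {"compound": "ethanol", "compound_id": "CHEBI:16236", "boiling_point_celsius": "78.4"},
]


def to_csv(rows):
    """Serialize the table as CSV text, header line first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def index_by_identifier(csv_text, id_column):
    """Read the CSV back and index the rows by their concept identifier."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[id_column]: row for row in reader}


csv_text = to_csv(ROWS)
table = index_by_identifier(csv_text, "compound_id")
print(csv_text)
print(table["CHEBI:16236"]["compound"])
```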

About reusable
  • the content of the article has a clear licence, Open Science compliant
  • the content is available with relevant standards of now
This is hard. These community standards are a moving target. For example, how we name concepts changes over time, and so do the identifiers themselves. But a journal can be specific and accurate, which ensures that even 50 years from now, the context of the content can be determined. Of course, with proper Open Science approaches, translation to the then-modern community standards is simplified.

There are tons of references I can give here. If you really like these ideas, I recommend:
  1. continue reading my blog with many, many pointers
  2. read (and maybe sign) our Open Science Feedback to the Guidance on the Implementation of Plan S (doi:10.5281/zenodo.2560200), which includes many of these ideas

Tuesday, February 05, 2019

Plan S: Less publications, but more quality, more reusable? Yes, please.

If you look at opinions published in scholarly journals (RSS feed, if you like to keep up), then Plan S is all 'bout the money (as Meja already tried to warn us):

No one wants puppies to die. Similarly, no one wants journals to die. But maybe we should. Well, the journals, not the puppies. I don't know, but it does make sense to me (at this very moment):

The past few decades have seen a significant growth in the number of journals. And before hybrid journals were introduced, publishers tended to start new journals, rather than make journals Open Access. At the same time, the number of articles has gone up significantly too. In fact, the flood of literature is drowning researchers, and this problem has been discussed for years. But if we have too much literature, should we not aim for less literature? And do it better instead?

Over the past 13 years I have blogged on many occasions about how we can make journals more reusable. And many open scientists can quote you Linus: "given enough eyeballs, all bugs are shallow". In fact, just worded differently, any researcher will tell you exactly the same, which is why we do peer review.
But the problem here is the first two words: given enough.

What if we just started publishing half of what we do now? If we have an APC-business model, we have immediately halved(!) the publishing cost. We also save ourselves a lot of peer-review work and the reading of marginal articles.

And what if we just used the time we freed up for actually making knowledge dissemination better? Make journal articles actually machine readable, put some RDF in them? What if we could reuse supplementary information? What if we could ask our smartphone to compare the claims of one article with those of another, just like we compare two smartphones? Oh, they have more data, but theirs has a smaller error margin. Oh, they tried it at that temperature, which seems to work better than in that other paper.
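A toy sketch of that smartphone-style comparison, assuming articles exposed their claims as structured data. Everything here is hypothetical: the `Claim` fields, the `compare` function, and the two example claims are invented for illustration and do not come from any real article or standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """A machine-readable claim; all field names are hypothetical."""
    article: str            # placeholder label, not a real article
    quantity: str           # what was measured
    value: float
    error_margin: float
    data_points: int
    temperature_kelvin: float


def compare(a: Claim, b: Claim) -> list[str]:
    """Describe how two claims about the same quantity differ."""
    if a.quantity != b.quantity:
        raise ValueError("claims are about different quantities")
    notes = []
    if a.data_points != b.data_points:
        less, more = sorted((a, b), key=lambda c: c.data_points)
        notes.append(f"{more.article} has more data "
                     f"({more.data_points} vs {less.data_points} points)")
    if a.error_margin != b.error_margin:
        tight, loose = sorted((a, b), key=lambda c: c.error_margin)
        notes.append(f"{tight.article} has the smaller error margin "
                     f"({tight.error_margin} vs {loose.error_margin})")
    if a.temperature_kelvin != b.temperature_kelvin:
        notes.append(f"temperatures differ: {a.temperature_kelvin} K "
                     f"vs {b.temperature_kelvin} K")
    return notes


# Made-up example values.
claim_a = Claim("article A", "yield", 0.82, 0.05, 40, 298.0)
claim_b = Claim("article B", "yield", 0.85, 0.02, 12, 310.0)
notes = compare(claim_a, claim_b)
for note in notes:
    print(note)
```

None of this is hard once the claims are data; the hard part is that today they are locked up in prose and PDFs.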

I have blogged about this topic for more than a decade now. I don't want to wait another 15 years for journal publications to evolve. I want some serious activity. I want Open Science in our Open Access.

This is one of my personal motives for our Open Science Feedback to cOAlition S, and I am happy that 40 people from 12 countries joined in the past 36 hours. Please have a read, and please share it with others. Let your social network know why the current publishing system needs serious improvement and that Open Science has had the answer for years now.

Help our push and show your support to cOAlition S to trigger exactly this push for better scholarly publishing:

Sunday, February 03, 2019

Plan S and the Open Science Community

Plan S is about Open Access. But Open Science is so much more and includes other aspects, like Open Data, Open Source, and Open Standards. Just as publications have hijacked knowledge dissemination (think research assessment), we risk Open Access hijacking the Open Science ambition. If you find Open Science more important than Open Access, then this is for you.

cOAlition S is asking for feedback, and because I think Open Science is so much more, I want the Guidance on the Implementation of Plan S to pay more attention to Open Science. On Wednesday I am submitting this Open Science Feedback on the Guidance on the Implementation of Plan S, outlining 10 points on how it can be improved to support Open Science better.

Please read the feedback document and if you agree, please join Jon Tennant and co-sign it using this form: