Sunday, January 13, 2019

cOAlition S is requesting feedback

Plan S (Wikipedia, Scholia) is here to stay. The pros and cons are still being explored, but with the list of participants and endorsements still growing (the latest endorsement comes from the African Academy of Sciences), it seems unlikely the ten principles will be disregarded. The implementation, however, may still change. I'm still looking forward to hearing more about the alleged improper lobbying and CoI of Frontiers. I have some thoughts and observations about Frontiers myself.

Meanwhile, one thing Plan S highlights is the huge differences in politics between European countries. An ongoing debate in Germany about whether scholars have a legal, unlimited freedom to publish where they want (ethical or not) is not yet settled, I think. And Norway did not seem to have discussed Open Access publishing much yet, and it has been suggested that the sudden introduction of Plan S may be unlawful.

The situation in The Netherlands regarding the latter is different, in that Plan S here naturally follows from a direction chosen by the Dutch government some years ago. This resulted in a formal advice of the Dutch Presidency, the Amsterdam Call for Action on Open Science (2016) and governmental policies based on that. During the Presidency a formal meeting was organized in, no surprise, Amsterdam, where a draft was presented and formally discussed with stakeholders. Individual researchers had been invited, and with some luck my reply to the invitation was accepted and I joined the meeting.

My main comment on the draft Call.
I cannot say the meeting left a lot of room for improvement, and the draft was shared with participants only very shortly before the meeting. I stressed the importance of three core (user) rights of Open Science: reuse, modify, redistribute, but while some other points were picked up (I don't think the organizers had a lot of room, as it would be signed on the spot and presented to the European Union), this point did not get picked up.

Now, Plan S also fails to mention these rights, which I consider a serious flaw. Instead, they choose to focus on a specific implementation of those three rights. This runs counter to normal procedure in European politics, or at least Dutch European politics, where things are generally kept vague to be refined later.

Of course, the discussions on Open Science did not start early 2016. Dutch politics is not that fast. One aspect of the Dutch discussion has been that the focus has been too much on the cost of open access publishing, and this leaks into Plan S, is my impression. But Dutch research institutes (particularly via their libraries) have brought up the unsustainable situation of journal subscriptions for quite a bit longer. I seem to remember the discussion of big package deals when I was a student in the nineties, a time when individual researchers still had "personal" subscriptions. I used to read JCIM (JCICS at the time) at the CAOS/CAMM (see doi:10.1007/978-3-642-74373-3_51). Yes, the need for this reform has been discussed for at least 20 years in The Netherlands.

For me, as a Dutch researcher, Plan S is neither radical nor a surprise (*): it is a natural consequence of publishers resisting the needed reforms that were started years ago, and upon which subscription deals between the Dutch universities and publishers have been based. With less than two years to go, the ambition set out by the Dutch government to be 100% Open Access by 2020 was and is far away (unless there is some radical change to an exponential increase in these last months). So, if the Dutch government wants to keep its political promise, a radical change was needed. The only surprise (hence the *) is, perhaps, that they wanted to keep their promise.

Since the Amsterdam Call, the Dutch government further involved Dutch researchers, via the National Platform Open Science (NPOS), where various researcher organizations are actively involved (postdoc network, VSNU, etc, etc). NPOS has been underfunded and the involvement of researchers could have been a lot better.

It must also be mentioned that Plan S, as far as I know, has not been discussed at this level of NPOS. I expect it did get discussed by the participating NPOS partners (which included NWO). This is not surprising to me either, though I would very much appreciate a weaker hierarchy. But that hierarchy is very Dutch, and even researchers indicate they do not have time for all those discussions, so things are self-organized in representing organizations: it seems the general Dutch consensus was (mind you, I'm an ECR, I did not design this, and this approach is not uncontroversial, as #WOinActie makes clear) that representation is the best way forward. And as far as I can see, this is how Plan S came about. But the discussions around Plan S make clear that a lot of researchers feel left out. Understandably, but the Dutch academic culture is to blame for that, not the Dutch funders: individual researchers rarely get asked for feedback on national guidance/policy documents. At the same time, that does not invalidate their concerns either, of course.

So, while the above may not have said it, I like what cOAlition S is attempting to do: the publishing system is breaking down and must be fixed, and only a few publishers are making a serious effort (at a time when some publishers make huge profits). The discussion has been nasty on both sides. Insinuations that gold Open Access journals are not interested in quality are hurtful (remember, I'm editor-in-chief of such a journal, and I work overtime to ensure the highest standards for our articles, but know the limitations of publication platforms (see this post) and peer review, despite the journal having access to a very qualified researcher pool).

Being an academic, you need a holiday to sit down and do something you care about but that is not paid for. For me, commenting on Plan S, or contributing as a critical observer to the NPOS, is one of those things. Even finding time for that interview about Plan S in ScienceGuide has been hard.

But now that the deadline for the call by cOAlition S for feedback is nearing, it's time to get my points written up. I decided to use for this (yes, I have not used it enough for someone who joined it in 2012...):

I started with the Too Risky? Open Letter, followed by commenting on Plan S. The fact that I do not like at all how the Open Letter was formulated (see my comments; the letter has a number of fallacies) does not mean I like Plan S as it is (which some seem to assume). These annotations look like this and can be read with and without a browser plugin:

There is plenty more to annotate, including the letter of support of the principles (see doi:10.1038/d41586-018-07632-2). I signed this letter: while I do not agree with the wording of all principles, putting it into the context of the Dutch situation, their intentions make a lot of sense. But I have to say, I may have been rushed into signing it, with the Too Risky? letter using various fallacies and suggesting it represents researchers in general (the letter does not say that literally, but neither does it make clear that it speaks only on behalf of the signers, causing a lot of press to misrepresent the letter). The letter sketches a doom scenario and is worded in what the open source community would refer to as FUD: fear, uncertainty, and doubt.

Effectively, what this Too Risky? letter has shown (for me) is how much serious effort is needed to make the required changes. Every change has consequences. One team focuses on the positive consequences, another team focuses on the negative consequences.

I do not know what Plan S will bring us. My name is not Nostradamus (or this Dutch reference). I do know the risks of the current system. Those are easily named and have been apparent for at least two decades. Too Risky? The current system has already done a lot of harm and is proven risky. I am happy that Dutch funders dare to invest in the future. Plan S may not be the right option, and I am looking forward to alternative solutions that ensure the three rights of Open Science (reuse, modify, redistribute). People have the freedom to choose whether they want to practice Open Science or not. That freedom is, in The Netherlands in recent times, limited by several things, but the limitation that the Dutch national funders want research to benefit Dutch society is in line with the Dutch political climate of the past decade (think Nationale Wetenschapsagenda).

Monday, December 31, 2018

Wikidata-Taxonomy: class and instance hierarchies on the command line

For some time I had a 2017 Tweet from Dan Brickley on my todo list (I use Todoist), and now that it is holiday, I finally had time to play with Wikidata-Taxonomy. Here it is in action for a class of five phytocassanes:

$ node wdtaxonomy.js  Q60224961 -i

Give it a try.

Friday, December 28, 2018

Replacing BibTeX with Citation.js

As part of replacing LaTeX with Markdown for my Groovy Cheminformatics book (now Open Access), I also needed to replace BibTeX. Fortunately, Citation.js supports Wikidata and the solution by Lars was simpler than I hoped. Similar to LaTeX, I have citations annotated in the Markdown, but the reference code does not refer to a BibTeX file entry, but to Wikidata (see also Wikidata-powered citation lists with citation.js).

The set up is as follows:
  1. extract the Wikidata Q-codes (which creates references.qids)
  2. use Citation.js to format the references as plain text
  3. number the citations and create the bibliography
The first step uses a Groovy script, and the second a very short JavaScript script:

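const fs = require('fs')
const Cite = require('citation-js')
// Read the Q-codes, fetch their Wikidata metadata, format each as plain text
fs.readFile('references.qids', 'utf8',
            async function (err, file) {
  const data = Array.from(await Cite.async(file)).map(
    // assumption: the key before '=' is the entry id (missing in the listing)
    item => item.id + '=' + Cite(item).format(
      'bibliography', {template: 'vancouver'}
    )
  )
  fs.writeFile('references.dat', data.join(''), function () {})
})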

The result looks like:

I still have some things left to do, like adding the DOI and some Markdown formatting. The toolkit allows that, but it is not urgent.
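Steps 1 and 3 of the set up are not shown in this post (step 1 is a Groovy script). As a rough illustration only, both could be sketched in JavaScript as below. The citation syntax `[@Q12345]` and the function names `extractQids` and `numberCitations` are hypothetical; the book's actual markup and scripts may well differ.

```javascript
// Hypothetical citation syntax: [@Q12345]. The book's real markup may differ.
const CITE_PATTERN = /\[@(Q\d+)\]/g;

// Step 1: collect the unique Wikidata Q-codes, in order of first appearance.
function extractQids(markdown) {
  const seen = new Set();
  for (const match of markdown.matchAll(CITE_PATTERN)) {
    seen.add(match[1]);
  }
  return Array.from(seen);
}

// Step 3: replace each citation with its number and build the bibliography.
// `formatted` maps a Q-code to its plain-text reference (the output of step 2).
function numberCitations(markdown, formatted) {
  const qids = extractQids(markdown);
  const numbers = new Map(qids.map((qid, i) => [qid, i + 1]));
  const text = markdown.replace(
    CITE_PATTERN, (_, qid) => '[' + numbers.get(qid) + ']'
  );
  // Fall back to the bare Q-code when no formatted reference is available.
  const bibliography = qids.map(
    qid => numbers.get(qid) + '. ' + (formatted[qid] || qid).trim()
  );
  return { text, bibliography };
}
```

Numbering by first appearance gives every repeated citation the same number, which matches the Vancouver style used in step 2.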

Thursday, December 27, 2018

Creating nanopublications with Groovy

Compound found in Taphrorychus bicolor
Published in Liebigs Annalen, see
this post about the history of that journal.
Yesterday I struggled a bit with creating nanopublications with Groovy. My first attempt was an utter failure, but then I discovered Tobias Kuhn's NanopubCreator and it was downhill from there.

There are two good things about this. First, I now have a code base that I can easily repurpose to make trusty nanopublications (doi:10.1007/978-3-319-07443-6_63) about anything structured as a table (so can you).

Second, I now have almost 1200 CCZero nanopublications that tell you in which species a certain metabolite has been found. They are sourced from Wikidata, using their SPARQL endpoint. This collection is a bit boring at this moment, and most of them are human metabolites, where the source is either Recon 2.2 or WikiPathways. But I expect (hope) to see more DOIs show up. Think We challenge you to reuse Additional Files.
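The query behind this collection is not shown here, so the following is only a guess at what the Wikidata side could look like, sketched in JavaScript. It assumes the data comes from "found in taxon" (P703) statements with a "stated in" (P248) reference; `buildQuery`, `queryUrl` and `toRows` are hypothetical helper names.

```javascript
// Assumed query: compounds with a "found in taxon" (P703) statement whose
// reference has a "stated in" (P248) source. The real query may differ.
function buildQuery(limit) {
  return [
    'SELECT ?compound ?taxon ?source WHERE {',
    '  ?compound p:P703 ?statement .',
    '  ?statement ps:P703 ?taxon ;',
    '             prov:wasDerivedFrom/pr:P248 ?source .',
    '}',
    'LIMIT ' + limit
  ].join('\n');
}

// URL for the public Wikidata SPARQL endpoint, asking for JSON results.
function queryUrl(sparql) {
  return 'https://query.wikidata.org/sparql?format=json&query=' +
    encodeURIComponent(sparql);
}

// Flatten the SPARQL JSON results into one table row per statement,
// which is the shape a table-to-nanopublication script would consume.
function toRows(json) {
  return json.results.bindings.map(b => ({
    compound: b.compound.value,
    taxon: b.taxon.value,
    source: b.source.value
  }));
}
```

Each row then carries the three things a nanopublication needs here: the assertion (compound and taxon) and its provenance (source).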

Finally, you are probably interested in learning what one of the created nanopublications looks like, so I put a Gist online:

Wednesday, December 26, 2018

Groovy Cheminformatics rises from the ashes

Cover of the last print
version of the book.
Like a phoenix (Phenix aegyptus), my Groovy Cheminformatics rises from the ashes. About a year ago I blogged that I could no longer maintain my book, at least not in print form. The hardest part was actually resizing the cover each time the book got thicker. I actually started the book about 10 years ago, but the wish to make it Open Access grew bigger with the years.

So, here we go. It's based on CDK 2.0, but somewhere in the coming weeks I'll migrate to the latest version. It will take some weeks to migrate all content, and you can leave your chapter priority requests here.

The making of...
Over the past months I have been playing with some ideas on how to make the transition. I wanted to preserve the core concept of the book: that all scripts are compiled and executed with each release and that all output of scripts is autogenerated (including many of the diagrams). I wanted to publish the next iteration of the book as Markdown, but also pondered the idea of still being able to generate a PDF with LaTeX. That means I have a lot of stuff to upgrade.

I ended up somewhere in between. Its source is Markdown, but not entirely. It's source code that looks like Markdown with snippets of XML. This makes sure the source looks formatted when on GitHub:
But you can see that this is not processed yet. CreateAtom1 and CreateAtom2 refer to code examples, and the above screenshot shows the source of a source code inclusion (for CreateAtom1 and CreateAtom2) and an output inclusion (for CreateAtom2). After processing, the actual page looks like this:

That looks pretty close to what the print book had. An extra here is that you can click (hard in a print book) the link to the code. That is something I improved on along the way, and leads to a Markdown (new) page that shows the full sources and the output (should I add the @Grab instructions, or too obvious?):

If you check the first online version (🎶 On the first day of xmas, #openscience got from me ... 🎶), I have quite some content to migrate. First, back to doing the reference sections properly, as if I was still working with BibLaTeX.
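For the curious, the inclusion step described above (XML snippets in the Markdown being replaced by script sources and their output) could look roughly like this. It is a minimal sketch, not the book's actual build code: the tag names `<code>` and `<out>` and the function `expandInclusions` are hypothetical, as the real snippets are only visible in the screenshot.

```javascript
// Three backticks, built at runtime so this example nests safely in Markdown.
const FENCE = '`'.repeat(3);

// Wrap a piece of text in a fenced Markdown block of the given language.
function fenced(lang, body) {
  return FENCE + lang + '\n' + body.trim() + '\n' + FENCE;
}

// Hypothetical tags: <code>Name</code> includes the source of script Name,
// <out>Name</out> includes the output captured when that script was run.
// Unknown names are left untouched so missing scripts are easy to spot.
function expandInclusions(source, scripts, outputs) {
  return source
    .replace(/<code>(\w+)<\/code>/g, (tag, name) =>
      name in scripts ? fenced('groovy', scripts[name]) : tag)
    .replace(/<out>(\w+)<\/out>/g, (tag, name) =>
      name in outputs ? fenced('plain', outputs[name]) : tag);
}
```

Because the outputs are passed in rather than typed by hand, the page can only ever show what the scripts really printed in that release.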

Happy holidays!