Friday, April 29, 2016

Sci-Hub succeeds where publishers fail (open and closed)

Sci-Hub use in The Netherlands is not limited to the academic research cities. Harlingen is a small harbor town, home to at best a doctor and one or two students who visit their parents at the weekend. The nature of the top downloaded paper suggests it is not a doctor :) Data from Bohannon and Elbakyan.
Knowledge dissemination is a thing. It's not easy. In fact, it's a major challenge. Traditional routes are no longer efficient, though they were 200 years ago. The world has moved on; the publishing industry has not. I have written plenty in this blog about how publishers could catch up, and while this is happening, progress is (too) slow.

The changes are not only technical, but also social. Several publishers still believe we live in an industrial era, whereas the world has moved on into a knowledge era. More people are mining and servicing data than are making physical things (think about that!). Access to knowledge matters, and dealing with data and knowledge stopped being something specific to academic and other research institutes many, many years ago. Arguments that knowledge is only for the highly educated simply contradict and bluntly ignore our modern civilization.

This makes access to knowledge a mix of technological and social evolution, and on both ends many publishers fail, fail hard, fail repeatedly. I would even argue that all the new publishers are improving things, but are failing to really innovate in knowledge dissemination. And it is not just the publishing industry; many scientists fail too. Preprint servers are helpful, but this is really not the end goal. If you really care about speeding up knowledge dissemination, stop worrying about things like text mining and preprints, and start making knowledge machine readable (sorry, scientist) and release that along with or before your article. Yes, that is harder, but just realize you are getting well-paid for doing your job.
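To make "machine readable" a bit more concrete, here is a minimal sketch of what releasing a single finding alongside an article could look like. Everything in it is an assumption for illustration: the function `describeFinding`, the field names, and the placeholder DOI are hypothetical and do not follow any particular standard, and the ethanol value is just an example fact.

```javascript
// Hypothetical sketch: describe one finding as structured data instead of prose.
// Field names and the DOI below are placeholders, not an established vocabulary.
function describeFinding(doi, subject, property, value, unit) {
  return {
    article: "https://doi.org/" + doi, // link back to the article making the claim
    subject: subject,                   // what the statement is about
    property: property,                 // which property was measured
    value: value,                       // the numeric value, kept as a number
    unit: unit                          // the unit, kept separate from the value
  };
}

// Example statement: the boiling point of ethanol (illustrative value).
const finding = describeFinding(
  "10.1234/placeholder", "ethanol", "boiling point", 78.4, "degC"
);

// Serialize to JSON so other software can read it without parsing the paper.
const serialized = JSON.stringify(finding, null, 2);
console.log(serialized);
```

In practice you would likely want a shared vocabulary and stable identifiers rather than free-text field values, so that statements from different articles can be combined.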

So, the success of Sci-Hub is by no means unexpected. It is not really the end goal I have in mind, and in many ways it contradicts what I want. But the research community thinks differently, clearly. Oh wait, not just the research community, but the current civilization. The results of the Bohannon analysis of the Sci-Hub access logs I just linked to clearly show this. There are so many aspects, and so many interpretations and remaining questions. The article rightfully asks: is it need or convenience? I argued recently that the latter is likely an important reason at western universities, and that it is nothing new.

This article is a must read if you care about the future of civilization. Bonus points for a citable data set!

Bohannon, J. Who's downloading pirated papers? Everyone. Science 352, 508-512 (2016).
Elbakyan, A. & Bohannon, J. Data from: Who's downloading pirated papers? Everyone. (2016).

Sunday, April 24, 2016

Programming in the Life Sciences #22: jsFiddle

My son pointed me to jsFiddle, which allows you to edit JavaScript snippets and run them. I had heard of it before, but never really found time for it. But I'm genuinely impressed with the stuff he is doing, and finally wanted to try sharing JavaScript snippets online, particularly because I had to update the course description of Programming in the Life Sciences. In this course the students work with JavaScript and there are a number of examples, but those have a lot of HTML boilerplate code.

So, here's the first of those examples, but stripped of most of the things you don't need, and with some extra documentation as comments:

Saturday, April 23, 2016

Splitting up Bioclipse Groovy scripts

Source: Wikipedia, CC-BY-SA 3.0
... without writing additional script managers (see doi:10.1186/1471-2105-10-397). That was what I was after. I found that by using evaluate() you can load additional code. The only requirements are that you wrap stuff in a class, and that the filename matches the class name. So, you put stuff in a class SomeName and save that in a Bioclipse project (e.g. SomeProject/) with the name SomeName.groovy.

That is, I have this set up:


Then, in this aScript.groovy you can include the following code to load that class and make use of the content:

  someClass = evaluate(
    new File(

Maybe there are even better ways, but this works for me. I tried the regular Groovy way of instantiating a class defined like this, but because the Bioclipse Groovy environment does not have a working directory, I could not get that to work.

Tuesday, April 05, 2016

Still a draft: The Amsterdam Call for Action on Open Science

It was on the agenda: "Presenting the Amsterdam Call for Action". However, a day of hard work by some 300 participants of the Dutch Presidency's meeting on Open Science did not allow for the draft to be finalized today. Instead, the editors will work for the next 24h (a bit less by now) on a new draft that will be sent around to the participants, who will then have about a week to send in further comments.

There was plenty of feedback given on the draft indeed, and followers of my blog and twitter account know how much they already got from just me. It will be a busy 24 hours for the editors. I am really looking forward to the next draft they come up with. BTW, it is not clear yet if I will be able to share the draft that I get tomorrow. We'll see. At least I will tweet about whether or not my main points got addressed.

Meanwhile, the Dutch VSNU sent out a press release that "[they are] pleased with European action plan Open Science". That the plan had in fact not been released yet suggests a few things:

  1. they anticipated the draft was a done deal (which aligns with the lack of openness around the draft);
  2. automated sending out of press releases is a bad idea.

The Amsterdam Call for Action on Open Science #2: 10% Open Science

Some of the discussions here are about data sensitivity, privacy, etc. Excellent points! One confusion that should be put aside is the idea that you cannot be an open scientist if you do not release everything. That is nonsense, FUD perhaps.

The Amsterdam Call for Action on Open Science may very well set some goals; I do not think it currently does. It may set a goal of 10% by the end of this year (already made, probably), 20% at the end of 2018, and 30% at the end of H2020. You can read this in many ways, and here too, things are very vague.

After yesterday I herald this approach! Would it not be brilliant if at the end of this year all scholars output 10% of their research as Open Science?! Go for it!

The Amsterdam Call for Action on Open Science

First, it is very regrettable that the participants of #EU2016NL #openaccess did not get access to The Amsterdam Call for Action on Open Science document (read it in this Trello) until minutes before the break out sessions. So, I am still reading it (it's 21 pages) while in the Innovation session, which will only discuss actions #8 and #9.

Here's the first page:

All participants split up in various break out sessions, and I ended up in the Innovation session. However, I find the description of Open Science insufficient for me to understand the proposed twelve actions. It does not define the core values of Open Science:

You can see my comments already. And really, this is critical: all the actions make assumptions and do not define things, which causes the problem that nothing is actionable, because you can basically do anything and still comply with the action. (In fact, another issue is that several actions are already being undertaken, but that's for later).

My recommendation is to rephrase the first sentence into:

"Open science is an umbrella term for a technology and data driven systemic change in how knowledge dissemination works and how researchers work, collaborate, share ideas and disseminate results, by adopting the core values that knowledge should be reusable, modifiable and redistributable. This allows us to address the increasing demand in society to address societal challenges of our time."

These are the core values implemented by Open Data and Open Source. Sadly, they are not commonly implemented by Open Access, causing a lot of confusion in the latter area, which has been very clear at this meeting too.