Saturday, August 30, 2014

On Open Access in The Netherlands

Yesterday, I received a letter from the Association of Universities in the Netherlands (VSNU, @deVSNU) about Open Access. The Netherlands is, for research, a very interesting country: it's small, meaning we have few resources to establish and maintain high-profile centers. We also believe strong education benefits from distribution, so we have many good universities, rather than a few excelling universities. Mind you, this obscures the fact that we absolutely do have excelling research institutes and research groups; they just are not concentrated in one university.

Another important aspect is that all those Dutch universities are expected to compete with each other for funding. As a result I have experienced rather interesting collaborations between universities. That's a downside of a small country: everyone knows each other, often in way too much detail. But my point is that the Dutch can be rather conservative. That kills innovation, and is in my opinion a key reason why we are not breaking into the top 50 of rankings, more so than the lack of concentration. Concentration of funding in Top research institutes has not been extensively evaluated, but I think the efficiency is not proven higher than previous funding approaches.

Anyway, this letter I received is part of their Open Access program. Here too, the Dutch universities are conservative (well, relative to my views, at least). Now, the Open Access debate is not so interesting, because it primarily ends up being about who pays whom (boring) and whether we should go gold or green (beside the point, see below), and, sadly, here too many people think about who pays whom again (still boring).

Therefore, given the outlined importance and impact of Dutch research, I found it relevant to post about the progress of Open Access in my small country. The letter is available in English.

Basically, the letter is an answer to an earlier letter from our government about Open Access, and it warns about actions that will soon be undertaken (so, not really pro-active). However,
    "[they] are also appealing to you to continue to advocate free access to your own scientific publications."
Well, I have, though not so actively, and maybe this post can be the start of a change. Because what basically bothers me is that the Open Access discussion, also in The Netherlands, is biased. And indeed, the letter continues with a section about gold and green access. If the VSNU really wants to promote free access to research, it should not even accept green. We all know that it is not about being able to look at research (free), but about being able to mix and improve. Reuse. Continue. Stand on shoulders. The fact that this letter focuses on publications only, and does not spend a word on reuse, is rather depressing and does not give me even the slightest hint that The Netherlands will break into that Top 50 any time soon.

Overall, the letter is relatively positive for the Open Access movement, though reactive. They still have some explaining to do:
    "The golden route is more complex. However, many believe that in the end it is a
    more sustainable route to Open Access."
(Or maybe readers can explain to me what is complex about the golden route?)

The following is a rather interesting section, but it would have been more so had they focused on Open Access in its pure form, the form that allows research reuse. I think it now leaves you with a low starting point when bargaining with resistant publisher lawyers and managers who have long lost sight of the interests of academics in favor of those of the shareholders:
    For the past ten years, publishers have been offering journals in package deals referred to as Big Deals. Shortly negotiations with the major publishers about these Big Deals will take place, including Elsevier, Springer and Wiley. The Dutch universities have expressed their wish to make agreements with these publishers about the transition to Open Access as part of those Big Deals. Universities expect publishers to take serious steps to facilitate that transition.
I hope the VSNU will clarify what they mean by "serious". Because the publishers all came up with "me too" solutions (setting up new OA journals) without seriously changing their model. No large publisher dared to make its flagship journals full gold Open Access. That is serious business; all we see now is scribbling in the margin.

Perhaps that is the reason for the wish to be in the top 50. Maybe the VSNU just wants a better bargaining position.

The letter ends with what researchers can do. And with that, they are spot on:
    As a researcher, you can play a vital role in the transition to Open Access. We have 
    mentioned the possibility of depositing articles in the repository of your own
    university. But there is more. It’s important to consider that researchers play a key 
    role in the publishing process: as providers of the scientific content, as reviewers 
    and as members of editorial and advisory boards. We hope that where ever possible, 
    you will ask publishers to convert to an Open Access model.
What any researcher can already do to promote (proper) Open Access:

  1. stop reviewing and publishing closed-access papers (you have way too many review requests already, and some filtering will not hurt you)
  2. stop reviewing for, and publishing in, non-gold Open Access journals (a step further than the first item)
  3. submit only to full-gold Open Access journals (plenty of options; importantly, the quality and impact of your paper is not dependent on the journal, but on you. If not, you're just a bad author and researcher and should go back to school or start learning from feedback on your Open Notebook Science, so that you improve your act before you submit; really, it happens to the best of us: multidisciplinary research is hard: you cannot excel in biology and chemistry and statistics and informatics and computer science and data analysis and materials science and be a perfect and creative linguist as well (well, not all of us, anyway))
  4. put your previous mistakenly closed-access papers in university repositories (most Dutch universities have solutions; not all yet)
  5. make previously published closed-access papers gold Open Access (yes, you can! I am in the process of doing this for the CDK I paper, and other ACS papers will follow)
  6. get an ORCID
  7. use #altmetrics to see that gold Open Access gives you more impact for your papers too (service providers include ImpactStory, Plum Analytics, etc)
Of course, it is not only about publications. Again, the VSNU would do well to learn that research is not the same as publications. Besides sending letters, I think the VSNU can do this to promote Open Science, which is what I hope they are after:
  1. negotiate with the government and major science and funding agencies (KNAW, NWO) to stop focusing on publications as primary output
  2. start focusing on output other than publications (e.g. data sets, software) even if you have not ended negotiations with others, just to set a proper example
  3. make research outcomes machine readable (read this interesting post from our national library)
  4. actively explore business models around Open Science (and not have your universities' spin-off departments only know about patent law and ignore the rest of the world)
  5. adopt the ORCID nationwide, starting Jan 2015
  6. start using #altmetrics to get a better perspective of the performance of your members
Of course, I am more than willing to help the VSNU with this transition. I can be reached at the Department of Bioinformatics - BiGCaT, NUTRIM, FHML, Maastricht University. There are many options I have missed here (like data repositories, data citing, DOIs, and whatever).

PS. my ImpactStory profile will tell you that more than 80% of my publications are Open Access. Not all gold yet, but I am working on changing that for some old papers.

Tuesday, July 22, 2014

Open Notebook Science ONSSP #1:

As promised, I slowly set out to explore ONSSPs (Open Notebook Science Service Providers). I do not have a full overview of solutions yet but found LabTrove and Open Notebook Science Network. The latter is more clearly an ONSSP, while the former seems to be the software.

So, my first experiment is with Open Notebook Science Network (ONSN). The platform uses WordPress, a proven technology. I am not a huge fan of the setup, which has so many features that it is sometimes hard to find what you need. Indeed, my first write-up ended up as a Page rather than a Post. On the upside, there is a huge community around it, with experts in every city (literally!). But my ONS is now online and you can monitor my Open research with this RSS feed.
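For those who would rather monitor such a feed from a script than from a feed reader, here is a minimal sketch in Python. It assumes a plain RSS 2.0 layout (items with title, link and pubDate elements). The sample feed text and the function name latest_entries are made up for illustration and are not taken from the ONSN platform.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real notebook feed would be downloaded first.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example notebook</title>
    <item>
      <title>First entry</title>
      <link>http://example.org/notebook/1</link>
      <pubDate>Tue, 22 Jul 2014 10:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Second entry</title>
      <link>http://example.org/notebook/2</link>
      <pubDate>Wed, 23 Jul 2014 09:30:00 +0000</pubDate>
    </item>
  </channel>
</rss>"""


def latest_entries(feed_xml, limit=5):
    """Return up to `limit` feed items as dicts, in document order."""
    root = ET.fromstring(feed_xml)
    entries = []
    for item in root.iter("item"):
        entries.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
        })
    return entries[:limit]


if __name__ == "__main__":
    for entry in latest_entries(SAMPLE_FEED):
        print(entry["published"], "|", entry["title"], "|", entry["link"])
```

Items are returned in the order the feed lists them; whether that is newest-first depends on the feed, so a real monitor would probably parse and sort on the date.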

One of the downsides is that the editor is not oriented towards structured data, though there is a feature for Forms which I may need to explore later. My first experiment was a quick, small hack: upgrade Bioclipse with OPSIN 1.6. As discussed in my #jcbms talk, I think it may be good for cheminformatics if we really start writing up step-by-step descriptions of common tasks.

My first observations are that it is an easy platform to work with. Embedding images is easy, and there should be options for chemistry extensions. For example, there is a Jmol plugin for WordPress, there are plugins for Semantic Web support (no clue which one I would recommend), and extensions for bibliographies are available too, if I am not mistaken. And we also already see my ORCID prominently listed, and I am not sure if I did this, or whether the ONSN people added this as a default feature.

Even better is the GitHub support @ONScience made me aware of, by @benbalter. The instructions were not crystal clear to me (see issues #25 and #26), but after some suggested fixes (pull request #27) it started working, and I now have a backup of my ONS at GitHub!

So, it looks like I am going to play with this ONSSP a lot more.

Friday, July 18, 2014

Open Notebook Science: also for cheminformatics

Last Monday the Jean-Claude Bradley Memorial Symposium was held in Cambridge (slide decks). Jean-Claude was a remarkable man, and I spoke at the meeting about several things, including how he made me jealous with his Open Notebook Science work. I had the pleasure of working with him on an RDF representation of solubility data.

It took me a long time to group my thoughts and write the abstract I submitted to the meeting:
    I always believed that with Open Data, Open Source, and Open Standards I was doing the right thing; that it was enough for a better science. However, I have come to the realization that these features are not enough. Surely, they aid Open collaborations, though not even sufficient there, but they fail horribly in the "scientific method." Because while ODOSOS makes work reproducible, it lacks the context needed by scholars to understand what it solved. That is, it details out in much detail how some scientific question is answered, but not what question that was. As such, it fails to follow the established practices in scholarly research. In this presentation I will show how I should have done some of my research, and ponder on reasons why I had not done so.
And it also took me a long time and a lot of stress to get together some slides, but I managed in the end.

During the talk I promised to start doing Open Notebook Science (ONS) for my research, and I am currently exploring ONS platforms.

The meeting itself was great. There was a group of about 40 people in Cambridge and another 15 online, and most of them into Open Science or at least wanting to learn what it is about. I met old friends and new people, including a just-graduated Maastricht Science Programme student (one that I did not have in my class last year). Coverage on Twitter was pretty good (using the #jcbms hashtag, an archive) with some 90 people using the hashtag.
Several initiatives seem to be evolving, including an ONS initiative and a memorial special issue. All these will need help from the community. The time is right.

Sunday, July 06, 2014

#JChemInf Volume 5 as PDF on @FigShare

One of the things I do to prepare for holiday is get some reading stuff together. I haven't finished Gödel, Escher, Bach yet (a suggestion from the blogosphere), with a bit of luck there are new chapters of HPMOR, and I normally try to catch up with literature. One advantage of Open Access is that you can remix. So, I created a single PDF of all JChemInf Vol. 5 articles (last year I did volumes 1, 2, 3, and 4). This PDF is about 75 MB in size, and therefore fits on most smartphones. The PDF has an index without entries for each paper, but jumping from abstract to abstract works fine. It has a bit over fifty peer-reviewed papers.
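Adding per-paper index entries mostly comes down to knowing on which page each article starts in the merged file. Below is a minimal sketch of that bookkeeping in Python, assuming the page count of each article is already known. The function name outline_entries, the titles and the page counts are invented for illustration, and actually writing the entries into the PDF would need a PDF library that is not shown here.

```python
def outline_entries(papers):
    """Map each (title, page_count) pair to (title, first_page).

    Pages are 1-indexed and the papers are assumed to be concatenated
    in the given order, without any cover or separator pages.
    """
    entries = []
    next_page = 1
    for title, page_count in papers:
        if page_count < 1:
            raise ValueError(f"page count for {title!r} must be positive")
        entries.append((title, next_page))
        next_page += page_count
    return entries


if __name__ == "__main__":
    # Hypothetical titles and page counts, not the real Vol. 5 contents.
    papers = [("Paper A", 12), ("Paper B", 9), ("Paper C", 15)]
    for title, first_page in outline_entries(papers):
        print(f"{first_page:>4}  {title}")
```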

Another advantage of Open Access is that you can reshare. And so I did, and the volumes are available from FigShare:
  1. JChemInf Vol.1
  2. JChemInf Vol.2
  3. JChemInf Vol.3
  4. JChemInf Vol.4
  5. JChemInf Vol.5
Of course, a clear downside is that it interferes with #altmetrics. And I am wondering if a similar thing can be done with ePubs.

Saturday, July 05, 2014

Journal Open Data Guidelines: plenty of room for clarifications

J. Gray, Wikipedia. CCZero.
Several journals are playing with statements about Open Data, and, for example, F1000Research and require Open Data. Just as publishers are judged on their implementation of Open Access, so should we critically analyze journals that claim to be an Open Data journal. Well, such claims I have not seen, but some journals have promising statements, like:
BioMed Central
    Data associated with the article are available under the terms of the CCZero.
However, this claim is vague, or, at least, too vague for a paper I am currently reviewing. The fuzziness lies in the word "associated". What defines associated data? How does this relate to reproducibility? If the purpose of Open Data is that the results of the paper can be reproduced, does it mean all data? And what happens if some of the data is from a previous paper? Or from a proprietary database? Is a paper that uses data from a proprietary database in key steps of its argumentation acceptable to a journal that demands Open associated Data? What if the authors do not have control over the license? Or is it limited to new data? But what defines new data here? Because that is a really hard question in an era where data has very limited provenance (versioning, author attribution, etc).