Monday, October 14, 2019

ChemCuration 2019 Poster Conference: Call for Posters

Twitter profile.
It giet oan! That is a Frisian phrase for something unlikely that is going to happen after all, particularly related to the Elfstedentocht.

ChemCuration 2019 is a go. The website is online, the Twitter account and hashtag are ready, we got a poster prize, and here is the call for posters!

    On December 3 the first ChemCuration conference will take place. ChemCuration 2019 is a one day, online-only conference around data curation and curated data in the chemistry domain. During the entire conference day, you can participate by tweeting about the poster that you uploaded, along with the meeting hashtag, and responding to questions about your poster in the 24 hours of the conference day. The poster must be available in an online repository (e.g. Zenodo or Figshare) under the CCZero, CC-BY or CC-BY-SA license prior to the conference.

    This is the meeting scope: anything around data curation and curated data of open science data in chemistry. This includes but is not limited to: 1. a new release of curated open data; 2. FAIR metadata around open data; and 3. open source tools for data curation.

    How do I participate in ChemCuration?
    You can participate in this online poster conference by presenting your poster on Twitter
    during the conference day. You do this by first archiving your poster via Figshare or Zenodo,
    with an open license (e.g. CCZero or CC-BY). Then, during the day you tweet an image of
    (part of) your digital poster with the #chemcur2019 hashtag, a short summary, and a link to
    your online poster with its DOI. The archived poster should be a regular A0 poster (WxH =
    841 x 1189 mm or 33.1 x 46.8 in)

    Do I need to register?
    Registration is not obligatory to participate. However, if you would like to be eligible for a poster prize, then registration is required, by Nov. 30th, 2019. The registration form is found at

    More information can be found on the website and on Twitter.

Wednesday, October 09, 2019

ChemCuration: a small trick to fix the SMILES of glucuronides

Glucuronide functional group.
With the ChemCuration 2019 online poster conference nearing, my upcoming talks about chemistry in Wikidata (also needing curation), and the much longer process of curating metabolite(-like) structures in WikiPathways, I decided that something I tweeted earlier this week is actually quite useful, and therefore something I should really write up in my lab notebook.

Glucuronide is an example of a (biological) functional group. And there are several databases that do not always represent its stereochemistry correctly. That is an interoperability (and thus FAIR) problem. Correcting this is not trivial, particularly if you have to redraw the same glucuronide group again and again.

So, not looking forward to that, I invested a bit of time to find a SMILES trick. What if I had a SMILES snippet that I could easily copy/paste and attach to the SMILES of the chemical structure it is attached to? Here goes.


I just realized that the original 3 I used can better be a 9, which is less likely to occur in the SMILES of the rest of the molecule. The period at the end is also deliberate. That way, I can just copy/paste the SMILES of the rest directly after that period. Then I get a disconnected structure, but I only have to put a 9 next to the atom that is binding to the glucuronide. So, let's say the R group is methane, I get:


Now, next stop: CoA and other common biological tags.
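The copy/paste trick described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `FRAGMENT` below is a stereo-free stand-in for a glucuronide group that I wrote for this example (it is not the curated snippet from the post, which carries the stereochemistry), and the function name `attach` and its parameters are hypothetical. It relies only on the SMILES rule that a ring-closure digit may bond two parts separated by a period.

```python
import re

# Hypothetical, stereo-free stand-in for the glucuronide snippet (an assumption,
# NOT the curated SMILES from the post). The open ring-closure label 9 sits on
# the oxygen that binds to the rest of the molecule; the trailing period lets
# the rest be pasted directly behind it.
FRAGMENT = "OC(=O)C1OC(O9)C(O)C(O)C1O."

# One SMILES atom token: a bracket atom, a two-letter halogen, or a
# one-letter organic-subset atom (aliphatic or aromatic).
ATOM = re.compile(r"\[[^\]]+\]|Cl|Br|[BCNOPSFIbcnops]")


def attach(fragment: str, rest: str, atom_number: int = 1, label: str = "9") -> str:
    """Paste `rest` after `fragment` and close the bond labelled `label`
    on the atom_number-th atom (1-based) of `rest`."""
    if not fragment.endswith("."):
        raise ValueError("fragment must end with a period")
    if label in rest:
        # Conservative check: the label must not already occur in the rest.
        raise ValueError(f"label {label} already occurs in the rest")
    atoms = list(ATOM.finditer(rest))
    if not 1 <= atom_number <= len(atoms):
        raise ValueError("atom_number is out of range")
    end = atoms[atom_number - 1].end()
    # Put the ring-closure digit directly after the chosen atom.
    return fragment + rest[:end] + label + rest[end:]


print(attach(FRAGMENT, "C"))
```

Choosing a different `atom_number` moves the 9 to another atom of the pasted structure, which is the only manual step the trick requires.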

Sunday, September 29, 2019

Newspaper "De Volkskrant" trashtalks the Dutch research funder NWO

Headline: "This astronomer took a photo of a black hole, got famous, and has no research funding now."
Update: there are two articles, one a news item and the other the interview. I uphold my comments here. They are clearly co-published and should be seen as one event (another intended pun).

I find it very disturbing what role newspapers have in the selection of research to fund. Of course, most of this is indirect, but claiming that a newspaper decides what is fundable and what is not, is crossing a line. It all started with this interview: Waarom de Nijmeegse astronoom die een foto maakte van een zwart gat nu zonder geld voor onderzoek zit (English: "Why the Nijmegen astronomer who took a photo of a black hole is now without money for research"), or Deze astronoom maakte een foto van een zwart gat, werd wereldberoemd en zit nu zonder onderzoeksgeld (English: "This astronomer took a photo of a black hole, became world famous, and is now without research funding"). Interestingly, headlines change arbitrarily and De Volkskrant frequently changes headlines for, I can only assume at this moment, clickbait purposes. This article had at least two headlines, something I will come back to later.

At first, I ignored the article. It is just an interview and the science corner of De Volkskrant frequently has opinions, columns, etc, and the amount of science news is too low anyway (IMHO).

But then there was apparently an uproar on Twitter, which I had missed, but a tweet from Daniël Lakens caught my eye (update: this tweet replies to a tweet about the interview, not the news item; although both were published in parallel and are closely linked, I disagree with the argument that the two are not connected because they are different genres):

As said, it's not a news item, but an interview. By putting it in the Science corner of the newspaper, it is upgraded to news. And Daniël is quite right: not getting a grant is not news. And then we get to the gist of the uproar: why is this rejection so special that it deserves to be so prominently published in a national newspaper? Apparently, the research is too big to fail. The chance of success too high. The research internationally too glossy to not get funded.

And that puts De Volkskrant in a new role: not covering news, but lobbying for a certain Dutch research agenda. Of course, it's an open secret that this is how it works, but you can be honest about it. Falcke's research is world famous because the newspaper made it.

Now, the reason why this prominent place of this interview is disturbing is a bit complex. The issue at hand has many angles, and the aforementioned positioning of an interview as news is one of them. I will try to cover a few more of them.

The grant model
First, Falcke is not the first to see "excellent" (sic, see this) proposals rejected. It is common knowledge that this is how it works. There is not enough research funding (The Netherlands has been underspending for years now, compared to international agreements, though this problem is bigger). Research is not done efficiently. Etc, etc. But just an excellent research proposal is not enough anymore. Only a percentage of excellent proposals get funded (I guess about ~10-15% overall, while at least ~20% (rough guess) of all proposals are not significantly different from the top-ranked proposal). That's a fact, and I leave it as an exercise to the reader to look up the appropriate primary literature. De Volkskrant could have done that. I'm looking forward to their investigative journalism article about that.

In this respect, the interview makes a superb sensational story of what research life is nowadays. Not just for Falcke, but for basically every researcher in The Netherlands. A recent study showed that most researchers work overtime. Personally, I have been working as much as a full professor while only having an assistant professor position. To be fair, part of that goes into outreach, but how else do you get "headhunted" by a newspaper as being, well, what, important?

Red flags
The interview has a number of red flags for me. Things that trigger some frowning about the necessity of this interview. I'll put in some quotes, translate them to English, and comment.

".. qua werkdruk en stress de heftigste periode uit mijn loopbaan"

English: ".. in terms of workload and stress the toughest period in my career". This is not news. This is a known problem, and the reason why we have #WOInActie in The Netherlands.

"Want zonder geld om – bijvoorbeeld – jonge onderzoekers in dienst te nemen, kun je geen wetenschap beoefenen."

English: "Because without funding to - for example - hire young researchers, you cannot do research." Also not new, but that's not why I bring this up. I bring it up because this is sad and something where we are currently in a crappy situation: everything seems to be based on grant proposals. Why does the interview seem to trash the Dutch research funder NWO, when it could just as well have trashed the Radboud University for not providing the group with funding and making it dependent on low-chance external funding? This too is a long story, but when I did my chemistry degree and, after that, my PhD in chemistry at this same university, each group always had (at least, as far as I could see) a PhD candidate and post-doc. You could do research without grant funding, but simply not as much.

"de vorige drie aanvragen die we voor dit project hadden ingediend, waren door NWO ook afgewezen."

English: "the past three proposals for this project were also rejected by NWO." Now, that's something. But again, not new. Ever since I started research, more than 20 years ago, this has been the world I have been living in. Some research topics are hot, others are not. The fact that black holes were hot in the past (no pun intended), does not entitle any researcher to unlimited funding in the future. Hotness of topics comes and goes. It is disturbing that De Volkskrant seems to claim it can decide what is hot and what is not. In fact, of course, they do: put something on the front page, and it becomes hot. But De Volkskrant has no reason to complain if a research funder disagrees.

That is disturbing, a newspaper that decides what is important and what is not. At best, the newspaper presents cases in a proper context, and leaves the conclusion to the reader. Pointing fingers at a funder because your pet project did not get funding this time. Very disturbing. I only have to look at the UK to see where that ultimately leads.

"Waarom zou je de bal dan afleggen op iemand die (nog) niet in scoringspositie staat?"

Okay, I'll translate this to the meaning, instead of literally. Translate the full paragraph to see where this context came from. English: "Why would one fund research that does not (yet) have a chance of succeeding?" This quote comes from the interviewed researcher directly. This is very naive of this seasoned professor, if you ask me. It degrades him to a whining boy, because the teacher in kindergarten said that now it was time for someone else to play with the toy too.

But there is a second dimension to this, one that has been discussed for years and years, and it is unbelievable the interviewer cannot respond with something better than "Can you explain?". Of course, what I'm hinting at is the "winner takes all" approach which greatly reduces the diversity of research. There is no sound evidence this is how you innovate or benefit society. We all know that many research breakthroughs were not predicted, resulted from chance, etc. Very complex question.

However, the bottom line, as any serious funder like NWO knows, is that you also need to fund promising research. In fact, the European Research Council was set up specifically for that. The newspaper knows that and Falcke knows that. This schwalbe is also disturbing and very bad journalism (pun intended).

"de koevoet die we nodig hadden om serieus te worden genomen door onze Amerikaanse collega’s"

English: "the lever we needed to be taken seriously by our American colleagues". I don't think I have to explain the problems with that. Again, there is zero news here. Yes, international collaboration is important. Yes, many Dutch researchers with international collaborations also see grants get rejected. No, the prestige of your collaborators is a thing but does not make your research more important. Etc, etc. This line of trash talk against NWO goes on (how dare they not fund this international collaboration):

"We zitten aan tafel met topuniversiteiten als MIT in de VS"

English: "We are sitting at the table with top universities like the MIT in the U.S.A.". Again, nothing new. Many Dutch researchers collaborate with people at top universities. But it is totally irrelevant. There are so many angles here, that that perhaps explains why a nonsense quote like that made it into this article. First, these top universities. "Top" here is subjective (enough literature about that). Partly, they are "top" because they are "large". Then you see the problem: working with the largest comes easy: there are simply many researchers there to work with, and all of them are eager to work with you, because it adds to their prestige (it makes them even larger).

Size matters, you can argue. This argument has been frustrating research funding for many years now. Grants are rejected because your IF scores are not large enough, or because your list of publications is not inflated by having many people working in your group. Etc, etc. This is a can of worms so large, YOU CANNOT SPELL RED FLAG BIG ENOUGH.

And that recursion. Oh, sigh. Now, just for a second, couple this argument (no, not really) to the previous one. It effectively says: if it is big it should get bigger, and if it is not big (yet), why should it get bigger. Well, there is enough written in economics about diversification, and I am not an economist, but I hope you see my point.

Okay, there is so much more, but I have a grant proposal to write on my Sunday, just like every other researcher in The Netherlands. One last one, related to this:

"Een goed voorstel schrijven voor de EU is iets waarmee je twee maanden fulltime bezig bent."

English: "Writing a good EU proposal takes two months fulltime." I cannot disagree with that. Been there, done that. With a success rate of about 15% this means, on average, about a year of full-time writing for one grant. Falcke should consider himself lucky he has a position where he can take this risk. Most ECRs that are fighting for funding to continue their groundbreaking research do not get the opportunity to free up their schedules. That explains why research money results in more research money (see that recent study on the NWO grant system).
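The back-of-the-envelope number above can be checked with a tiny calculation. It is a sketch under a simplifying assumption that is mine, not the interview's: every proposal is an independent attempt with the same success rate, so the expected number of proposals until the first grant is one divided by that rate. The function name is made up for this illustration.

```python
def expected_writing_months(months_per_proposal: float, success_rate: float) -> float:
    """Expected full-time writing months until the first funded proposal,
    assuming independent proposals with a constant success rate."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    # Expected number of attempts is 1 / success_rate (geometric distribution).
    return months_per_proposal / success_rate


months = expected_writing_months(months_per_proposal=2, success_rate=0.15)
print(f"{months:.1f} months")  # prints: 13.3 months
```

So two months per proposal at a 15% success rate indeed comes down to a bit over a year of writing per funded grant.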

Nothing to see here. Keep calm, and move on.

Oh, George van Hal, I hope to meet you at the next #WOInActie strike. I can introduce you to a few friends who also regularly work hard, on Sundays, and also see proposals rejected. Van Hal, maybe you found this interview news, but I find it hard to say "even goede vrienden" (English: "still good friends"), because this framing of a rejected grant hurts science, and therefore you hurt me. I hope this post explains a bit why that is the case. I'd love to continue talking about it, even if I am not a hotshot black hole photographer.

Tuesday, September 24, 2019

new paper: "The metaRbolomics Toolbox in Bioconductor and beyond"

Forget about Python being the prime data analysis platform: there are plenty of alternatives and R has been one of them. With CRAN, rOpenSci, and Bioconductor (doi:10.1186/gb-2004-5-10-r80) the platform has three efforts where you can publish your R work. I think of them as scholarly journals: the peer review is strong with them. Anyways, over the years I did my share of R coding (a good bit of my PhD is written in R) and contributed to a few R packages. Nowadays I don't do a lot of R coding anymore. (Sorry, genalg users: I know this package needs some serious love, and a huge thank you to those (like Michel Ballings) who have picked up the package!!)

But regarding the packaging, I still contribute my bits. For example, with rWikiPathways and BridgeDbR. So, I happily accepted the invitation to contribute to a paper that was published this week and outlines a ton of R packages that are used in the data analysis of metabolomics data: The metaRbolomics Toolbox in Bioconductor and beyond (doi:10.3390/metabo9100200), led by Jan Stanstrup and Steffen Neumann. And many R packages it discusses indeed! The paper is like an atlas, showing you around in an adventurous world of metabolomics, as is clear from this dependency graph of Figure 2:

CC-BY. Figure 2 from the article.

But there is more ongoing. The article, being CC-BY, is being rewritten as a book, and I have some work left to do to add BioSchemas to Bioconductor R package web pages and to get more packages to use BioSchemas in their package vignettes (so the ELIXIR TeSS can automatically pick them up), and there is some more awesomeness being discussed. Well, that's not there yet, but you can start reading this metaRbolomics bible.

Thanks to everyone involved!

Sunday, August 25, 2019

Finding potential reviewers using Scholia

First, if you would like to learn more about Scholia, check this list of previous posts.

Now, yesterday I had to invite reviewers for a submission to the Journal of Cheminformatics. This can be hard, and is harder when more authors are involved, from multiple institutes. Existing tools by publishers (including SpringerNature) do not excel at detecting possible conflicts of interest (CoIs). In fact, they already have trouble finding authors with expert knowledge. This is where I come in. But it's easy to overlook a possible CoI. Anecdotally, I once sent out a review request by accident to a reviewer sitting in the same corridor.

So, I want safety checks. The more, the better. Same institute/city? Better not. Published together in the past three years? Maybe. Currently collaborating? No one checks joint grants. Seriously, we rely on honesty from the reviewers (though open peer review would encourage that honesty even a bit more). But FAIR data can help us here. This is, for example, one reason why I am happy the journal now requires ORCIDs for all authors of a manuscript (see doi:10.1186/s13321-019-0365-4).

Finding potential reviewers using Scholia: a recipe
(originally published as this twitter thread)

So, I have a set of author ORCIDs of a submitted manuscript, and a list of potential reviewers... how do I know if any two on the two lists have recently worked/published together? First, I can make a WDQS query to get the Wikidata items for the ORCIDs (for a published article, not the submission):
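As a sketch of what such a lookup could look like (this is not the original query from the thread), the following Python builds a SPARQL query for the Wikidata Query Service that maps a list of ORCIDs to author items via the ORCID iD property (P496). The function name is hypothetical and the ORCIDs in the example are made-up placeholders, not the manuscript's authors.

```python
import re

# Shape of an ORCID iD: four blocks of four characters, the last may end in X.
ORCID_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def orcid_lookup_query(orcids):
    """Build a WDQS SPARQL query returning the Wikidata item for each ORCID.

    Relies on the wdt: prefix that the Wikidata Query Service predefines.
    """
    for orcid in orcids:
        if not ORCID_PATTERN.fullmatch(orcid):
            raise ValueError(f"not an ORCID iD: {orcid!r}")
    values = " ".join(f'"{orcid}"' for orcid in orcids)
    return (
        "SELECT ?author ?authorLabel ?orcid WHERE {\n"
        f"  VALUES ?orcid {{ {values} }}\n"
        "  ?author wdt:P496 ?orcid .\n"  # P496 = ORCID iD
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )


# Placeholder ORCIDs for illustration only.
print(orcid_lookup_query(["0000-0000-0000-0001", "0000-0000-0000-0002"]))
```

The printed query can be pasted into the query service; authors without a Wikidata item simply do not show up in the results, which is itself useful to know.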

I can extend this query to look up and summarize these authors in #Scholia with this query,

This Scholia link shows this page:

This Scholia page immediately shows me which articles these authors wrote together. I can now just add the Wikidata QID of the prospective reviewer and see what they co-authored and when... let's say I have Noel O'Boyle in mind as reviewer, I add ",Q28540731" to the Scholia URL and get,Q32565639,Q57415846,Q28540731:

I immediately see that Noel has not published together with the authors of this manuscript. Of course, I have to realize that Wikidata/Wikicite is not complete, but it at least gives me an extra safety check. Second, this also does not take into account if they work at the same institute, or have an academic history, as @Ben_C_J mentioned. It also ignores that Noel works at a company collaborating with PubChem, the project of the authors. For that, a different query approach is needed.
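Appending a candidate reviewer to the list of author QIDs, as done above, is easy to script. The sketch below joins QIDs with commas behind a base URL; the base URL shown is an assumption on my side (the current Toolforge location of Scholia's multi-author page, which may differ from the URL used when this was written), and the function name is hypothetical.

```python
import re

QID_PATTERN = re.compile(r"Q[1-9]\d*")

# Assumed base URL of Scholia's multi-author page; pass another one if needed.
SCHOLIA_AUTHORS = "https://scholia.toolforge.org/authors/"


def scholia_authors_url(author_qids, reviewer_qid=None, base=SCHOLIA_AUTHORS):
    """Return a Scholia URL for the authors, optionally with a reviewer added."""
    qids = list(author_qids)
    if reviewer_qid is not None:
        qids.append(reviewer_qid)
    for qid in qids:
        if not QID_PATTERN.fullmatch(qid):
            raise ValueError(f"not a Wikidata QID: {qid!r}")
    # Scholia takes the items as one comma-separated list in the path.
    return base + ",".join(qids)


# Two of the author QIDs mentioned above, plus the prospective reviewer.
print(scholia_authors_url(["Q32565639", "Q57415846"], reviewer_qid="Q28540731"))
```

Looping this over a list of candidate reviewers gives one page per candidate to inspect for shared publications.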

A final note, everyone can check if they are in Wikidata with this Scholia URL pattern:${ORCID} where you replace the last bit with some ORCID, e.g. for me: