Sunday, August 18, 2019

References, citations, and bibliographies. Oh, and tools and formats and APIs.

Give an explorer some tools, and they will study things, they will find new (or better) answers. Every scientist, boy/girl scout, teacher knows this. Give a kid a ball, and they will invent a game. Give them a magnifier, and they will explore a new world.

In the first hundreds of years of science, the instruments were physical things, and often the instrument was merely the human brain. In the past 30 years, electronic brains (aka software) have become an increasingly important instrument in science. Software is not judgmental or biased but, of course, only as good as its source code. So, in 1994 I got a new instrument: the Internet (yes, with a capital at the time). One of the things I did at the time was play with new instruments. For example, I played with DocBook. But DocBook did not have BibTeX. So, I wrote BibTeX for DocBook. I called it JReferences. It worked for me.

Give an explorer some tools, and they will study things. I got educated and became a scholar.

Now, one thing I love is to show people new instruments (which I do with this blog, for example) and to educate people in the tricks of doing research and being a scholar (~0.5 FTE of my day job). When Lars found an interesting topic, I only had to give him the tools and he would use them. And with time, he started developing new tools, new instruments. Now, to be fair, he's more dedicated than me, and the tool I want to blog about is so much better made than my JReferences :)

 Top half of the first PDF page of the article.
So, at some point I realized that it was worth writing it up, and I advised that. And he did. All I had to do was give him the instruments and explain some of the scholarly tricks, and he applied them very well, resulting in this PeerJ Computer Science publication: Citation.js: a format-independent, modular bibliography tool for the browser and command line (doi:10.7717/peerj-cs.214).

Give an explorer some tools, and they will study things, and they will improve our world.

So, Lars gave me a new instrument: citation.js. In the more than two years that the tool has existed, I have used it for two things: first, I used it on my website to give references of typical literature. Second, I use it for the books Groovy Cheminformatics with the Chemistry Development Kit and A lot of Bioclipse Scripting Language examples, as explained in this blog post.

Now, Lars had already implemented a number of feature requests I put in. The Altmetric logo was one of them, but also an ORCID plugin that will create a bibliography with just a short snippet of JavaScript and your ORCID identifier (oh, and a populated ORCID profile, of course).
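Because the ORCID plugin starts from nothing but an identifier, a mistyped identifier will not give you the bibliography you expect. The sketch below is not part of citation.js: it is a small, hypothetical helper (the name isValidOrcid is an assumption) that checks the shape of an ORCID iD and its ISO 7064 MOD 11-2 check digit before the identifier is handed to any plugin.

```javascript
// Hypothetical helper, not part of citation.js: validates the shape and
// the ISO 7064 MOD 11-2 check digit of an ORCID iD.
function isValidOrcid (orcid) {
  // expect four dash-separated blocks of four; the last character may be X
  if (!/^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$/.test(orcid)) return false

  const chars = orcid.replace(/-/g, '')
  let total = 0
  // the first 15 digits feed the checksum, the 16th character is the check digit
  for (const digit of chars.slice(0, 15)) {
    total = (total + Number(digit)) * 2
  }
  const result = (12 - (total % 11)) % 11
  const expected = result === 10 ? 'X' : String(result)
  return chars[15] === expected
}
```

For example, isValidOrcid('0000-0002-1825-0097') returns true for the fictitious example researcher used in the ORCID documentation, while changing the last digit makes it return false.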

He told me to use his template tool, and I gave it a try. I think I was an early adopter and the amount of documentation has improved since Friday, but with his help I wrote a plugin for PubMed identifiers. So, you can now simply put references in your webpages by just listing their PubMed identifiers (I used this tool to create a custom citation.js bundle with DOI, PubMed, and CSL support):

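<html>
<body>
<script src="./citation.js" type="text/javascript"></script>
<script>
// load the Cite class from the citation.js bundle included above
const { Cite } = require('citation-js')

// look up one PubMed identifier and render it as a Vancouver-style reference;
// call main() with a PubMed identifier to fill the placeholder below
async function main (pmid) {
  let example = await Cite.async(pmid)

  let output = example.format('bibliography', {
    format: 'html',
    template: 'vancouver',
    lang: 'en-US',
    // add the DOI of each entry after its formatted reference
    append ({DOI}) { return `doi:${DOI}` }
  })
  document.getElementById("placeholder").innerHTML = output
}
</script>
<div id="placeholder">
</div>
</body>
</html>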


Awesome! Give me some instrument, and I will try to find time to use it to study things. I think I'll be using citation.js in many projects in the coming years :) Note that the append() functionality can be used to add Altmetrics buttons or links to, say, EuropePMC. Well, just read his paper.
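As a rough sketch of that append() idea: the callback receives the entry for each reference, so it can return any HTML you like. The helper below is hypothetical (the name appendLinks and the link layout are assumptions, not something citation.js ships). It assumes the entry carries the standard CSL-JSON DOI and PMID fields, and it uses the public doi.org and Europe PMC URL patterns.

```javascript
// Hypothetical helper: builds the HTML that append() adds after each reference.
// Assumes `entry` is a CSL-JSON object with optional DOI and PMID fields.
function appendLinks (entry) {
  const links = []
  if (entry.DOI) {
    // link the DOI through the doi.org resolver
    links.push(`<a href="https://doi.org/${encodeURI(entry.DOI)}">doi:${entry.DOI}</a>`)
  }
  if (entry.PMID) {
    // link the PubMed identifier to the matching Europe PMC page
    links.push(`<a href="https://europepmc.org/article/MED/${encodeURIComponent(entry.PMID)}">Europe PMC</a>`)
  }
  // return an empty string when there is nothing to link to
  return links.length > 0 ? ' ' + links.join(' | ') : ''
}
```

To try it, pass append: appendLinks in the options given to format() instead of the inline append function.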

Give someone a kid, and they will be proud.

Sunday, August 11, 2019

Structure of colibactin elucidated

 Structure of colibactin.
Structure elucidation is still a thing. C&EN reported yesterday that a team has published the structure of colibactin (doi:10.1126/science.aax2685), previously not known, despite the major human health impact (cancer). Now, since the article did not seem to have a SMILES, InChI, InChIKey, or even an IUPAC name, I hope I redrew it correctly (see right). The manuscript and supplementary information are, btw, massive in experimental data. Sadly, little of that is FAIR :(

And because there is no open source IUPAC name generator, I cannot provide that either. But I've submitted the structure to PubChem, so hopefully we have the IUPAC name soon.

In the past I would have provided this info in my blog, but we now have Wikidata and Scholia. So, I created a new Wikidata item for the structure, with some initial info, like SMILES, InChI, and InChIKey (using Bacting, of course).
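Once identifiers like these are in Wikidata, anyone can look the compound up by its InChIKey. The sketch below only builds the query URL. The function name is an assumption, the InChIKey is left as a parameter, and it assumes the Wikidata properties P235 (InChIKey), P233 (canonical SMILES), and P234 (InChI) together with the public query.wikidata.org SPARQL endpoint.

```javascript
// Hypothetical helper: builds a Wikidata SPARQL query URL for a compound,
// looked up by InChIKey (P235), returning SMILES (P233) and InChI (P234).
function compoundQueryUrl (inchiKey) {
  // a standard InChIKey is 14 + 10 + 1 uppercase letters, dash-separated
  if (!/^[A-Z]{14}-[A-Z]{10}-[A-Z]$/.test(inchiKey)) {
    throw new Error(`not an InChIKey: ${inchiKey}`)
  }
  const sparql = `SELECT ?compound ?smiles ?inchi WHERE {
  ?compound wdt:P235 "${inchiKey}" .
  OPTIONAL { ?compound wdt:P233 ?smiles }
  OPTIONAL { ?compound wdt:P234 ?inchi }
}`
  return 'https://query.wikidata.org/sparql?format=json&query=' + encodeURIComponent(sparql)
}
```

The returned URL can be opened in a browser or passed to fetch(); the format check up front keeps arbitrary strings out of the query.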

The new publication does not seem to provide experimental physchem properties of colibactin, but before reading the article in detail, I get the impression they simply do not get to synthesize enough of the compound to do such measurements. They do provide NMR and MS data, though. A lot.

Colibactin is one of those compounds about whose biology a lot was already known, and there are some 42 articles in Wikidata that discuss the compound and its biological properties. I linked them to the new item for the compound and did some additional annotation, giving a nice Scholia page with a topic graph.

Sunday, August 04, 2019

Contributing to Climate Research?

As a chemist/biologist, my day-to-day work is not really related to climate research. Yet, the effects of the crisis are, of course. I have been pondering how I could contribute my small bits. And after some weeks, I realized that I could repurpose the Zika Corpus idea developed by Daniel Mietchen. And, of course, then there is our Scholia project, where annotations of research articles are visualized. So, given that the climate crisis is a truly global problem, I continued what others had started before me: annotating climate research articles with the region or location they are associated with. That way, you can look up the effects of the climate crisis in your own region.

Mind you, most literature is not annotated with main subject yet, let alone country. But that's at least something I can do (along with taking the train as often as possible, to replace the airplane). And you can join: here's the list of climate change articles without (additional) subject annotation. Another interesting annotation you can do: species.

Europe

Africa (part of it; it's a huge continent!)

U.S.A.
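A list of articles that still lack additional subject annotation can be approximated with a SPARQL query against Wikidata. The sketch below is only that, a sketch: the function name is an assumption, the topic is passed in as a Wikidata item identifier that you look up yourself, and the query assumes that "main subject" is recorded with property P921.

```javascript
// Hypothetical helper: SPARQL for works whose only "main subject" (P921)
// is the given topic, i.e. candidates for additional annotation.
function unannotatedArticlesQuery (topicQid, limit = 100) {
  if (!/^Q[1-9]\d*$/.test(topicQid)) {
    throw new Error(`not a Wikidata item identifier: ${topicQid}`)
  }
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error(`limit must be a positive integer: ${limit}`)
  }
  return `SELECT ?work ?workLabel WHERE {
  ?work wdt:P921 wd:${topicQid} .
  FILTER NOT EXISTS {
    ?work wdt:P921 ?other .
    FILTER (?other != wd:${topicQid})
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT ${limit}`
}
```

The resulting text can be pasted into the Wikidata Query Service; every work it returns is one where a region, location, or species can still be added.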

Nanoinformatics page in Wikipedia

This spring I contributed to a joint project, coordinated by the NanoWG, to write a Wikipedia article about nanoinformatics (funded by NanoCommons). I dived into digging up the history of the term nanoinformatics, and isolated a few early sources where the term was first used, coined if you like. At the same time, the page needed to give an encyclopedic summary of the research field. Thanks to everyone who contributed, in particular John, Mark, and Fred!

I think we succeeded quite well, and the page has become a rich source of literature, though far from exhaustive. If you want a longer list of nanoinformatics literature, then perhaps check out the Scholia page about nanoinformatics (and notice the RSS feed, to get informed about new nanoinformatics articles).

Saturday, July 13, 2019

Standing on the shoulders: but the shoulders are 200 years old

"Houston, we have a problem. We're standing on the shoulders of old scholars, but it feels a bit shaky."

Well, no wonder. While rocket science has clear foundations, the physical laws of nature, for many other research fields it's trickier. We rely on hundreds of years of knowledge and assume (not trust) that work to be true. And that knowledge is seemingly disappearing very fast (remember my graveyard of chemical literature observation). Published literature, generally, is too hard to reproduce to be seen as an accurate capture of research history. In other words, these shoulders are 200 years old, and our support is failing.

Open Science attempts to overcome these issues. It defines an environment where all research output is important, where everyone has access to shoulders, and trust can be replaced by reproducibility. This is a huge transition, ongoing for some 20 years now.

With my work as one of the two Editors-in-Chief of the Journal of Cheminformatics, I try to contribute to making this happen, sooner rather than later. It's not been an easy ride, and there is so much left to do. And I do not always agree well with the effort put in by Springer Nature here, as is clear from this reply.

 Figure 1 from the latest editorial.
But I am happy to work with Rajarshi, Nina, Matthew, and Samuel to support the Open Science community in chemistry, for example, by allowing publications that describe a piece of open source cheminformatics software (Software article type). We're limited by what BioMedCentral can offer us, but within that context try to make a change.

The journal has now existed for 10 years, as marked by our latest editorial. In it, we describe our adoption of GitHub as a free, extra service, where we fork source code published in our journal, and announce our adoption of the obligatory ORCID for all authors.

These things bring me back to those shoulders. The full adoption of the ORCID allows research to be more easily found (more FAIR) and the copying of the source code aims at making the shoulders on which future cheminformatics stands more solid. Minor steps. But even minor steps matter.

Let's see where our journal takes open science cheminformatics.

Oh, and since you are reading this, I would love to see the American Chemical Society be more open to Open Science too. Please join me in requesting them to join the Initiative for Open Citations.