Sunday, March 22, 2009

Journal of Cheminformatics: I hope the Instructions to the Authors improve

Besides Nature Chemistry, another journal was launched last week (see here and here): the Journal of Cheminformatics. First of all, congratulations to Chris and David for their efforts! Although the journal has published only one research paper so far, it has already found its place on Chemical blogspace. I have two things I want to blog about: data rich publishing, and starting the scientific communication.

Data Rich Publishing
Peter wrote a detailed blog post about why he joined the editorial board:
    I take this position with some trepidation as I have grave reservations about the current practice of cheminformatics. It suffers from closed data, closed source and closed standards, and thereby generally poor experimental design, poor metrics and almost always irreproducible results and conclusions which are based on subjective opinions.
I strongly agree with this observation, and have discussed my view on this in my thesis (send me an email if you want a copy).

So, what does the journal have to say about this (see the Instructions to the Authors, emphasis mine):
    Journal of Cheminformatics recommends, but does not require, that the source code of the software should be made available under a suitable open-source license that will entitle other researchers to further develop and extend the software if they wish to do so.
Regarding data, they are even less revolutionary; the recommended figure formats (EPS, PDF, PNG) focus on nice graphics instead of reuse of data. I also note that I cannot upload data in the Open Document Format, or, hey, let's really push things, in RDF. Well, not according to the Instructions. And surely, I can put the [O|R]DF in the supplementary information anyway. It would also be nice if I could use Jmol as an applet to enrich the graphics and improve the data reusability of the paper, like the RSC recently started to allow.

Regarding the supplementary information, there is a section on additional files, which, inconveniently, are capped at 20MB in size. There is no mention of chemical formats at all, nor any recommendation on semantic formats like CML (I wonder when this was discussed with the Editorial Board, and where Peter was at the time). How am I going to put my CML file with 500 molecular structures online now? (Though it's good to know it is virus scanned ;)

So, why do I vent my concerns about these limitations? I had not blogged about the launch of the journal earlier, because I had not made up my mind about it. On the one hand, I am happy to see a journal that promotes (scientific) use of papers, and a journal that allows me to keep copyright on the material. On the other hand, from what the current Instructions suggest, the data I could use from the papers is available only in an old-fashioned way. That's a lost opportunity; doing better here could have killed the competition for sure. Instead, the unique selling point is now restricted to using an open access license. Nature Chemistry, on the other hand, chose data rich publishing as a selling point (though in competition with things done at the RSC).

The other thing I want to mention about the journal is the following. Rajarshi blogged about Bachrach's paper on Chemistry publication - making the revolution (DOI:10.1186/1758-2946-1-2). Firstly, by adding a link like that for the DOI I just gave, Chemical blogspace can pick it up; we need this later. Secondly, the paper actually suggests that "[b]y publishing lots of data, available for ready re-use by all scientists, we can radically change the way science is communicated and ultimately performed"; this is in strong contrast to what I have seen in the Instructions so far.

Starting the Scientific Communication
Rich replied to Rajarshi about the requirement to log in before someone could make a comment, which he did not like. He suggested alternative ways to prevent spam and the like. The choice for this commenting approach may also stem from wanting an Open discussion, where everyone takes responsibility for what they say. The use of OpenID, as Rich suggests, would only partially address that; on the other hand, setting up a fake email address is quite common in the blogosphere too.

If Rajarshi had used the DOI to link to Steven's paper, as said, Chemical blogspace would have recognized it. Instead, he chose to link directly to the PDF. This is a typical case of hamburgers in action. However, others did use the DOI when they discussed the first research paper in the journal (DOI:10.1186/1758-2946-1-3). These blog posts were picked up by Cb and are listed on this page.
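To illustrate why the DOI link matters, here is a minimal sketch of how an aggregator could spot a DOI in a link. This is not how Chemical blogspace is actually implemented (I am assuming a simple pattern match); the function name extract_doi and the example.org URL are made up for illustration.

```python
import re
from urllib.parse import unquote

# Loose DOI pattern: "10.", a 4-9 digit registrant code, "/", then a suffix.
# This is an assumed heuristic, not the rule any real aggregator uses.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s\"'<>]+")


def extract_doi(link):
    """Return the DOI embedded in a link or citation string, or None."""
    match = DOI_PATTERN.search(unquote(link))
    if match is None:
        return None
    # Drop punctuation that often trails a DOI in running text.
    return match.group(0).rstrip(".,;)")


# A DOI-based link can be tied back to the paper ...
print(extract_doi("http://dx.doi.org/10.1186/1758-2946-1-2"))
# ... while a direct link to a PDF (hypothetical URL) carries no DOI.
print(extract_doi("http://example.org/content/pdf/article.pdf"))
```

With a DOI in the link, the blog post can be matched to the paper; with a bare PDF link, a matcher like this one has nothing to go on.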

Now, I only need to remind you of Userscripts for the Life Sciences (DOI:10.1186/1471-2105-8-487) to show that we have the methods to link these comments back to the journal website. The Quotes from Chemical Blogspace and Postgenomic script, in particular, does the hard work (it needs GreaseMonkey; the script can be downloaded here; see also Noel's original post). This way, we can read the comments when we visit the paper's homepage:

Now, the script has not yet been updated for the new journal (Noel, can you please upload the revision?), so for now you need to edit the source and add http://** to the list of websites the script acts on: