The Dutch company PANalytical has made their HighScore software available (some details in this README) for use in the Crystallography Open Database.

ICDD is rumored not to be amused by the contribution of the HighScore-based search functionality, and rumored to be claiming a breach of intellectual property. I have seen neither any ICDD patents nor the HighScore implementation, but clearly there is a conflict of interest.

BTW, Panton Principles endorsers may be interested in signing the petition for Open Data in crystallography too.

This weekend there was the really nice Science Commons Symposium, which I virtually attended, and there is an interesting discussion at FriendFeed on article level metrics.

Now, I just reported on the CDK functionality used in published research. Linking this to impact, the CDK, with 115 citations now (both papers combined, a nice increase from 2006), is not doing badly. But the real impact goes further than the direct citations.

I already gave a wordle of the titles of papers citing the first CDK paper. Below follow some additional statistics: the number of papers that use a particular CDK package (51). Now, these numbers are a bit rough, and surely any paper that uses the CDK is bound to use the IO or SMILES package too. Additionally, for 10 papers I was not sure what CDK functionality they used, so I assigned those to the root package.

Posted via email from Egon's posterous

This is the Wordle after I analyzed all papers citing the first CDK 1 paper:

Clearly, NMR is now less important, though overall it is still one of the more important use cases of the CDK. Chemical and molecular are important terms, which is no surprise considering that the molecule is the primary use case of the CDK right now.

The announcement of the Panton Principles is the big news today, though Peter already spoke about them in May last year (see coverage on FriendFeed and Twitter). The four principles, listed in their short versions:

When publishing data, make an explicit and robust statement of your wishes. Use a recognized waiver or license that is appropriate for data.

Two weeks ago, a paper by Peter Ertl was published about Molecular structure input on the web (doi:10.1186/1758-2946-2-1). In this paper, he discusses the state of things and describes his contribution to this field, the JME Molecule Editor. The article also cites the CDK, but only the website and not one of the two papers (doi:10.1021/ci025584y, or doi:10.2174/138161206777585274). This is not an isolated case, but a common pattern. In principle, the proper work is cited, and nothing is wrong.


In a series of SPARQL end points, I am happy to present a new Virtuoso 6.1-hosted SPARQL end point for the ChEMBL database (CC-BY-SA), at our group's new rdf.farmbio.uu.se server. The server is hosting 23.8M triples, with the data based on ChEMBL 02.
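
For those who want to try an end point like this from a script, here is a minimal sketch in Python of how a SPARQL query could be sent over HTTP. Only the host name comes from this post; the "/sparql" path, the function names and the example query are my assumptions for illustration, so check the actual end point address before using it.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the host is the one named in the post, the "/sparql" path is a guess.
ENDPOINT = "http://rdf.farmbio.uu.se/sparql"

# A generic query that just lists a few triples; it assumes nothing about the ChEMBL schema.
SAMPLE_QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"


def build_query_url(endpoint: str, query: str) -> str:
    """Encode a SPARQL query as a GET URL using the protocol's 'query' parameter."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def run_query(endpoint: str, query: str) -> dict:
    """Send the query and parse the JSON results. Requires network access."""
    request = urllib.request.Request(
        build_query_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `run_query(ENDPOINT, SAMPLE_QUERY)` should return the parsed results if the server is reachable at that address; `build_query_url` on its own works offline and is handy for pasting the URL into a browser.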