Thursday, April 28, 2011

Importing RDF input in R for analysis

In August last year I asked on BioStar how to import RDF into the R statistical package, and at the time nothing seemed to exist. Over the past few weeks I ported code I wrote for Bioclipse to create the rrdf package for R, which is now available from CRAN. RDF can be loaded from RDF/XML, Notation3, and Turtle files, using Jena, and the loaded data can be queried using SPARQL. I have not yet ported the remote SPARQL functionality we used in our recent paper, but that will make a nice alternative to using ChEMBL data in Pipeline Pilot or in Bioclipse :)

Friday, April 22, 2011

CDK 1.2.x hits Debian Unstable

Thanks to the work of Onkar Shinde, the Debian package for CDK has been updated from 1.0.2 (in Squeeze) to 1.2.7 in unstable. The popularity contest shows some 140 installations.

BTW, Jmol 12 is also available as Debian package now!

Thursday, April 21, 2011

ChEMBL 09 as RDF

Update: this work is now written down in this paper. I'm having a really bad month, as you can see from the number of posts. Too much to do, too little time. One of the things I have been doing in the past weeks is updating the RDF for ChEMBL, now up to version 09. The SPARQL end point has not been updated yet (it is still at ChEMBL 04), but you can now download the triples for self-hosting here. Like the database itself, the RDF is available under the CC-BY-SA license, requiring attribution to both the ChEMBL team and our efforts to create the RDF (see this README).

Thursday, April 14, 2011

How to create a proportional Venn diagram?


Above is a typical Venn diagram. But I like mine so that the area of each circle reflects the number of objects in that circle (or the logarithm thereof), and the area of each overlap is proportional to the number of objects in that overlap. Is there an 'arp' for that? (I know that "there is an app for this" is trademarked... 'arp', instead, is short for R package.)
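
To make the requirement concrete, here is a minimal sketch of the geometry for the two-circle case only, written in Python rather than as an R package. It assumes one unit of area per object. The function names `lens_area` and `proportional_venn` are made up for this illustration and do not come from any existing library. The overlap of two circles is a lens whose area shrinks as the centres move apart, so the right centre distance can be found by bisection.

```python
import math


def lens_area(r1, r2, d):
    """Area of the intersection of two circles whose centres are d apart."""
    if d >= r1 + r2:
        return 0.0  # circles do not overlap
    if d <= abs(r1 - r2):
        return math.pi * min(r1, r2) ** 2  # smaller circle lies inside the larger
    # Clamp the cosine arguments to guard against rounding just outside [-1, 1].
    cos1 = (d * d + r1 * r1 - r2 * r2) / (2 * d * r1)
    cos2 = (d * d + r2 * r2 - r1 * r1) / (2 * d * r2)
    a1 = math.acos(max(-1.0, min(1.0, cos1)))
    a2 = math.acos(max(-1.0, min(1.0, cos2)))
    kite = 0.5 * math.sqrt(
        max(0.0, (-d + r1 + r2) * (d + r1 - r2) * (d - r1 + r2) * (d + r1 + r2))
    )
    return r1 * r1 * a1 + r2 * r2 * a2 - kite


def proportional_venn(n_a, n_b, n_both, tol=1e-9):
    """Return (radius_a, radius_b, centre_distance) for a two-set diagram.

    Assumption: one unit of area per object, so circle A has area n_a,
    circle B has area n_b, and the overlap has area n_both.
    """
    if not 0 <= n_both <= min(n_a, n_b):
        raise ValueError("overlap must lie between 0 and the smaller set size")
    r_a = math.sqrt(n_a / math.pi)
    r_b = math.sqrt(n_b / math.pi)
    # The overlap decreases monotonically with distance, so bisect on the distance.
    lo, hi = abs(r_a - r_b), r_a + r_b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if lens_area(r_a, r_b, mid) > n_both:
            lo = mid  # too much overlap: move the centres apart
        else:
            hi = mid  # too little overlap: move the centres together
    return r_a, r_b, (lo + hi) / 2


r_a, r_b, d = proportional_venn(100, 50, 20)
print(r_a, r_b, d)
```

For the logarithmic variant, pass the logarithms of the counts instead of the counts themselves. Three circles are harder, because the three pairwise overlaps and the triple overlap cannot in general all be matched exactly with circles.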

Posted via email from Egon's posterous