Monday, March 15, 2010

RDF-powered QSAR wizard: SPARQL end points providing wizard content

As you know from my blog, one of the things I am working on is to push RDF functionality in Bioclipse, as I believe it to be the missing link between molecular chemometrics and literature, databases, and other non-numerical information sources.

As part of the submission for the SWAT4LS special issue in the new Journal of Biomedical Semantics, Ola hacked up a cool wizard that sets up a new QSAR Project by downloading data directly from our RDF node for the chEMBL data using SPARQL. The paper is based on the SWAT4LS talk I gave, and the proceedings paper that recently appeared. But with more cool stuff, such as this cool RDF graph browser that allows you to open up molecules from the RDF graph in a JChemPaint editor.

Well, this really nice New QSAR Project wizard was cool enough to trigger a I-want-more reaction, so I just had to hack it up with some additional SPARQL functionality. So, the next version does not only use RDF and SPARQL to aggregate the QSAR data set, it also uses SPARQL to make the wizard interactive. While the user is typing a target ID, the wizard will check the SPARQL end point in the background and download the target's type, title and organism, as well as update the list of activities the user can select depending on what the chEMBL database has for that target:

The actual code base is pretty small, and that's what happens when you mash up the right technologies :)