Saturday, December 21, 2013

John May released CDK 1.5.4

Short post, but with big implications. John released CDK 1.5.4, which mostly consists of his great work on various important corners of the CDK. Make sure to read his blog. Some highlights:

Everyone who is still using a version from the CDK 1.4 series should really consider switching over (just read the 1.5.2, 1.5.3, and 1.5.4 changelogs for the reasons why). It will make your product faster and more functional.

My contribution to this release mostly consists of the new Isotopes work. As John nicely summarized:

Monday, December 09, 2013

Programming in the Life Sciences #16: Open PHACTS LDA usage

Hot on the heels of the announcement that the Open PHACTS LDA was hit over 8M times, here are some usage statistics of the Programming in the Life Sciences course. Because the course basically consists of six practical days (with expected reading at home), we see a spiked pattern:

You can always hope that students continue their work at home, and some do:

And some students wait until the very end to finish their work:

Thanks to Paul Groth for the suggestion. It's not Open, I think, but functionality like that is very useful indeed.

Sunday, December 08, 2013

Programming in the Life Sciences #15: Sixth project screenshot

After five earlier screenshots, here follows a sixth one. That sums up all the projects I have received so far for the Programming in the Life Sciences course. The team of Sam and Oskar decided to use a tree layout to show the pathways in which a selected compound is found, and a subset of targets and compounds for that pathway. To deal with all the asynchronous aspects, they first aggregate all data (from four different API calls), showing the progress and the amount of data found:

And once that data has been collected, the user can create the tree, which will then show something like this for citric acid:
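The aggregate-first approach can be illustrated with a minimal sketch. It is written in Python rather than the JavaScript of the student project, and it is not their code: the function fake_api_call, the call names, and the returned items are all hypothetical stand-ins for the four Open PHACTS API calls.

```python
import asyncio


async def fake_api_call(name, delay, items):
    # Hypothetical stand-in for one API call; a real app would do an
    # HTTP request here instead of sleeping.
    await asyncio.sleep(delay)
    return name, items


async def aggregate(calls, on_progress):
    """Run all calls concurrently and report progress as each one finishes."""
    results = {}
    total = len(calls)
    for done, next_finished in enumerate(asyncio.as_completed(calls), start=1):
        name, items = await next_finished
        results[name] = items
        # Amount of data found so far, summed over all finished calls.
        found = sum(len(values) for values in results.values())
        on_progress(done, total, found)
    return results


def run_demo():
    # Collect (calls finished, calls total, items found) after each call.
    progress = []
    calls = [
        fake_api_call("compound", 0.01, ["compound-1"]),
        fake_api_call("pathways", 0.02, ["pathway-1", "pathway-2"]),
        fake_api_call("targets", 0.03, ["target-1", "target-2", "target-3"]),
        fake_api_call("compounds", 0.04, ["compound-2"]),
    ]
    results = asyncio.run(
        aggregate(calls, lambda done, total, found: progress.append((done, total, found)))
    )
    return results, progress
```

Only after aggregate has returned is all data available, which is the point where the tree can safely be built.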

Saturday, December 07, 2013

SPARQL endpoint uptime

Ever since I upgraded to Virtuoso OS 7.0, I have had trouble keeping the SPARQL endpoint at Uppsala University for the ChEMBL-RDF v13 data online (see doi:10.1186/1758-2946-5-23; ChEMBL is CC-BY-SA by the Overington team). It seems I am not getting all the settings right, or not as right as for VOS6, which ran for more than two years without the same downtime issues.

Pierre-Yves Vandenbussche operated a cool uptime service that monitored a SPARQL endpoint at set times and reported on this via a webpage and also an RSS feed. This project has now found a new home at the Open Knowledge Foundation, with renewed development by Pierre-Yves:

It looks great, and I am looking forward to a reinstated RSS feature in a later version. Besides uptime, the new service also reports on SPARQL 1.0 and 1.1 support, though I am not sure how the testing works, because I cannot imagine that VOS7 does not support the full SPARQL 1.1. This is the report for my SPARQL endpoint for ChEMBL-RDF:
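For readers who want a rough idea of what an uptime probe can look like, here is a minimal sketch in Python. It is an assumption on my side and not how the monitoring service actually tests endpoints: it sends a trivial ASK query as an HTTP GET with a query parameter, as the SPARQL Protocol allows, and treats an HTTP 200 response as "up". The function names and the example.org URL used for illustration are hypothetical.

```python
import time
import urllib.parse
import urllib.request

# A trivial query that any non-empty endpoint should be able to answer.
ASK_QUERY = "ASK { ?s ?p ?o }"


def build_probe_url(endpoint_url, query=ASK_QUERY):
    # The SPARQL Protocol accepts the query as a URL-encoded "query" parameter.
    return endpoint_url + "?" + urllib.parse.urlencode({"query": query})


def http_status(url, timeout=10.0):
    # Default fetcher: perform the GET and return the HTTP status code.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def check_endpoint(endpoint_url, fetch_status=http_status):
    """Return whether the endpoint answered with HTTP 200, and how long it took."""
    started = time.monotonic()
    try:
        up = fetch_status(build_probe_url(endpoint_url)) == 200
    except OSError:
        # Covers connection failures, timeouts, and HTTP error responses.
        up = False
    return {"up": up, "seconds": time.monotonic() - started}
```

The fetch_status parameter is there so the check can be tried without network access, for example with check_endpoint("http://example.org/sparql", fetch_status=lambda url: 200).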

Programming in the Life Sciences #14: Two more projects

Two more projects were handed in for Programming in the Life Sciences late last night (see also the first three). Roberto developed a web page where you can enter a search term, after which it searches for targets based on that term and counts the number of pharmacological data points. When you select a target, it summarizes the IC50 values, pChEMBL values, and molecular weights, like in this screenshot:

Anniek and Darja had a really interesting idea: start with Alzheimer and find possible drug targets. With the Open PHACTS 1.3 API that should be possible with, e.g., the Alzheimer pathway in WikiPathways. However, while they got Ensembl identifiers for the targets in the pathway, after struggling for half a day, they could not find any pharmacology data. It turned out that the mappings between the Ensembl and UniProt IDs were not to be found in the 1.3 cache (which is still the case). So, in the end, I created JSON identifier mapping data in which they could look up the mappings. They ended up visualizing the results in an HTML table, where the target names were dynamically added to the table's "Coding Protein" column, addressing the asynchronous calling of the Open PHACTS APIs:
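The lookup in such JSON mapping data can be sketched as follows. This is a minimal Python illustration under assumptions, not the actual file or the students' JavaScript: the structure (an object from Ensembl identifier to a list of UniProt identifiers) is a guess, and all identifiers below are made-up placeholders.

```python
import json

# Hypothetical mapping data; the real file and its structure may differ,
# and the identifiers are placeholders, not real Ensembl or UniProt IDs.
MAPPING_JSON = """
{
  "ENSEMBL_EXAMPLE_1": ["UNIPROT_EXAMPLE_A"],
  "ENSEMBL_EXAMPLE_2": ["UNIPROT_EXAMPLE_B", "UNIPROT_EXAMPLE_C"]
}
"""


def load_mapping(text):
    # Parse the JSON text into a dict of Ensembl ID -> list of UniProt IDs.
    return json.loads(text)


def map_identifiers(ensembl_ids, mapping):
    """Split the input into mapped identifiers and those without a mapping."""
    mapped = {}
    unmapped = []
    for ensembl_id in ensembl_ids:
        uniprot_ids = mapping.get(ensembl_id)
        if uniprot_ids:
            mapped[ensembl_id] = list(uniprot_ids)
        else:
            # Keep track of misses, so missing mappings do not go unnoticed.
            unmapped.append(ensembl_id)
    return mapped, unmapped
```

Returning the unmapped identifiers separately makes the kind of gap described above visible right away, instead of only showing up later as missing pharmacology data.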

Friday, December 06, 2013

Programming in the Life Sciences #13: Another screenshot

I got one more source code zip file from the Maastricht Science Programme students (see also the first two screenshots). Vincent and Błażej extended the d3.js tree view, showing classification information from ChEBI (they also submitted three patches to the Open PHACTS ops.js):

Programming in the Life Sciences #12: First screenshots

Yesterday was the last Programming in the Life Sciences practical day, and the 2nd and 3rd year B.Sc. MSC students presented their results in the afternoon. I am impressed with the results that they reached in only six practical days. I have suggested that they upload the presentations to SlideShare or FigShare (with the advantage that you get a DOI), and asked them to send me their tools. Below are some screenshots.

The first app is by Tim and Taís. It looks up activities from the Open PHACTS platform and filters them for activities related to a set of five anti-oxidants (see also their FigShare):

The next app is by Janneke and Lukas and uses the Open PHACTS API to report on single protein targets for the compound the user enters (see also their SlideShare):

More apps will follow soon.