Tuesday, March 21, 2017

OpenAPI to bio.tools: the Ensembl example

Many months ago I joined a bio.tools (doi:10.1093/nar/gkv1116) workshop in Amsterdam, organized by Gert Vriend et al. (see this coverage). There I learned how to register and search for services, and that the API underneath exchanges information about the services as JSON. One neat feature is that bio.tools lets you specify the service calls in a lot of detail.

Now, at the time we had already been using OpenAPI (then still called Swagger) for Open PHACTS for some time, and we later picked it up for other projects, like eNanoMapper (API), WikiPathways (API), and BridgeDb (API). OpenAPI configuration files also describe how web services work. So, the idea arose that it should be possible to convert an OpenAPI description into a bio.tools one. Simple. I started a GitHub repository, but, of course, did not really have time to implement it.

Then, half a year ago, at the ELIXIR track meeting at the ECCB in The Hague (where I presented this BridgeDb poster), I spoke with people from ELIXIR-DK who were just starting a studentship scheme. This led to a project idea, then a proposal, and then a small, approved project, allowing me to fund Jonathan Mélius to work on this part-time, for about a man-month of work, spread over several months.

Jonathan has been doing great work, and because we wanted to demo the OpenAPI-to-bio.tools bridge with a major European resource, Ensembl was suggested (which just published a paper on their core software). An OpenAPI description for Ensembl was set up, which is going to be the primary input for the new tool:


The next step was to take the JSON defining the content of this page (you can find the URL to the JSON file at the top of that page, hosted on GitHub too), and convert that to bio.tools fragments. This test entry in bio.tools shows that the approach works:
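To give an idea of what reading such a JSON file involves, here is a minimal Python sketch that lists the API calls found in an OpenAPI 2 (Swagger) document. It is not the code of our conversion tool; the `list_operations` helper and the inline `SAMPLE` document are made up for illustration, and only the `paths`, method, and `summary` keys come from the OpenAPI 2 specification.

```python
import json

# HTTP methods that OpenAPI 2 allows as keys inside a path item.
HTTP_METHODS = {"get", "post", "put", "delete", "patch", "head", "options"}


def list_operations(openapi):
    """Yield (METHOD, path, summary) for each operation in an OpenAPI 2 document."""
    for path, item in sorted(openapi.get("paths", {}).items()):
        for method, operation in item.items():
            # A path item may also hold non-method keys such as "parameters".
            if method.lower() in HTTP_METHODS:
                yield method.upper(), path, operation.get("summary", "")


# Hypothetical, heavily trimmed OpenAPI 2 document (not the real Ensembl file).
SAMPLE = json.loads("""
{
  "swagger": "2.0",
  "info": {"title": "Example REST API", "version": "1.0"},
  "paths": {
    "/sequence/id/{id}": {
      "parameters": [{"name": "id", "in": "path", "required": true, "type": "string"}],
      "get": {"summary": "Fetch a sequence by identifier"}
    }
  }
}
""")

operations = list(list_operations(SAMPLE))
for method, path, summary in operations:
    print(method, path, "-", summary)
```

Each call found this way is a candidate for one function fragment in the bio.tools entry.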


The observant eye will see that various details in the descriptions of the API calls are annotated with EDAM ontology (doi:10.1093/bioinformatics/btt113) terms, a key feature of bio.tools. This information is currently not available in the OpenAPI JSON (we will be exploring how that specification could or should be extended to carry it). Moreover, the web service API methods need ontological annotation in the first place, and we will not be able to remove human involvement there entirely.

The EDAM IRIs are still hard-coded in the conversion tool at this moment, but are being factored out into a secondary JSON file for now. So, the conversion tool will take two input JSON files, OpenAPI + EDAM annotation, and create bio.tools JSON output. The latter can then be inserted into bio.tools. We will work on something based on the bio.tools API to automate that step too.
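The two-input design can be sketched in a few lines of Python. This is only an illustration under assumptions, not the actual tool: the layout of the annotation file (keyed by "METHOD path"), the `convert` function, and the output field names are all hypothetical and simplified, and do not claim to match the real bio.tools schema. The two EDAM IRIs are just example values.

```python
def convert(openapi, edam):
    """Combine an OpenAPI 2 document with EDAM annotations.

    Returns (entry, missing): a simplified bio.tools-like entry, and the
    list of calls that have no annotation yet and need a human curator.
    """
    functions, missing = [], []
    for path, item in sorted(openapi.get("paths", {}).items()):
        for method, operation in item.items():
            if method not in ("get", "post"):
                continue  # this sketch only handles GET and POST calls
            key = f"{method.upper()} {path}"
            annotation = edam.get(key)
            if annotation is None:
                missing.append(key)
                continue
            functions.append({
                "note": operation.get("summary", ""),
                "operation": [{"uri": iri} for iri in annotation.get("operation", [])],
                "output": [{"data": {"uri": iri}} for iri in annotation.get("output", [])],
            })
    scheme = openapi.get("schemes", ["https"])[0]
    entry = {
        "name": openapi["info"]["title"],
        "description": openapi["info"].get("description", ""),
        "homepage": f"{scheme}://{openapi['host']}{openapi.get('basePath', '')}",
        "function": functions,
    }
    return entry, missing


# Hypothetical inputs; in practice both would be loaded from JSON files.
openapi_doc = {
    "swagger": "2.0",
    "info": {"title": "Example REST API", "description": "Hypothetical service"},
    "host": "rest.example.org",
    "schemes": ["https"],
    "paths": {
        "/sequence/id/{id}": {"get": {"summary": "Fetch a sequence by identifier"}},
        "/info/ping": {"get": {"summary": "Check that the service is alive"}},
    },
}
edam_annotations = {
    "GET /sequence/id/{id}": {
        "operation": ["http://edamontology.org/operation_2422"],
        "output": ["http://edamontology.org/data_2044"],
    }
}

entry, missing = convert(openapi_doc, edam_annotations)
```

Keeping the annotations in their own file means the OpenAPI document can be regenerated without losing the curated EDAM terms, and the `missing` list shows where human annotation is still needed.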

So, we still have some work to do, but I'm happy with the current progress. We're well on track to complete this project before summer and actually get a long way with the ontology annotation, which was a secondary goal in the original plan.

Feedback welcome!