Sunday, July 17, 2016

Use of the BridgeDb metabolite ID mapping database in PathVisio

A long time ago Martijn van Iersel wrote a PathVisio plugin that visualizes the 2D chemical structures of metabolites in pathways as found on WikiPathways. Some time ago I tried to update it to a more recent CDK version, but I did not have enough time then to get it going. However, John May's helpful DepictionGenerator made it a lot easier, so this morning I set out to update the code base to use this class and CDK 1.5.13 (well, strictly speaking it is running a prerelease (snapshot) of CDK 1.5.14). With success:

The released version is tweaked a bit further, so that the 2D structure diagram better fills the Structure tab. I have submitted the plugin to the PathVisio Plugin Repository.

Now, you may know that these GPML pathways only contain identifiers, and no chemical structures. But this is where the metabolite identifier mapping database helps (doi:10.6084/m9.figshare.3413668.v1): it contains SMILES strings for many of the compounds. It does not yet contain SMILES strings from Wikidata, but I will start adding those in upcoming releases. The current SMILES strings come from HMDB.
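To make that lookup step concrete, here is a minimal sketch in plain Java. It is not the plugin's code and does not use the BridgeDb API: the class name, the in-memory map, and the `lookupSmiles` helper are all hypothetical stand-ins for the mapping database, and the single entry (which I take to be uracil's HMDB identifier, under the `Ch` system code) is there for illustration only. It only shows the idea that a data node carries an identifier, and that the structure is an attribute looked up from that identifier.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: an in-memory map stands in for the metabolite
// identifier mapping database. None of these names come from BridgeDb.
class SmilesLookupSketch {

    // Key is "<system code>:<identifier>"; value is the SMILES attribute.
    // "Ch" is assumed here to be the system code for HMDB.
    private static final Map<String, String> SMILES_BY_XREF = new HashMap<>();
    static {
        // Illustrative entry: uracil, with a SMILES as HMDB might provide it.
        SMILES_BY_XREF.put("Ch:HMDB00300", "O=C1NC=CC(=O)N1");
    }

    // Returns the SMILES for an identifier, or empty if the database
    // has no structure for it (the pathway itself never stores one).
    static Optional<String> lookupSmiles(String systemCode, String identifier) {
        return Optional.ofNullable(SMILES_BY_XREF.get(systemCode + ":" + identifier));
    }

    public static void main(String[] args) {
        // The data node only knows its identifier ...
        String smiles = lookupSmiles("Ch", "HMDB00300")
                .orElse("(no structure available)");
        // ... and the looked-up SMILES is what a depiction step would consume.
        System.out.println(smiles);
    }
}
```

In the real plugin the resulting SMILES string would then be parsed and drawn with the CDK; that part is left out here so the sketch runs with nothing but the JDK.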

To show how all this works, check out the PathVisio screenshot below. The selected node in the pathway has the label uracil. The leftmost dialog in front was used to search the metabolite identifier mapping database, which found many hits in HMDB and Wikidata (middle dialog). The Wikidata identifier was chosen for the data node, allowing PathVisio to "interpret" the biological nature of that node in the pathway. Besides the many mapped identifiers (see the Backpage on the right), this also provides a SMILES string, which is used by the updated ChemPaint plugin.