Friday, November 11, 2016

New paper: "SPLASH, a hashed identifier for mass spectra"

I'm excited to have contributed to this important (IMHO) interoperability paper around metabolomics data: "SPLASH, a hashed identifier for mass spectra" (doi:10.1038/nbt.3689, readcube:msZj). A huge thanks to all involved in this great collaborative project! The source code is fully open source and the project is coordinated by Gert Wohlgemuth, the lead author of this paper. It provides implementations of the algorithm in various programming languages, and I'm happy that the splash functionality is available in the just-released Bioclipse 2.6.2 (taking advantage of the Java library). An R package by Steffen Neumann is also available.

This new identifier greatly simplifies linking between spectral databases and will in the end contribute to a Linked Data network. Furthermore, journals can start adopting this identifier and list the 'splash' for mass spectra in their articles, allowing for simplified dereplication and making it easier to find additional information around spectra.
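To make the linking idea a bit more concrete, here is a minimal sketch in Python of how records from different databases could be grouped by their splash. It does not compute a SPLASH itself and does not use any of the project's libraries. The function names, the accessions, and the splash strings are all made-up placeholders for illustration; only the database names come from this post. Equal strings are treated here as a candidate match, nothing more.

```python
from collections import defaultdict


def group_by_splash(records):
    """Group (database, accession, splash) records by their splash string."""
    groups = defaultdict(list)
    for database, accession, splash in records:
        groups[splash].append((database, accession))
    return dict(groups)


def cross_database_matches(records):
    """Keep only splash strings that occur in more than one database."""
    return {
        splash: entries
        for splash, entries in group_by_splash(records).items()
        if len({database for database, _ in entries}) > 1
    }


# Hypothetical records: the accessions and splash strings are placeholders,
# not real identifiers from any of these databases.
records = [
    ("MassBank", "ACC-1", "splash-placeholder-A"),
    ("HMDB", "ACC-2", "splash-placeholder-A"),
    ("MetaboLights", "ACC-3", "splash-placeholder-B"),
]

matches = cross_database_matches(records)
for splash, entries in matches.items():
    print(splash, entries)
```

The point of the sketch is only that a shared identifier turns cross-database linking into a simple key lookup, instead of a pairwise comparison of peak lists.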

There are several databases that have adopted the SPLASH already, such as MassBank, HMDB, MetaboLights, and the OSDB published in JCheminf recently (doi:10.1186/s13321-016-0170-2).


Screenshot snippet of a spectrum in the OSDB.

PS. I personally don't like the idea of ReadCubes (which I may blog about at some point) and how they have been pitched as a "legal" way of sharing papers, but this journal does not have a gold Open Access option, unfortunately.

Wohlgemuth, G., Mehta, S. S., Mejia, R. F., Neumann, S., Pedrosa, D., Pluskal, T., Schymanski, E. L., Willighagen, E. L., Wilson, M., Wishart, D. S., Arita, M., Dorrestein, P. C., Bandeira, N., Wang, M., Schulze, T., Salek, R. M., Steinbeck, C., Nainala, V. C., Mistrik, R., Nishioka, T., Fiehn, O., Nov. 2016. SPLASH, a hashed identifier for mass spectra. Nature Biotechnology 34 (11), 1099-1101.
http://dx.doi.org/10.1038/nbt.3689