Wednesday, March 04, 2009

Open NMR data: raw curves and annotated peak lists

Games are known to trigger technical innovation. But recently one also triggered innovation in open chemical databases. Jean-Claude reported:
    We are very excited by what we have put together so far. There are currently 457 H NMR, 389 C NMR, 11 IR and 29 NIR spectra. This is only possible because of people who submitted their spectra to ChemSpider as Open Data - please keep uploading!
Now, the NMRShiftDB also hosts quite a number of NMR spectra, and I have a hobby of submitting spectra, particularly for rare nuclei. In particular, I think it is fun to have as many structures as possible which have spectra for all the nuclei in that structure. Benzene is a simple example for which NMR spectra are available for all nuclei (see this entry).

Now, the main difference between the NMRShiftDB and ChemSpider spectral data is that the former are annotated peak lists (each shift is assigned to an atom), and the latter are full, but unannotated, spectral curves. So, there are quite a few things you could do here. For example, see which structures with NMR curves in ChemSpider are not yet annotated in NMRShiftDB. Antony pointed me to this page, which is an overview of all spectral data in ChemSpider, but that page is difficult to process by machine. Partly because it is a mix of Open and Proprietary data, and partly because it uses JavaScript to navigate the table. (BTW, RDF interfaces to both resources would be much more helpful, and would simply allow me to query for all molecules which have a spectrum which is Open, and which is not found in the NMRShiftDB. I am working on an RDF interface to NMRShiftDB.)
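The query described above is essentially a set difference. A minimal Python sketch of it, assuming both resources could be exported as simple records keyed by InChI and nucleus; the `SpectrumRecord` type, its fields, and the `unannotated_open_spectra` function are hypothetical and not part of either database's actual interface, and the records in the example are made up for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpectrumRecord:
    """Hypothetical export record; not an actual ChemSpider or NMRShiftDB format."""
    inchi: str            # structure identifier used to match entries across databases
    nucleus: str          # e.g. "1H" or "13C"
    is_open: bool = True  # whether the spectrum is released as Open Data


def unannotated_open_spectra(curves, peak_lists):
    """Return Open spectral curves with no annotated peak list for the
    same structure and nucleus."""
    annotated = {(r.inchi, r.nucleus) for r in peak_lists}
    return [
        c for c in curves
        if c.is_open and (c.inchi, c.nucleus) not in annotated
    ]


# Made-up example records, for illustration only.
BENZENE = "InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H"
ETHANOL = "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3"

curves = [
    SpectrumRecord(BENZENE, "1H"),
    SpectrumRecord(ETHANOL, "1H"),
    SpectrumRecord(ETHANOL, "13C", is_open=False),  # proprietary, so skipped
]
peak_lists = [SpectrumRecord(BENZENE, "1H")]

todo = unannotated_open_spectra(curves, peak_lists)
for record in todo:
    print(record.nucleus, record.inchi)
```

With RDF interfaces on both sides, the same filter could be expressed as a single query instead of a local comparison of exported records.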

Antony also asked me to advertise the option to upload Open spectral curves to ChemSpider. So, hereby. However, I really do hope ChemSpider will make it easier for others to reuse all the Open Data, as having machines browse the linked HTML interface is a waste of ChemSpider computing resources.

Update: the game is now available from


  1. Let us know when the RDF is available for the NMRShiftDB and then we can figure out how to create mutual linking so you have spectral curves viewable on NMRShiftDB and we can have assignments from NMRShiftDB viewable in ChemSpider. Actually, would RDF suffice or would a simple web service whereby we could call assignments for a particular molecule do the job? That might be more efficient for us. Possible?

  2. I'll post it in my blog when I have the NMRShiftDB RDF online. Will take a few weeks, since I am heavily buried in Bioclipse and CDK releases, grant writing, and other stuff. But it's fun :)

    There is a WSDL for the NMRShiftDB SOAP services at:

  3. Tony, please see: