Friday, May 15, 2009

ChemSpider and the RSC: where next?

Last Monday the CHMINF-L list brought me the news that ChemSpider was acquired by the RSC (not the press release). Twitter (my Twitter post) and FriendFeed (see this series) followed.

Reading blogs used to be the way to get the news, but this has changed. Still, blogging gives more freedom and more space, and blogs did soon follow. Chris was the first to blog about it:
    This is great news and I’m confident that it will be a move to even more openness in chemistry and cheminformatics. It will also allow the RSC to use Tony's fantastic tools for even more semantic markup of articles. I’m looking forward to talking to everyone about the implications. For now, congratulations, Tony, and congratulations, RSC, for this great deal.
I think Tony himself was next:
    This is good for us for a number of reasons. Specifically we will no longer have to deal with our very significant resource limitations but more than that it lends credence and validation to the work that we have been doing over the past 2 years. It seems so long ago now but ChemSpider was first unveiled to the world at the ACS Spring meeting 2007. What began then only as a hobby project is now being recognized by the community as one of the primary resources for internet chemistry.
His network and his insight into the data curation required are, I think, what made ChemSpider a success.

Later, views followed from Peter, Rich and Neil. I have only congratulations to add, and I expect that only the future will tell us whether our cheers are justified.

Where next?
As Tony indicated, in practice the deal will mean better support for ChemSpider in terms of computing power, making it easier for them to make upgrades, hence better uptime, etc. It may, indeed, also mean more data, provided from RSC archives, as suggested by Neil. More practically, I can imagine seeing Project Prospect contributing InChI-DOI links to ChemSpider very soon.

And this would be one of the two recommendations I have for ChemSpider at this moment:

1. now linked to a publisher, and with both text mining efforts and expertise, focus on these InChI-DOI links, and, in particular, focus on those InChI-DOI links which involve papers that describe measured properties of the molecules;

2. with the increased support, finish the Open Data work already done, by making it easy for people to download the ChemSpider-OpenData subset. This, I believe, is crucial for wider adoption in the OpenData community, as OpenData that is practically impossible to download is not Open enough. Previous priorities may have been focused on setting up a viable commercial alternative, but with the RSC backing, this can no longer be a reason not to do it.
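
To make the first recommendation a bit more concrete, here is a minimal sketch of what an InChI-DOI link carrying a measured property could look like. This is not how ChemSpider or Project Prospect store anything: the class name `MeasuredProperty`, the helper functions, the DOIs and the numeric values are all hypothetical placeholders, chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MeasuredProperty:
    """One measured value for one compound, with the paper that reports it."""
    inchi: str    # compound identifier
    name: str     # property name, e.g. "boiling point"
    value: float
    unit: str
    doi: str      # citation for this specific value


def index_by_inchi(records):
    """Group records by InChI so every compound lists its cited values."""
    index = defaultdict(list)
    for record in records:
        index[record.inchi].append(record)
    return dict(index)


def dois_for(index, inchi):
    """Return the sorted, de-duplicated DOIs linked to one compound."""
    return sorted({record.doi for record in index.get(inchi, [])})


METHANE = "InChI=1S/CH4/h1H4"

# Placeholder records: the DOIs below are made up and the values are illustrative.
records = [
    MeasuredProperty(METHANE, "boiling point", 111.7, "K", "10.1234/placeholder-a"),
    MeasuredProperty(METHANE, "melting point", 90.7, "K", "10.1234/placeholder-b"),
]

index = index_by_inchi(records)
for record in index[METHANE]:
    print(f"{record.name}: {record.value} {record.unit} (doi:{record.doi})")
```

The point of the sketch is only that the DOI sits on each value, not on the compound as a whole, so an aggregated property list never loses track of where a number came from.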

Once more, congratulations to the ChemSpider team and the RSC people involved. I am very much looking forward to seeing how this will change chemistry for the better!


  1. Thanks for the post and good wishes Egon, guess you want more like this one then?


  2. Rich, not entirely sure what part of that page you are referring to, but I see:

    1. InChI-DOI links
    2. some experimental data, such as melting point

    But what I would like instead is the boiling point, together with the DOI of the paper in which that boiling point was given.

    There are no technical reasons anymore why an aggregation of experimental properties should not have the proper citation for each entry. We no longer need the Handbook of Physics... though I have to admit, I forgot to do this too for the Blue Obelisk Data Repository...

  3. Ah OK Egon, I understand now - it was the DOI links together with the data, but I get what you mean (and will add to the list). Thanks

  4. Egon, thanks for the support and the positive response. I did have some concern that we would be seen as "selling out" but the choice of the RSC as the home for ChemSpider, myself and a developing team best matches our mission and intention. We clearly have a lot of work to do in the near future: migrating the ChemSpider data and code onto RSC servers, working on a re-release of ChemSpider later in the year and working on a long list of requested features (and yes, there are known bugs too). Part of that will be to increase the links between structures and DOIs/articles. The ability for anyone to link compounds via DOIs and PubChemID is already there for people to use. We will have access to a rich archive of information as well as new materials as they are released. Watch this space...