Friday, July 20, 2007

OSRA: GPL-ed molecule drawing to SMILES converter

Igor wrote a message to the CCL mailing list about OSRA:
    We would like to announce a new addition to the set of chemoinformatics tools available from the Computer-Aided Drug Design Group at the NCI-Frederick. OSRA is a utility designed to convert graphical representations of chemical structures, such as they appear in journal articles, patent documents, textbooks, trade magazines etc., into SMILES.

    OSRA can read a document in any of the over 90 graphical formats parseable by ImageMagick (GIF, JPEG, PNG, TIFF, PDF, PS etc.) and generate the SMILES representation of the molecular structure images encountered within that document.

The email does not give any information on the failure rate, but the demo they provide via the web interface does show some minor glitches (the bromine is not recognized).

The source reuses OpenBabel and uses the GPL license. Its value is equal to that of text mining tools like OSCAR3, and together they sound like the Jordan and Pippen of mining chemical literature.
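
For anyone who wants to try this on their own images, here is a minimal sketch of how one might wrap such a tool from Python. It rests on assumptions the announcement does not spell out: that an executable named `osra` is on the PATH, that it takes an image file as its only argument, and that it prints one SMILES string per line. The function names `parse_smiles_lines` and `image_to_smiles` are hypothetical and are not part of OSRA.

```python
import shutil
import subprocess


def parse_smiles_lines(stdout_text):
    """Return the non-empty, stripped lines of the tool's output.

    Assumption: the tool prints one SMILES string per line.
    """
    return [line.strip() for line in stdout_text.splitlines() if line.strip()]


def image_to_smiles(image_path, executable="osra"):
    """Run the command-line tool on one image and collect the SMILES it prints.

    Assumption: the executable is called as `osra <image file>`.
    """
    if shutil.which(executable) is None:
        raise FileNotFoundError(f"{executable!r} was not found on PATH")
    result = subprocess.run(
        [executable, image_path],
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode output as text
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return parse_smiles_lines(result.stdout)
```

Calling `image_to_smiles("scheme1.png")` would then return a list of strings, one per recognized structure. Given glitches like the missed bromine, it is worth checking each result against the original drawing.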


  1. I posted about it yesterday, not knowing that you had already posted it. That's funny! I found it in my network and you via CCL ... so the social network seems to work ;-)

  2. Joerg, I am officially on holiday, but reading my email... so, missed the trigger...

    Interesting that you mention the CCL mailing list as a social network... to me, social networks were more like being able to socialize with accounts outside my main areas of interest, which CCL would be...

  3. I did some testing on this the day it was released, found a number of issues, and blogged about them here.

    However, as a first release it definitely has potential, and I am looking forward to helping them.