Friday, May 26, 2006

Molecular indexing on the KDE and OS/X desktops

Geoff Hutchinson blogged about his OS/X ChemSpotLight, an indexing tool for chemistry documents. It is similar to, but more advanced than, the kfile_chemical and Kat tools I have been working on (with others) for the KDE desktop (see earlier blog items).

ChemSpotLight currently does more than the KDE tools: it adds Spotlight comments. I assume these are similar to the Linux extended attributes, which are used by Beagle, among others. A file indexed by Beagle will have extended attributes like:
# file: home/egonw/m43.jpg
user.Beagle.AttrTime="20060509071950"
user.Beagle.Filter="003 Beagle.Filters.FilterJpeg"
user.Beagle.Fingerprint="02 xHn5Yi58x0eoI8ityBYkUw"
user.Beagle.MTime="20031225151016"
user.Beagle.Uid="YcIW72RWyk+K5FbGnpv4iA"
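
As a rough illustration of how another tool could consume such a listing, the Python sketch below pulls the name/value pairs out of text in this format. The function name parse_xattr_dump is made up for this example, and the regular expression assumes that every attribute lives in the "user." namespace and that values are double-quoted without embedded quotes, as in the listing above.

```python
import re

# Assumption: every attribute looks like name="value", with the name in the
# "user." namespace and no double quotes inside the value.
ATTR_PATTERN = re.compile(r'(user\.[A-Za-z0-9_.]+)="([^"]*)"')


def parse_xattr_dump(text):
    """Return a dict mapping attribute names to the values found in a dump."""
    return dict(ATTR_PATTERN.findall(text))


if __name__ == "__main__":
    dump = (
        '# file: home/egonw/m43.jpg\n'
        'user.Beagle.AttrTime="20060509071950"\n'
        'user.Beagle.Filter="003 Beagle.Filters.FilterJpeg"\n'
    )
    for name, value in parse_xattr_dump(dump).items():
        print(name, "=", value)
```

Because the pattern only looks for name="value" pairs, the "# file:" header line is skipped without special handling.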

Extended attributes are very suitable for storing metadata, such as the comments in ChemSpotLight. Geoff's program adds metadata like the number of atoms and bonds, but it also calculates the SMILES and InChI on the fly. The InChI in particular is very good for indexing purposes, as it is a unique identifier for molecular structures, and even works for proteins.
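
To make the "number of atoms and bonds" kind of metadata concrete, here is a minimal Python sketch that reads both numbers from a chemistry file. It assumes the file is an MDL molfile in the V2000 layout, a format this post does not name, where the fourth line (the counts line) holds the atom count in columns 1-3 and the bond count in columns 4-6. The function name count_atoms_and_bonds is invented for this example, and this is not how ChemSpotLight itself does it.

```python
def count_atoms_and_bonds(molfile_text):
    """Read the atom and bond counts from the counts line of a V2000 molfile."""
    lines = molfile_text.splitlines()
    if len(lines) < 4:
        raise ValueError("a molfile needs three header lines and a counts line")
    counts_line = lines[3]
    # Fixed-width fields: columns 1-3 hold the number of atoms,
    # columns 4-6 the number of bonds.
    return int(counts_line[0:3]), int(counts_line[3:6])


if __name__ == "__main__":
    # A hand-written water molecule, used only as sample input.
    water = "\n".join([
        "water",
        "  hand-written example",
        "",
        "  3  2  0  0  0  0  0  0  0  0999 V2000",
        "    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
        "    0.9572    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
        "   -0.2400    0.9266    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
        "  1  2  1  0",
        "  1  3  1  0",
        "M  END",
    ])
    atoms, bonds = count_atoms_and_bonds(water)
    print("atoms:", atoms, "bonds:", bonds)
```

Counts like these can be read straight from the file, whereas SMILES and InChI have to be computed from the structure by a cheminformatics toolkit.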

Now, kfile_chemical is a kfile plugin. These kfile plugins only extract metadata from files, and have little to do with calculated metadata. Kat, on the other hand, is an indexing application and might be expected to add additional, derived or calculated, metadata as extended attributes, just like Beagle does. InChI and SMILES would be good candidates for that.
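
What an indexer storing such derived metadata might look like can be sketched in a few lines of Python. Everything here is an assumption for illustration: the attribute names user.Kat.InChI and user.Kat.SMILES are hypothetical and not something Kat actually writes, and the InChI and SMILES strings are expected to come from a cheminformatics toolkit that is not shown. The only real API used is os.setxattr, which Python offers on Linux only.

```python
import os


def derived_attributes(inchi, smiles, namespace="user.Kat"):
    """Build hypothetical extended attribute names and byte values."""
    return {
        f"{namespace}.InChI": inchi.encode("utf-8"),
        f"{namespace}.SMILES": smiles.encode("utf-8"),
    }


def write_attributes(path, attributes):
    """Try to store the attributes on a file; return True on success."""
    if not hasattr(os, "setxattr"):  # os.setxattr only exists on Linux
        return False
    try:
        for name, value in attributes.items():
            os.setxattr(path, name, value)
    except OSError:
        # The file may be missing, or the file system may not support
        # user extended attributes.
        return False
    return True


if __name__ == "__main__":
    # Ethanol; both strings would normally be calculated, not typed in.
    attrs = derived_attributes("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3", "CCO")
    for name, value in attrs.items():
        print(name, "=", value.decode("utf-8"))
```

Keeping the attribute construction separate from the writing step means the same dictionary could also be fed to a different store, such as a search index, on systems without extended attributes.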