|Visualization of functional groups. |
Public domain from Wikipedia.
Functional Group Ontology
Sankar et al. have developed an ontology for functional groups (doi:10.1016/j.jmgm.2013.04.003). One popular thought is that subgroups of atoms are more important than the molecule as a whole. Much of our cheminformatics is based on this idea. And it matches what we experimentally observe. If we add a hydroxyl or an acid group, the molecule becomes more hydrophylic. Semantically encoding this clearly important information seems important, though intuitively I would have left this to the cheminformatics tools. This paper and a few cited papers, however, show far you can take this. It organizes more than 200 functional groups, but I am not sure where the ontology can be downloaded.
Sankar, P., Krief, A., Vijayasarathi, D., Jun. 2013. A conceptual basis to encode and detect organic functional groups in XML. Journal of Molecular Graphics and Modelling 43, 1-10. URL http://dx.doi.org/10.1016/j.jmgm.2013.04.003
Linking biological to chemical similarities
If we step aside from our concept of "functional group", we can also just look at whatever is similar between molecules. Michael Kuhn et al. (of STITCH and SIDER) looked into the role of individual proteins in side effect (doi:10.1038/msb.2013.10). They find that many drug side effects are mediated by a selection of individual proteins. The study uses a drug-target interaction data set, and to reduce the change of bias due to some compound classes more extensively studies (more data), they removed too similar compounds from the data set, using the CDK's Tanimoto stack.
Kuhn, M., Al Banchaabouchi, M., Campillos, M., Jensen, L. J., Gross, C., Gavin, A. C., Bork, P., Apr. 2014. Systematic identification of proteins that elicit drug side effects. Molecular Systems Biology 9 (1), 663. URL http://dx.doi.org/10.1038/msb.2013.10Drug-induced liver injury
These approaches can also be used to study if there are structural reasons why Drug-induced liver injury (DILI) occurs. This was studied in this paper Zhu et al. where the CDK is used to calculate topological descriptors (doi:10.1002/jat.2879). They compared explanatory models that correlate descriptors with the measured endpoint and a combination with hepatocyte imaging assay technology (HIAT) descriptors. These descriptors capture phenotypes such as nuclei count, nuclei area, intensities of reactive oxygen species intensity, tetramethyl rhodamine methyl ester, lipid intensity, and glutathione. It doesn't cite any of the CDK papers, so I left a comment with PubMed Commons.
Zhu, X.-W., Sedykh, A., Liu, S.-S., Mar. 2014. Hybrid in silico models for drug-induced liver injury using chemical descriptors and in vitro cell-imaging information. Journal of Applied Toxicology 34 (3), 281-288. URL http://dx.doi.org/10.1002/jat.2879