![]() |
CC-BY-SA from WikiMedia. |
Subset selection in QSAR modeling
We (intuitively) know that negative data is important for statistical pattern recognition and modelling. We also know that literature is not quite riddled with such data. This paper, however, studies the effect of using sets of inactive compounds in modelling and particularly the part about selecting which compounds should go into the training set. Like with the positive compounds, in the results of this paper too, the selection method matters. The CDK is used to calculate fingerprints.

Using fingerprints to create clustering trees
I need to read this paper by Palacios-Bejarano et al. in more detail, because it seems quite interesting. The use fingerprints to make clustering trees, which, if I understand it correctly, be used to calculate similarities between molecules. That is used in QSAR modeling of the CLogP, and the results suggest that while MCS works better, this approach is more robust. This paper too uses the CDK for fingerprint calculation.

No comments:
Post a Comment