At the moment, I am focusing on two issues:
- QSAR descriptors that change the input (causing other descriptors to randomly fail)
- Cloning of IChemObject (for which last week a rather serious bug was found)
So, I started a new module called diff. If two objects are identical, it returns a zero-length String. If not, it lists the differences between the two objects, in a way much like that of the IChemObject toString() methods.
For example, consider this bit of code:
IChemObject atom1 = new ChemObject();
IChemObject atom2 = new ChemObject();
atom2.setFlag(CDKConstants.ISAROMATIC, true);
String result = ChemObjectDiff.diff( atom1, atom2 );
The result value then looks like:
ChemObjectDiff(, flag5:F/T)
Now, the output will likely change a bit over time. But at least I now have an easier-to-use approach for debugging and writing unit tests. Don't be surprised to see test-* modules start depending on the new diff module.
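To illustrate the idea of an empty String for identical objects and an old/new listing otherwise, here is a minimal sketch in plain Java. It is not the code of the diff module: the class FlagDiffSketch, its output format, and the use of bare boolean arrays in place of IChemObject flags are all assumptions made for illustration.

```java
import java.util.StringJoiner;

// Hypothetical stand-in for the diff idea; this is NOT the CDK's ChemObjectDiff.
final class FlagDiffSketch {

    // Returns "" when both flag arrays are identical; otherwise lists each
    // differing flag as flagN:old/new, with F for false and T for true.
    static String diff(boolean[] first, boolean[] second) {
        if (first.length != second.length) {
            throw new IllegalArgumentException("flag arrays differ in length");
        }
        StringJoiner changes = new StringJoiner(", ", "FlagDiffSketch(", ")");
        changes.setEmptyValue(""); // no differences: zero-length String
        for (int i = 0; i < first.length; i++) {
            if (first[i] != second[i]) {
                changes.add("flag" + i + ":" + letter(first[i]) + "/" + letter(second[i]));
            }
        }
        return changes.toString();
    }

    private static String letter(boolean value) {
        return value ? "T" : "F";
    }

    public static void main(String[] args) {
        boolean[] flags1 = new boolean[10];
        boolean[] flags2 = new boolean[10];
        flags2[5] = true; // index 5 is an arbitrary choice for this sketch
        System.out.println(diff(flags1, flags1.clone()).isEmpty()); // prints true
        System.out.println(diff(flags1, flags2)); // prints FlagDiffSketch(flag5:F/T)
    }
}
```

In a unit test, the same pattern could check that a method leaves its input untouched: keep a copy of the input, run the method, and assert that the diff between copy and input is a zero-length String.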
Egon, great idea. How does diff take into account the differences in atom numbering between two identical molecules?
For example, if I have one toluene represented with the methyl group on carbon 0 and one with the methyl on carbon 1, do I get a zero-length string?
Rich, it does not do that at this moment. Actually, I have not implemented the diff for IMolecule at all yet.
The universal isomorphism tester could be used to find a possible atom mapping, but one would already run into trouble when the molecule has symmetry.
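The symmetry problem can be made concrete with a small sketch. It does not use the universal isomorphism tester; it is a brute-force count, in plain Java, of the atom mappings that preserve connectivity for a heavy-atom graph. The class MappingCountSketch and the adjacency-matrix representation (which ignores elements and bond orders) are assumptions made for illustration.

```java
// Hypothetical brute-force sketch; it does not call any CDK class.
final class MappingCountSketch {

    // Counts the atom-to-atom mappings of a graph onto itself that keep
    // every bond a bond and every non-bond a non-bond.
    static int countMappings(boolean[][] adj) {
        return extend(adj, new int[adj.length], new boolean[adj.length], 0);
    }

    private static int extend(boolean[][] adj, int[] map, boolean[] used, int pos) {
        if (pos == adj.length) {
            return 1; // every atom has been assigned consistently
        }
        int count = 0;
        for (int target = 0; target < adj.length; target++) {
            if (used[target]) continue;
            boolean consistent = true;
            for (int prev = 0; prev < pos && consistent; prev++) {
                consistent = adj[pos][prev] == adj[target][map[prev]];
            }
            if (!consistent) continue;
            used[target] = true;
            map[pos] = target;
            count += extend(adj, map, used, pos + 1);
            used[target] = false;
        }
        return count;
    }

    private static void bond(boolean[][] adj, int a, int b) {
        adj[a][b] = true;
        adj[b][a] = true;
    }

    // Six-membered carbon ring, atoms 0..5.
    static boolean[][] benzene() {
        boolean[][] adj = new boolean[6][6];
        for (int i = 0; i < 6; i++) bond(adj, i, (i + 1) % 6);
        return adj;
    }

    // The same ring with a methyl carbon (atom 6) attached to atom 0.
    static boolean[][] toluene() {
        boolean[][] adj = new boolean[7][7];
        for (int i = 0; i < 6; i++) bond(adj, i, (i + 1) % 6);
        bond(adj, 0, 6);
        return adj;
    }

    public static void main(String[] args) {
        System.out.println(countMappings(toluene())); // prints 2
        System.out.println(countMappings(benzene())); // prints 12
    }
}
```

Even toluene has two equally valid mappings (the identity and the mirror image through the methyl axis), so a diff built on an atom mapping would have to pick one of them or treat them as equivalent.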