Comments on chem-bla-ics: Speeding up the CDK: Morgan numbers

Egon Willighagen, 2011-08-22T10:17:09.908+02:00

IMolecule(/Set) is going to disappear from the master branch soon.

But that would be a bad place to differentiate between implementations. That belongs on the implementation side, not in the interfaces; at least not in this case, IMHO.

Anonymous, 2011-08-22T10:12:55.225+02:00

"interface" REALLY HELPS!

By the way, getConnectedAtoms() suffers from this lag too.

Yes, it would be nice to have two ways to deal with it, rather than IMolecule and IAtomContainer, which I still find confusing (the connectivity checker is not implemented)... Instead, they could support two different implementations: one optimised for graphs and the other for general use (basic) :-)

Egon Willighagen, 2011-08-22T06:12:30.094+02:00

And this is exactly why we have the interfaces and their implementations split apart, and why I am working towards making all CDK modules not depend on the data module. That way, we can have two implementations, one optimized for speed, one optimized for memory usage.

Now, I did play in the past with alternatives for getConnectedBonds() too, but never found a solution at the time that demonstrated clear performance boosts, though I may simply have been testing on the wrong data :)

Anonymous, 2011-08-21T17:46:35.080+02:00

atomContainer.getConnectedBonds() and atomContainer.getConnectedBondsCount() are bottlenecks in the atom container (I explored this long ago while developing SMSD).

That's one of the reasons I use https://github.com/asad/SMSD/blob/master/src/org/openscience/smsd/algorithm/vflib/builder/TargetProperties.java to speed up the lookup part.

Another option I tried was https://github.com/asad/SMSD/blob/master/src/org/openscience/smsd/tools/GraphAtomContainer.java, which is based on adjacency and hence has faster lookups.

The present AtomContainer is memory efficient, but an adjacency-based atom container has better speed.

Again, it all depends on the classic computer science question... memory vs speed...
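
The memory-versus-speed trade-off discussed in these comments can be illustrated with a minimal Java sketch: one interface, two interchangeable implementations. Everything here is hypothetical and for illustration only. `BondGraph`, `ScanningBondGraph` and `AdjacencyBondGraph` are made-up names, not part of the CDK or SMSD, and atoms are assumed to be plain non-negative integer indices rather than CDK atom objects.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical minimal interface; NOT the CDK IAtomContainer API. */
interface BondGraph {
    void addBond(int a, int b);

    /** Returns the bonds (as {a, b} index pairs) that touch the given atom. */
    List<int[]> connectedBonds(int atom);
}

/** Memory-lean variant: one flat bond list, scanned on every lookup (O(bonds)). */
final class ScanningBondGraph implements BondGraph {
    private final List<int[]> bonds = new ArrayList<>();

    @Override
    public void addBond(int a, int b) {
        bonds.add(new int[] {a, b});
    }

    @Override
    public List<int[]> connectedBonds(int atom) {
        List<int[]> result = new ArrayList<>();
        for (int[] bond : bonds) {
            if (bond[0] == atom || bond[1] == atom) {
                result.add(bond);
            }
        }
        return result;
    }
}

/** Speed-oriented variant: a per-atom adjacency list, so a lookup costs O(degree). */
final class AdjacencyBondGraph implements BondGraph {
    private final List<List<int[]>> adjacency = new ArrayList<>();

    @Override
    public void addBond(int a, int b) {
        int[] bond = {a, b};
        bucket(a).add(bond);
        if (b != a) {
            bucket(b).add(bond); // the extra references are the memory cost
        }
    }

    @Override
    public List<int[]> connectedBonds(int atom) {
        if (atom < 0 || atom >= adjacency.size()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(adjacency.get(atom)); // defensive copy
    }

    /** Grows the adjacency table on demand and returns the list for one atom. */
    private List<int[]> bucket(int atom) {
        while (adjacency.size() <= atom) {
            adjacency.add(new ArrayList<>());
        }
        return adjacency.get(atom);
    }
}
```

Because callers only see `BondGraph`, either variant can be swapped in without touching the calling code, which is the point made above about keeping interfaces and implementations apart. Whether the adjacency variant actually pays off will depend on the data, as noted in the comments.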