A trick I learned from Miguel Howard (of Jmol fame) is to not concatenate Strings until really needed. String concatenation allocates new Strings, and thus burdens the garbage collector (yeah, you lucky Fortran users who think twice before allocating memory :). Some time ago, he suggested this trick for the logger. I think that was in Jmol, but the CDK inherited the idea.

Don't concatenate Strings when calling the LoggingTool

Logging in the CDK can be turned off, and in production use, I often do so. Therefore, we can speed things up by concatenating the debug message Strings only when debugging is turned on. The CDK LoggingTool can take care of this, so you should pass the Strings as individual parameters to the logger call.
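
To make the idea concrete, here is a minimal sketch in Java. It is not the actual CDK LoggingTool code: the class LazyLogger, its constructor flag, and the messages() accessor are hypothetical names made up for illustration. The only point is that the pieces are passed as separate arguments and joined inside the logger, after the check that debugging is on.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a logger; NOT the CDK LoggingTool itself. */
class LazyLogger {

    private final boolean debugEnabled;
    // Messages are kept in memory here so the behaviour is easy to inspect.
    private final List<String> messages = new ArrayList<>();

    LazyLogger(boolean debugEnabled) {
        this.debugEnabled = debugEnabled;
    }

    /** Joins the parts into one String only when debugging is turned on. */
    void debug(Object first, Object... rest) {
        if (!debugEnabled) {
            return; // no StringBuilder, no message String is ever created
        }
        StringBuilder sb = new StringBuilder();
        sb.append(first);
        for (Object part : rest) {
            sb.append(part);
        }
        messages.add(sb.toString());
    }

    List<String> messages() {
        return messages;
    }

    public static void main(String[] args) {
        int atomCount = 3;

        LazyLogger off = new LazyLogger(false);
        // Eager style: the message String is built even though it is discarded.
        off.debug("Found " + atomCount + " atoms");
        // Lazy style: the parts are handed over and never joined.
        off.debug("Found ", atomCount, " atoms");

        LazyLogger on = new LazyLogger(true);
        on.debug("Found ", atomCount, " atoms");
        System.out.println(on.messages());
    }
}
```

Note that the varargs call still allocates a small array for the arguments, so this saves the concatenation, not every allocation.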

A very quick, short post on CDK atom types. These atom types are used by the CDK to decide how many missing hydrogens an atom has, or how many lone pairs, and if the atom can be part of an aromatic ring system. The CDK code basically consists of three parts: the atom type ontology, the atom type perception code, and the rest of the code that uses information from the ontology.

Most of you have already developed a love/hate relationship with the CDKAtomTypeMatcher.

For a while now I have been a so-called invited expert to the Linking Open Drug Data (LODD) task force of the W3C's Health Care and Life Sciences Interest Group (HCLSIG). I also participate in the open-science group of the Open Knowledge Foundation (OKF). This would not really be worth blogging about if the two were not being mashed up. Members from both sides are interested in learning how Open (think Is It Open Data?) the open data from the LODD network really is.

Sometime in January I added a new New Wizard to Bioclipse for OPSIN, but forgot to blog about it until now. It will be available in Bioclipse 2.6 later this year (or just from the Hudson build service).

Science 3.0 needs a facelift before I can switch away from FriendFeed:

It comes down to the fact that the website has way too much information, and uses way too much space for things that do not matter. The interesting content only starts halfway down my screen. I tend to have given up finding the interesting bits by then.

But even within each thread there is room for improvement. There too, there is abundant whitespace, though that might prove functional when the layout becomes more compact.

Last Friday a virus kicked in. I don't know whether it is flu or a cold, but it had already ruined my 5-6 February weekend too. I need a high fever to stay in bed, otherwise I just can't. Worse, with an elevated temperature I cannot think clearly anymore, and I haven't been able to think clearly since about last Wednesday, so I should have seen it coming.

Yesterday I did some boring plumbing: I upgraded our GC/MS machines with new hardware. Well, the cheminformatics equivalent of it anyway. I upgraded the Bioclipse bundles in org.openscience.cdk for CDK 1.3.8 and CDK-JChemPaint 17. This is typically a painful process, and now even more so because a lot is changing in how Bioclipse is built, which is with Buckminster on Bioclipse's Hudson server.

The readers of Antony's blog know enough about the problem. And many in the QSAR community know it too (and many others do not). Chemical structure data is noisy. I haven't recently created a new local data set for analysis, so I have not taken time to blog about it much, but the ambiguity in chemical databases is enormous. Just yesterday, Antony and I had a good discussion about tautomers and in particular how things are linked together.

Update: the fourth edition is out.

Some projects are never finished.
This blog deals with chemblaics in the broader sense. Chemblaics (pronounced chem-bla-ics) is the science that uses computers to solve problems in chemistry, biochemistry and related fields. The big difference between chemblaics and areas such as chem(o)?informatics, chemometrics, computational chemistry, etc, is that chemblaics only uses open source software, open data, and open standards, making experimental results reproducible and validatable. And this is a big difference!