Sunday, October 30, 2005

CDK News

Just finished applying the latest spelling error fixes to CDK News 2.3. It took me some three hours to finish up the 12 pages, mostly because of the need to recompile the PDF after each change to make sure that nothing in the layout got broken.

The content contains four communications:

  • An Open Framework for Online QSAR Modeling
  • Atom types in the CDK
  • MQL - Development of a novel substructure query language
  • Stereochemistry detection in the CDK

And, of course, the recurring Editorial, FAQ and ChangeLog.

Saturday, October 29, 2005

kfile_chemical gets XYZ, Mol2, SMILES, VMD and GenBank support

Jerome Pansanel contributed new patches for kfile_chemical; on Monday actually, but I have been busy with other things, among which a presentation I have to give next Monday for some 100+ analytical chemists. The patch adds KDE support for five new chemical MIME types: XYZ, Mol2, SMILES, VMD and GenBank. Therefore, I just released a new version (0.10), and added an announcement to

As a reminder, version 1.0 will have all chemical mime types supported, after which I will initiate a process to formalize the meta data we want the kfile plugins to give, which will lead to the 2.0 release. So far, I had in mind that the next step was to make the plugins ready for KDE 4.0, but I became aware of the mime magic as implemented in KMimeMagic.

So, concluding, I might squeeze in another beta release 3.0, where this magic gets addressed; knowing that it will definitely not work for all files, but hopefully it will for files with stupid file extensions like .log.
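To illustrate what content-based detection means here, below is a minimal Python sketch that guesses whether a file is in XYZ format from its content alone, ignoring the extension. It is not the KMimeMagic API and not what the plugin does; the function name `looks_like_xyz` is made up for this example. It only relies on the usual XYZ layout: an atom count on the first line, a comment on the second, then one "element x y z" line per atom.

```python
def looks_like_xyz(text: str) -> bool:
    """Guess from the content whether text is an XYZ file (illustrative only)."""
    lines = text.splitlines()
    if len(lines) < 3:
        return False
    # First line: the number of atoms.
    try:
        atom_count = int(lines[0].strip())
    except ValueError:
        return False
    if atom_count < 1:
        return False
    # Second line is a free-form comment; atom records start at the third line.
    atom_lines = lines[2:2 + atom_count]
    if len(atom_lines) < atom_count:
        return False
    for line in atom_lines:
        fields = line.split()
        if len(fields) < 4:
            return False
        # Fields 2-4 must be numeric coordinates.
        try:
            for value in fields[1:4]:
                float(value)
        except ValueError:
            return False
    return True
```

A check like this will misfire on some files, which is exactly why magic can only ever be a best guess.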

Thursday, October 27, 2005

My birthday (31) and the Adsense

Today is my 31st birthday, nearing the halfway point now (statistically speaking). Also, by now I should have had my scientific moment of glory, otherwise I can forget that Nobel prize. Oh well, forget it.

Have you seen those small advertisements on this page (RSS users, please visit the website :)? Funny links they give. The system is very nice btw: it waits for Google to index the blog and then decides which ads are relevant. Hence the links to small chemoinformatics companies. Nice to browse.

Disclaimer: when you click any of the ads, I get a bit of money. But don't start clicking away, otherwise Adsense will get upset, and then I get nothing.

Tuesday, October 25, 2005

More cdk.interfaces updates

Yesterday I had some spare time before going to a meeting about the Woordenboek Organische Chemie, so I was boldly going where no one has gone before: making the CDK core module independent of the data module. Why, you might wonder...

Well, if as many CDK modules as possible become independent of the classes implementing the data interfaces, i.e. those classes that implement the org.openscience.cdk.interfaces interfaces, then it becomes possible to make alternative implementations. For example, an implementation that also implements the Octet interfaces, or an implementation that extends the JOELib classes. In that way, combining these libraries becomes as easy as writing a blog :)

Anyway, today I finished the AtomTypeFactory, and only the IsotopeFactory remains to be updated. Since many classes in the CDK library use these two classes, patches had to be applied throughout the library. And code outside the CDK library might be broken now, so be aware...

Monday, October 24, 2005

JChemPaint applet download size: 538kB

A good, functional molecular editor is of great importance to the chemical web. There are a few editors with a small download size around. JChemPaint has been available as an applet for some time now, but the download size has been large. The situation has improved considerably over the past months, and the download size at which the applet now shows up in your web browser is down to 538kB. A live demo is available from

The applet, however, does have the same functionality as the full application. When a feature is used that is not available from the jars downloaded first (which make up the 538kB), additional jars are downloaded.

The applet is not bug-free yet. For example, drawing reactions does not seem to work :( But it's really getting somewhere. Congrats to the applet development team!

Sunday, October 23, 2005

Wrapping up...

Less than three months before the end of the contract for my PhD project. And not nearly done yet. Weekends are now spent on wrapping up bits of experimental research into something like a coherent article. And there are still lots of calculations to do to answer the open questions. FreeMind is helping me organize my thoughts.

Open source chemoinformatics is a welcome diversion now and then. I worked on some easy-to-fix CDK bugs yesterday, such as the QueryAtomContainer not having been correctly updated for the recent cdk.interfaces changes. Fixed now. I also touched a lot of code when updating the FSF address in the LGPL license notice, and when I modified the construction of CDKExceptions to set the causing Throwable. Also helped out Carsten a bit with adding his data from Kalzium to the Blue Obelisk data repository.

Another nice diversion is The Battle for Wesnoth. Just got killed, though.

Friday, October 21, 2005

Viagra saves the environment

This week there was an interesting article in the Dutch Intermediar about Viagra. They cite an article in Environmental Conservation and state that it saves the environment, as it has greatly reduced the market for animal parts used in traditional Chinese medicine to address the same problem as Viagra does.

Viagra: good for the environment, good for you! ;)

You don't see this often, though. Public opinion, at least in my social environment, is that chemicals (in general) are bad for the environment, whatsoever... Natural products are much better. Wait, those are chemicals too... but that is too complicated for most :(

BTW, Viagra is InChI=1/C22H30N6O4S/c1-5-7-17-19-20(27(4)25-17)22(29)24-21(23-19)16-14-15(8-9-18(16)32-6-2)33(30,31)28-12-10-26(3)11-13-28/h8-9,14H,5-7,10-13H2,1-4H3,(H,23,24,29)/f/h29H.

Thursday, October 20, 2005

CDK News 2.3 and InChI's

CDK News 2.3 is scheduled for this month, and was originally planned to be distributed at the CDK5AW event. So, it's a bit late. But the editorial process is converging... I realized that I forgot to mention the requirement for InChIs whenever molecules are given. So, I'm now in the process of going through the issue and adding the missing identifiers...

Wednesday, October 19, 2005

Jmol's FAH team in Top 800

The Jmol FAH team has just entered the Top 800 of most active Folding@Home teams. And that's the point where they start monitoring contributions on a user level. Thus, I can now see how active I am within the team. And so can you! Join the team, and let's get into the Top 500!

InChI meta data with kfile_chemical

I've just uploaded kfile_chemical 0.9. It has new translations for ES and DA, and plugins for InChI files. It will extract the InChI string as meta data (and will thus be used by the KDE desktop search Kat), and the InChI version number.

Thinking about this, it might be useful to extract all layers as meta data, so that one can search on chemical formula and even connectivity, and find all matching structures. Not really close to substructure search, but we'll tackle that later :)
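A minimal sketch of what extracting the layers could look like, in Python rather than in the plugin's own code. The function name `split_inchi_layers` and the returned dictionary are made up for this example; it only relies on the InChI layout itself (version, then formula, then layers separated by "/" and prefixed by a single letter), and it deliberately stops at the fixed-H ("/f") part to keep things simple.

```python
def split_inchi_layers(inchi: str) -> dict:
    """Split an InChI string into version, formula and main layers (simplified)."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string: %r" % inchi)
    parts = inchi[len(prefix):].split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        # The first layer after the version is the chemical formula.
        layers["formula"] = parts[1]
    for part in parts[2:]:
        if not part:
            continue
        if part[0] == "f":
            # Simplification: ignore the fixed-H layers that follow "/f".
            break
        # The first character identifies the layer, e.g. "c" for connectivity.
        layers[part[0]] = part[1:]
    return layers


if __name__ == "__main__":
    # Ethanol as a small example.
    for name, value in split_inchi_layers("InChI=1/C2H6O/c1-2-3/h3H,2H2,1H3").items():
        print(name, value)
```

With the formula and connectivity available as separate fields, a desktop search could match on either of them without having to parse the full string.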

Tuesday, October 18, 2005

CDK-Taverna fully recognized

After asking about it, Tom explained to me how Taverna can pick up the apiconsumer.xml file from jars: just copy it into the root directory of the jar package. Easy as that.

So, users now only need to copy the cdk-taverna.jar into the taverna-workbench-1.3/lib/ directory to get a nice chemoinformatics workbench environment. I'll upload the jar to CDK's project page right now.

Monday, October 17, 2005

CIA statistics for Blue Obelisk

I have just enabled CIA statistics for the Blue Obelisk SVN: /stats/project/cdk/blueobelisk.

It's done using the client script, hooked into the $REPOS/hooks/post-commit hook on the SVN server. The client script is slightly hacked to hard-code the module name, which otherwise did not show up on the chat channel.

Saturday, October 15, 2005

Single PDFs for CDK News articles

This week was the CDK5AW event, a workshop for users and developers of the Chemistry Development Kit (CDK). After talking with other developers we agreed on creating PDF and HTML versions of single articles that appeared in the CDK News newsletter. Well, I haven't figured out how to create nice HTML (latex2html does not give nice results; any ideas, anyone?), but for the PDF version I now have a pipeline.

For each article, a split.config file determines which pages from the CDK News issue PDF should be extracted. To do this, I used the PDF ToolKit, or pdftk for short (comes with Debian/Ubuntu by default). Using a Perl script to read these config files, the pipeline creates a PDF file for each article. Currently, I'll only have it do the feature articles; that is, not the ChangeLog, Editorial, Literature and FAQ. For those you'll need to download the full issue. If you don't like that, let me know :)

Ok, you will probably have noticed that the almost server is down (Googling for 'CDK News' allows you to read the cache!), and the PDFs will be uploaded there asap. For those not familiar with CDK News, the articles are FDL, so feel free to copy and distribute them. If you reuse the text and update it, which is allowed too, please let us know.
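A rough sketch of such a pipeline, written in Python here instead of the Perl script mentioned above. The config format (one "name first-last" line per article), the file names and the function names are all assumptions made for this example; the only real thing is the pdftk invocation, which uses its `cat` operation to copy a page range into a new file.

```python
import subprocess


def parse_split_config(text):
    """Parse lines like 'article-name 3-5' (hypothetical config format)."""
    articles = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, pages = line.split()
        first, last = (int(p) for p in pages.split("-"))
        articles.append((name, first, last))
    return articles


def pdftk_command(issue_pdf, name, first, last):
    """Build the pdftk call that extracts pages first..last into name.pdf."""
    return ["pdftk", issue_pdf, "cat", "%d-%d" % (first, last),
            "output", "%s.pdf" % name]


def split_issue(issue_pdf, config_text, run=False):
    """Return one pdftk command per article; execute them only if run=True."""
    commands = [pdftk_command(issue_pdf, *article)
                for article in parse_split_config(config_text)]
    if run:
        for command in commands:
            subprocess.run(command, check=True)  # requires pdftk to be installed
    return commands
```

Calling `split_issue("issue.pdf", config_text)` without `run=True` just returns the commands, which is handy for checking the page ranges before anything gets written.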


This new blog will deal with chemblaics in the broader sense, and will not be restricted to research in this field in which I am involved personally.

Chemblaics (pronounced chem-bla-ics) is the science that uses computers to address and possibly solve problems in the area of chemistry, biochemistry and related fields. The common denominator seems to be molecules, but I might be wrong there.

The big difference between chemblaics and areas such as cheminformatics, chemoinformatics, chemometrics, proteochemometrics, etc, is that chemblaics only uses open source software, making experimental results reproducible and validatable. And this is a big difference with how research in these areas is now often done.