Friday, January 27, 2006

1D NMR Spectra do not work in QSPR

About two years ago a student started working with me on the use of 1D NMR and IR spectra in quantitative structure-activity relationship (QSAR) work, with the goal of showing that these spectra contain 3D information relevant to QSAR models. It is known that these spectra depend on the 3D conformation of the molecule.

Half a year later we concluded that no conclusions whatsoever could be drawn from the data we started with (48 compounds with binding affinity): no statistically sound models could be built at all. So we composed three larger data sets. These sets, all QSPR data sets, did give us models, but all the spectra-based models were worse than a Dragon descriptor-based model using the same number of variables, without doing any variable selection.
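The post does not say which statistic was used to compare the models, so the following is only a sketch of one common choice: leave-one-out cross-validated Q2 for a linear model with a single descriptor. The function names and the toy numbers are made up for illustration; they are not the data or the method from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with a single descriptor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def loo_q2(xs, ys):
    """Leave-one-out Q2 = 1 - PRESS / SS_total."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without compound i, then predict compound i.
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a * xs[i] + b)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Toy property values and two made-up descriptors (assumptions, not real data)
prop = [1.0, 2.1, 2.9, 4.2, 5.0, 6.1]
descriptor_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # tracks the property closely
descriptor_b = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]  # only loosely related

print(loo_q2(descriptor_a, prop), loo_q2(descriptor_b, prop))
```

With the same number of variables in each model, the descriptor set with the higher Q2 is the one that predicts unseen compounds better, which is the kind of comparison described above.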

I presented this work at the 7th ICCS in Noordwijkerhout half a year ago, and it has now been published in JCIM: DOI 10.1021/ci050282s. Comments on this article are most welcome!

Sunday, January 22, 2006

Trouble running the CDK JUnit tests with Cacao and Kaffe

Because I am still looking forward to testing CDK against the latest Classpath 0.20, I downloaded cacao 0.94-1 for Debian sid, then tried to compile CDK with it:

JAVA_HOME=/usr/lib/jvm/cacao ant -Dbuild.compiler=gcj clean test-all

But that hangs at some point with zero load. I have no idea what is going on there. I've spoken with twisti on the #classpath IRC channel, and he helped me run the compile with gdb, which indicated that at some point all threads were waiting.

I also tried it with kaffe in sid, but now with an XML parser in the CLASSPATH, as Dalibor suggested in a previous blog item:

export CLASSPATH=/usr/share/java/xercesImpl.jar:xmlParserAPIs.jar
JAVA_HOME=/usr/lib/kaffe ant -Dbuild.compiler=gcj clean test-all

But that failed too with:

[junit] Running org.openscience.cdk.test.CDKTests
[junit] kaffe-bin: /home/mkoch/debian/kaffe/kaffe- translate: Assertion `reinvoke == false' failed.
[junit] Test org.openscience.cdk.test.CDKTests FAILED

It did work previously :(

OK, to reproduce this yourself, you need to check out CDK from CVS (hoping that anonymous CVS is reasonably in sync, and online) with:

cvs login
cvs -z3 co -P cdk

Thursday, January 19, 2006

Free at last!

Free at last! Well, not quite yet, but close enough anyway: my PhD contract has ended; last Friday was my last working day, which my colleagues and I celebrated with a visit to Nijmegen's oldest bar, In de Blauwe Hand. But I still have my manuscript to finish. This formally ends a period of almost 12.5 years at the Radboud University Nijmegen.

Since last Monday I've been at home, trying to get things finished as soon as possible. I'm mostly working on my laptop, logged in remotely to our desktop machine downstairs. A good ADSL connection (170kB downstream) helps a lot too, and the proxy on my university machine lets me reach the journals my university has full access to.

I'm trying to do some open source chemoinformatics in between writing, and my current QSAR research actually allows me to do some feature enhancement in CDK's QSAR package too. Today, I hope to write and finish a config file architecture that allows fine tuning of which QSAR descriptors should be calculated. I anticipate that a default config file will be distributed.

Additionally, I will try to finish running the CDK JUnit tests against Classpath 0.20, which covers 98% of Java 1.4.2; the limited support for HTML rendering makes up most of the remaining 2%. The Classpath progress has really amazed me over the last few weeks. I have not tested Jmol and JChemPaint against the latest open source Java tools, but will try to do that before I go on holiday next week. Results with 0.19 were very promising, as I reported in earlier blog entries.

Wednesday, January 11, 2006

USPTO considers open source software prior art

This is the best news I have heard in weeks! The US Patent and Trademark Office spoke with open source representatives about ways to deal with open source software as prior art. Apparently, their problem was how to be sure about release dates of open source software, and authoritative sites with extensive logging of releases help a lot here.

Quoting from their website:

The Department of Commerce’s United States Patent and Trademark Office (USPTO)
has created a partnership with the open source community to ensure that patent
examiners have access to all available prior art relating to software code
during the patent examination process.

It also indicates that releasing open source software on such an authoritative website, or announcing it there, is important! Otherwise, patent offices will not be able to decide whether our open source art is really prior.

Friday, January 06, 2006

Open Source Java tool chain: CDK compiles and JUnit tests run

While waiting for a Dragon calculation to finish (it does not work for molecules with more than 300 atoms!), I updated CDK's build.xml to support gjdoc. The build script is now able to compile the custom doclets we use for creating the src/*.javafiles and others from the Java source files. And using gij I could also run CDK's 1688 JUnit tests!

On my Debian GNU/Linux sid chroot, I have java-gcj-compat installed allowing me to do (thanx man-di!):

JAVA_HOME=/usr/lib/jvm/java-1.4.2-gcj-4.0- ant -Dbuild.compiler=gcj runDoclet
JAVA_HOME=/usr/lib/jvm/java-1.4.2-gcj-4.0- ant -Dbuild.compiler=gcj test-all

The first command creates the custom doclets, while the second command compiles the CDK and runs the JUnit tests. For Classpath developers: here's how to check out the cdk module from CVS.

The results are interesting: while Sun's JVM gives 11 problems, gij gives 399 problems. The test-all target creates a reports/result.txt document listing all failing tests, and I've put the diff -u for the two JVMs online. I will make diffs for jamvm, kaffe and cacao too.
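For anyone who wants to make a similar comparison without the diff tool, here is a minimal Python sketch of the same idea. It assumes that reports/result.txt simply lists one failing test per line, which may not match the real format exactly; the test names and the sun/gij file labels below are made up for illustration.

```python
import difflib


def diff_failures(baseline, candidate,
                  base_name="sun/result.txt", cand_name="gij/result.txt"):
    """Return a unified diff (like `diff -u`) between two lists of failing tests."""
    return list(difflib.unified_diff(
        sorted(baseline), sorted(candidate),
        fromfile=base_name, tofile=cand_name, lineterm=""))


def only_in_candidate(baseline, candidate):
    """Tests that fail on the candidate JVM but not on the baseline JVM."""
    return sorted(set(candidate) - set(baseline))


# Hypothetical failing-test names, standing in for the contents of result.txt
sun_failures = ["org.example.FooTest", "org.example.BarTest"]
gij_failures = ["org.example.FooTest", "org.example.BarTest",
                "org.example.RenderTest"]

for line in diff_failures(sun_failures, gij_failures):
    print(line)
print(only_in_candidate(sun_failures, gij_failures))
```

The second helper is the more useful one for a JVM developer: it lists only the tests that regress relative to the baseline, ignoring the failures both JVMs share.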

I hope this gives the free Java community extra feedback on the excellent work they are doing.

Tuesday, January 03, 2006

Kubuntu, XRandR and TV-OUT

One of the things I had not fully figured out up to today was how to configure my Kubuntu system to easily view DVDs on our TV, using my NVIDIA's TV-OUT. I've seen xorg.conf files that define an X11 server for the monitor and a second one for the TV, and files that use TwinView. Now, I did not really like the way the first option worked, so I tried the second.

Unfortunately, I had to reconfigure and restart my X11 each time my kids wanted to see Bob the Builder. I already knew about XRandR, and today finally had a look at it again, and got it to work without much trouble this time. (Lesson: if something does not work, let it rest and try again half a year later.)

For the googlers, this is what my xorg.conf 'Screen' section now looks like:

Section "Screen"
    Identifier "Default Screen"
    Device "NVIDIA Corporation NV18 [GeForce4 MX 4000 AGP 8x]"
    Monitor "Hansol H711"
    DefaultDepth 24
    Option "TwinView" "on"
    Option "TwinViewOrientation" "clone"
    Option "SecondMonitorHorizSync" "30-50"
    Option "SecondMonitorVertRefresh" "60"
    Option "MetaModes" "1280x1024,1280x1024;1024x768,1024x768"
    Option "TVStandard" "PAL-B"
    Option "TVOutFormat" "SVIDEO"
    Option "ConnectedMonitor" "crt, tv"
    SubSection "Display"
        Depth 24
        Modes "1280x1024" "1024x768" "832x624" "800x600" "720x400" "640x480"
    EndSubSection
EndSection

And now, to switch resolution, I can just do:

sudo xrandr -s 1
# watch DVD
sudo xrandr -s 0

PS. Happy new year!