Friday, July 06, 2007

Standing on the shoulders of ... the Blue Obelisk

The Seven stones wondered what to do with a petaflop in science, in response to Declan's The petaflop challenge in Nature. Declan discusses in this commentary the increase in computing power and the necessity of parallel programming to make use of it. Now, I do have some ideas (e.g. enumerating metabolomic space, mining the RDF graph of our collective biological and chemical knowledge base for the one hundred most supported contradictions), but that is not what I want to talk about. It is this fragment from Declan's piece:
    "I'm amazed at what he can do just using open-source libraries," [Horst Simon] says. Although there are exceptions, such as high-energy physics and bioinformatics, many labs keep their software development close to their chests, for fear that their competitors will put it to better use and get the credit for the academic application of the program. There is little incentive to get the software out there, says Simon, and such attitudes plague development.

This is something that is very familiar to many of us: developing algorithms for scientific problems is not appreciated. The way the scientific community currently deals with algorithms and data worries me very much; it seems the community does not care about correctness or improvement at all, as long as the result illustrates what they think the (bio)chemical reality has to offer. At least, that is what effectively happens if they do not give proper credit to the scientific importance of software development.

Of course, the scientific credibility of software depends on the open source nature of the software: "Given enough eyeballs, all bugs are shallow" (E.S. Raymond, The Cathedral and the Bazaar). Or, in more traditional wording: science, and scientific software, must be reproducible and/or falsifiable. The Blue Obelisk Movement is trying to achieve this (DOI:10.1021/ci050400b).

The open source challenge
Therefore, I hereby challenge all experimental chemists and biologists to acknowledge the amount of scientific software they already use, and give credit where credit is due. I challenge them to stand up and say that chemo- and bioinformaticians provide the methods they rely on daily to achieve their goals. I challenge them to say that they stand on the shoulders of scientific software developers.

The article should not have been called The petaflop challenge, but The open source challenge.