Sunday, April 27, 2008

Comments on 'Rethinking software access'

bbgm was rethinking software access. The blog observes:
  1. current commercial licensing is unfriendly towards home science
  2. bench tools do not easily allow mash ups
About 1
Actually, much of the work I have been doing in opensource chemoinformatics was done as 'home' science; I started out as an organic chemistry student, and later became a data analyst, while CDK/Jmol/JChemPaint was something I did at home because I liked it and needed it. I started in 1995 working on a website to aid my organic chemistry studies, the Woordenboek Organische Chemie (open data). And I needed semantic tools for 2D and 3D display of molecular structure. Commercial offerings were not an option for me as a student, so I got involved with the Chemical Markup Language, Jmol and JChemPaint in 1997-98.

Note that at that time free academic licenses were rarer than now. I always had, and still have, the feeling that those academic license clauses are just there to give academics a reason to support non-opensource tools. Also note that a lot of commercial offerings started out by incorporating the code base from some PhD work. Not uncommonly, the PhD student would simply be hired by the company.

Fact is, commercial chemoinformatics licenses are indeed unfriendly to scientists who maintain related hobbies at home. And, given my experience, I appreciate your worries: the high cost of those tools, which I certainly could not afford on my student funding, drove me to opensource ideas many, many years ago.

About 2
The second issue raised concerns the ability to make mash ups. Open source and open standards are indeed important for making mash ups, though the former only helps you work around a lack of open standards. Using web services contributes to the solution, as they have a well-defined, open standard interface. Open source is particularly important for reproducibility of scientific results (see my thesis), and is the opposite of proprietary software, not of commercial software. So, it seems bbgm is just looking for Blue Obelisk projects.

On a practical note, I think that Bioclipse might just be what you are looking for: it integrates local services as well as services on the internet, alike. In particular, the upcoming Bioclipse2 is strong at this, and supports SOAP, BioMart and BioMoby for online services (also see this), as well as R, BioJava, CDK and Jmol as local services. You can even run Taverna workflows from within Bioclipse, if you like. Mash ups can be done in various ways. Hard-core Java coders would go the RCP plugin way, for example this nanotube example. Others will prefer scripting languages, such as JavaScript and Ruby (in addition to R and Jmol scripting). Or, you might record as a script the things you did graphically, using the recording feature.
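To make the idea of treating local and online services alike a bit more concrete, here is a minimal sketch in Python. It is not Bioclipse code and does not use any Bioclipse API: the names `make_registry`, `mash_up`, `remote.lookup` and `local.count` are all made up for illustration, the 'remote' service is a small lookup table standing in for a real network call, and the atom counter is a toy that only works for very simple SMILES strings.

```python
from typing import Callable, Dict, List

# A "service" here is just a function from text to text, whether it runs
# locally or (in a real setup) somewhere on the internet.
Service = Callable[[str], str]


def make_registry() -> Dict[str, Service]:
    """Build a registry of hypothetical services behind one interface."""

    def lookup_smiles(name: str) -> str:
        # Stand-in for a remote call (e.g. a SOAP service); a tiny table
        # replaces the network so the sketch runs offline.
        table = {"ethanol": "CCO", "methane": "C"}
        return table[name.lower()]

    def count_heavy_atoms(smiles: str) -> str:
        # Toy local computation: only valid for SMILES built from
        # single-letter uppercase atoms such as C, N and O
        # (no Cl, Br, brackets or aromatic lowercase atoms).
        return str(sum(1 for ch in smiles if ch.isupper()))

    return {"remote.lookup": lookup_smiles, "local.count": count_heavy_atoms}


def mash_up(registry: Dict[str, Service], steps: List[str], value: str) -> str:
    """Chain services by name, feeding each output into the next step."""
    for step in steps:
        value = registry[step](value)
    return value


result = mash_up(make_registry(), ["remote.lookup", "local.count"], "ethanol")
print(result)
```

The point of the sketch is only that the caller does not need to know which step is local and which is remote; swapping the lookup table for a real web service call would not change `mash_up` at all.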

Of course, there are other solutions... Bioclipse is just one, one to which I contributed.

About running webservices...
Running webservices basically means being a hosting provider, and requires some commercial model. One conflicting problem is that, or so it is said, large groups within the potential user base, namely the pharma industry, do not like sending their highly secret data to the outside world, not even over an https:// connection.

Rajarshi and the rest of the Indiana group have been running chemoinformatics webservices. They might be the provider you are looking for.

All I can say to bbgm: "Yes, your two thoughts are indeed issues, and many within the Blue Obelisk community have been addressing them." Oh, and we will not stop either. Peter recently gave a nice overview in Nature of what we, Blue Obelisk members, have been cooking up: Chemistry for Everyone. And that includes the hobby scientist.