Wednesday, July 13, 2011

CDK Forks

Forking is an important part of Open Source development, and forking is good. Of course, forks should interact too, and genes from one fork should merge back into another fork. Forks are probably also a good indication for the success of a project: if a project is forked, it means it is significant. On the other hand, it can also mean that the main project is too hard to work with. Maybe the CDK is that. Indeed, it's easier to not have your code peer-reviewed, and just fork. That is freedom. (There might be other reasons too.)

The CDK is forked. Forked several times, in fact. I have now started a tracker on SourceForge to aggregate information about these forks, and their state with respect to back-integration of code into our fork. I was aware of the AMBIT fork for a long time, as one of the authors (Nina) has contributed. Of the others I only learned via publications (PaDEL, ScaffoldHunter), and in the case of Craft, it was a personal ping that made me aware of it. Craft is all the more exciting because the distributor, Molecular Networks, is primarily known for their proprietary products.

Porting all this code back into the main CDK library is not trivial, and often a lot of work. The current core CDK development team will not be able to do this, and the project relies here on contributions from others to do the integration, and to convert code from those forks into proper patches. This is likely interest-driven, which is one of the reasons why I started the new tracker. At this moment the entries report (briefly) what interesting functionality is available from those forks, but feel free to add comments with detailed information, such as the names of the classes that provide that functionality, so that the CDK community can share the burden of reintegrating this code.

OK, enough for now.

Jeliazkova, N., & Jeliazkov, V. (2011). AMBIT RESTful web services: an implementation of the OpenTox application programming interface. Journal of Cheminformatics, 3 (1). DOI: 10.1186/1758-2946-3-18
Wetzel, S., Klein, K., Renner, S., Rauh, D., Oprea, T., Mutzel, P., & Waldmann, H. (2009). Interactive exploration of chemical space with Scaffold Hunter. Nature Chemical Biology, 5 (8), 581-583. DOI: 10.1038/nchembio.187
Yap, C. (2011). PaDEL-descriptor: An open source software to calculate molecular descriptors and fingerprints. Journal of Computational Chemistry, 32 (7), 1466-1474. DOI: 10.1002/jcc.21707