Mark Fortner commented on Google+ that my script was missing a @Grab statement. I had seen that mentioned before, but never looked into it. It turns out to be very useful, and it makes Groovy scripts standalone: it resolves the missing dependencies using Maven repositories. Fortunately, CDK modules are available from repositories, e.g. the one at Plovdiv University, and I gave it a try.

The next CTR I picked is not particularly hard either, given the functionality provided by the CDK. In fact, the fingerprinting functionality I will use for this CTR is actually one of the most used and oldest features of the CDK. CiteULike has a list of 26 papers using the CDK fingerprinting functionality.

This one was relatively easy, and roughly based on the first CDK-JChemPaint tutorial. Key aspects are the SMILES parsing and the 2D coordinate generation with the StructureDiagramGenerator. The solution does not render the structure's title yet. I do not have a solution for that right now (the CDK may; I am not sure).

The first Chemistry Toolkit Rosetta task is to count the number of heavy atoms in the structures given in an MDL SD file. This task starts with an SD file and counts, for each structure in the file, the number of heavy atoms (non-hydrogen atoms). Because we simply handle the structures one by one, the solution uses the IteratingMDLReader. The input file (benzodiazepine.sdf.gz) is a gzipped file, which we handle by using a GZIPInputStream.
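
To make the one-by-one approach concrete, here is a minimal sketch of the same task in plain Java, using only the standard library. It is not the CDK solution: instead of the IteratingMDLReader it reads the atom block of each record by hand, and it assumes V2000 records and treats only the symbol "H" as hydrogen. The class name HeavyAtomCounter and the helpers record() and gzip() are made up for this illustration; the helpers only build a tiny gzipped SD file in memory so the sketch can run without benzodiazepine.sdf.gz.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class HeavyAtomCounter {

    /** Returns the heavy atom count of each record in a gzipped V2000 SD stream. */
    static List<Integer> countHeavyAtoms(InputStream gzippedSdf) throws IOException {
        List<Integer> counts = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(gzippedSdf), StandardCharsets.ISO_8859_1))) {
            List<String> record = new ArrayList<>();
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith("$$$$")) {
                    // "$$$$" ends a record: count it, then forget it again
                    counts.add(countInRecord(record));
                    record.clear();
                } else {
                    record.add(line);
                }
            }
            // tolerate a final record that lacks the "$$$$" terminator
            if (record.stream().anyMatch(l -> !l.trim().isEmpty())) {
                counts.add(countInRecord(record));
            }
        }
        return counts;
    }

    /** Counts atoms whose symbol is not "H" in one V2000 molfile record. */
    static int countInRecord(List<String> record) {
        // line 4 is the counts line; its first three columns hold the atom count
        int atomCount = Integer.parseInt(record.get(3).substring(0, 3).trim());
        int heavy = 0;
        for (int i = 0; i < atomCount; i++) {
            String atomLine = record.get(4 + i);
            // the element symbol sits in columns 32-34 of an atom line
            String symbol = atomLine.substring(31, Math.min(34, atomLine.length())).trim();
            if (!symbol.equals("H")) {
                heavy++;
            }
        }
        return heavy;
    }

    /** Hypothetical helper: builds one bond-less V2000 record for the demo. */
    static String record(String title, String... symbols) {
        StringBuilder sb = new StringBuilder();
        sb.append(title).append("\n  sketch\n\n");
        sb.append(String.format(Locale.ROOT,
                "%3d%3d  0  0  0  0  0  0  0  0999 V2000\n", symbols.length, 0));
        for (String symbol : symbols) {
            sb.append(String.format(Locale.ROOT,
                    "%10.4f%10.4f%10.4f %-3s 0  0  0  0  0  0  0  0  0  0  0  0\n",
                    0.0, 0.0, 0.0, symbol));
        }
        sb.append("M  END\n$$$$\n");
        return sb.toString();
    }

    /** Hypothetical helper: gzips a string in memory. */
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.ISO_8859_1));
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String sdf = record("methane", "C", "H", "H", "H", "H")
                + record("fragment", "C", "C", "O", "H");
        System.out.println(countHeavyAtoms(new ByteArrayInputStream(gzip(sdf))));
    }
}
```

For a real file, the ByteArrayInputStream would be replaced by a FileInputStream on the .sdf.gz file. Only one record is held in memory at a time, which is the same reason an iterating reader suits this task.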

The Chemistry Toolkit Rosetta wiki was set up some time ago by Andrew Dalke to demonstrate how certain basic cheminformatics tasks are done in the various cheminformatics toolkits around. I think it is a great idea, but unfortunately I never found enough time to do much with it. But it is holiday now, which is a time to take your mind off your work, and then some random hacking with the CDK is what I like to do.

Already the 7th edition of my Groovy Cheminformatics with the Chemistry Development Kit book (and PDF eBook). It has been almost two years since the first release, and the book has grown from an initial 72 pages to 212 pages today. There is still a lot I want to write about, but I only have time for it during the holidays. New content includes: Chapter 6. Reactions; Chapter 7. From IChemObject to IChemFile; Section 17.1.2. Stereoisomerism (in InChIs); Rewrote Chapter 20. Documentation; Appendix D.1.
This blog deals with chemblaics in the broader sense. Chemblaics (pronounced chem-bla-ics) is the science that uses computers to solve problems in chemistry, biochemistry and related fields. The big difference between chemblaics and areas such as chem(o)?informatics, chemometrics, computational chemistry, etc, is that chemblaics only uses open source software, open data, and open standards, making experimental results reproducible and validatable. And this is a big difference!