The Blue Obelisk Data Repository (BODR) is not as high profile as other Blue Obelisk projects, but equally important. Well, maybe a tad bit more important: it's a collection of core chemical and physical data, supporting computational chemistry and cheminformatics resources. For example, it is used by at least the CDK, Kalzium, and Bioclipse, and possibly more. Also, it's packaged for major Linux distributions, such as Debian (btw, congrats on their 20th birthday!) and Ubuntu.

It doesn't change often, but has just seen its 10th release. Actually, it was the first release in more than three years. But, fortunately, core chemical facts do not change often, nor much.

If you have downloaded a set of GPML pathway files from WikiPathways (doi:10.1371/journal.pbio.0060184) and placed them in a Bioclipse (doi:10.1186/1471-2105-10-397) workspace project, you can easily analyse all metabolites:

Well, genes and proteins too, but I just happen to like metabolites more.
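To make the idea concrete, here is a minimal sketch of the same kind of analysis in plain Java, outside Bioclipse. It is not the Bioclipse script itself. It assumes that each GPML DataNode element carries a Type and a TextLabel attribute, and an Xref child element with Database and ID attributes; the class and method names (GpmlMetabolites, listMetabolites) are made up for this illustration.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Bioclipse or WikiPathways.
public class GpmlMetabolites {

    /**
     * Returns one "database TAB identifier TAB label" line for each
     * DataNode of Type "Metabolite" that has a non-empty Xref ID.
     * Assumes the GPML layout described in the text above.
     */
    public static List<String> listMetabolites(String gpmlXml) {
        List<String> hits = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(gpmlXml)));

            // "*" matches any namespace, so the GPML schema version does not matter here.
            NodeList nodes = doc.getElementsByTagNameNS("*", "DataNode");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element node = (Element) nodes.item(i);
                if (!"Metabolite".equals(node.getAttribute("Type"))) {
                    continue; // genes, proteins, and so on are skipped
                }
                NodeList xrefs = node.getElementsByTagNameNS("*", "Xref");
                if (xrefs.getLength() == 0) {
                    continue; // no database reference at all
                }
                Element xref = (Element) xrefs.item(0);
                String id = xref.getAttribute("ID");
                if (id.isEmpty()) {
                    continue; // unannotated metabolite
                }
                hits.add(xref.getAttribute("Database") + "\t" + id
                        + "\t" + node.getAttribute("TextLabel"));
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse GPML", e);
        }
        return hits;
    }

    // Usage: java GpmlMetabolites pathway1.gpml pathway2.gpml
    public static void main(String[] args) throws Exception {
        for (String path : args) {
            String xml = new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8);
            for (String line : listMetabolites(xml)) {
                System.out.println(line);
            }
        }
    }
}
```

Changing the "Metabolite" filter to another DataNode type would list genes or proteins instead.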

In fact, more interesting than printing the database source and identifier is perhaps opening them in a molecule table.

Today I learned that #altmetrics did not have a Wikipedia page. It's notable, so I decided to get together some material, citations, etc, and create a page on Altmetrics.

In my opinion, the #altmetrics work is a lot more informative in judging a paper (or researcher) than the journal impact factor (JIF). Still, the JIF is used at many academic institutes to decide on the future of researchers, despite it being uncorrelated with the quality or impact of the paper.

Well, with so few issues (really, I am seriously impressed!), I cannot withhold a new stable release from the CDK community. After all, you must be jumping to use that new 1.4 version in your tools! (Seriously, I am wondering if we can compose a list of active CDK-based projects/code bases, and what CDK version they use....)

This release is another bug fix release, including a set of fixes for long-standing regressions, though in code that is not used by a lot of people.

As many of my readers know, John May recently started working as release manager of the CDK development branch, resulting in, for example, the CDK 1.5.3 development release. He has done very important work for the CDK otherwise too. He is clearly beyond the point of an active contributor: he is putting his coding where his mouth is, and is improving the CDK all over the place (read his blog!).

And one of those itches he has (read The Cathedral and the Bazaar) is the unit tests. In fact, I like them too.

Picking a software license can be tricky. You want to allow certain things, require another few things, but certainly not allow that other thing. I can very much recommend Rosen's Open Source Licensing book (website down?), but GitHub is now also providing a quick overview worth checking out:

Via lwn.net.

Rosen, L. Open Source Licensing (2004).

Of course, I had hardly numbered CTR #7 when I realized that I should solve the SMARTS matching CTR first. But because I had already numbered #7 I had to name this one #8. You know, for historic consistency and not meddling with your lab notebook.... life sucks.

Anyway, Rajarshi wrote a convenient SMARTSQueryTool for the CDK, which makes this CTR rather trivial.

I have previously blogged about how to use the CDK and CDK-JChemPaint to highlight a substructure in a 2D drawing, and I only needed to extend it with SMARTS substructure search code. I ended up with this (resulting in the drawing on the right):

import java.util.List;
import java.awt.*;
import java.awt.image.*;
import java.util.zip.GZIPInputStream;
import javax.imageio.*;
import org.openscience.cdk.*;
import org.openscience.cdk.interfaces.*;
import org.openscience.cdk.io.*;
import org.opens
This blog deals with chemblaics in the broader sense. Chemblaics (pronounced chem-bla-ics) is the science that uses computers to solve problems in chemistry, biochemistry and related fields. The big difference between chemblaics and areas such as chem(o)?informatics, chemometrics, computational chemistry, etc, is that chemblaics only uses open source software, open data, and open standards, making experimental results reproducible and validatable. And this is a big difference!