The problem

That sounds easy: take two collections of identifiers, put them in sets, determine the intersection, done. Sadly, each collection uses identifiers from different databases. Worse, within one set the identifiers come from multiple databases. Mind you, I'm not going the full monty, though some chemistry will be involved at some point. Instead, this post is really based on identifiers.

The example

Data set 1:

Data set 2: all metabolites from WikiPathways. This set has many different data sources, and seven provide more than 100 unique identifiers. The full list of metabolite identifiers is here.

The goal

Determine the intersection of two collections of identifiers from arbitrary databases, ultimately using scientific lenses.
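To make the problem concrete, here is a minimal Python sketch of why the plain set intersection falls short and what a mapping step could look like. Everything in it is made up for illustration: the database names, the identifiers, the `MAPPING` table and the `normalize` helper are assumptions, not the data sets or the tooling used for this post. In practice the table would be replaced by whatever identifier mapping service you have at hand.

```python
from typing import Dict, Iterable, Set, Tuple

# An identifier is only meaningful together with its database.
Xref = Tuple[str, str]  # (database, identifier)

# Hypothetical mapping table: every known Xref points to a made-up
# canonical key. All names and identifiers here are placeholders.
MAPPING: Dict[Xref, str] = {
    ("DatabaseA", "A-001"): "compound-1",
    ("DatabaseB", "B-917"): "compound-1",
    ("DatabaseA", "A-002"): "compound-2",
    ("DatabaseC", "C-440"): "compound-3",
}


def normalize(
    xrefs: Iterable[Xref], mapping: Dict[Xref, str]
) -> Tuple[Set[str], Set[Xref]]:
    """Split xrefs into mapped canonical keys and unmapped leftovers."""
    mapped: Set[str] = set()
    unmapped: Set[Xref] = set()
    for xref in xrefs:
        key = mapping.get(xref)
        if key is None:
            unmapped.add(xref)  # keep track of what could not be mapped
        else:
            mapped.add(key)
    return mapped, unmapped


# Two toy collections that use different databases.
set1 = [("DatabaseA", "A-001"), ("DatabaseA", "A-002")]
set2 = [("DatabaseB", "B-917"), ("DatabaseC", "C-440"), ("DatabaseC", "C-999")]

# The naive intersection compares raw identifiers and finds nothing.
naive = set(set1) & set(set2)

# After mapping to a common key, the shared compound shows up.
mapped1, unmapped1 = normalize(set1, MAPPING)
mapped2, unmapped2 = normalize(set2, MAPPING)
shared = mapped1 & mapped2
```

Keeping the unmapped identifiers in a separate set matters: an empty overlap may just mean the mapping table has gaps, not that the two collections have nothing in common.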