SMARTCyp (see papers below) is an integrated computational approach that combines cheminformatics with molecular modeling to predict the metabolic fate of molecules. This fate matters for many biological aspects of small molecules: metabolism can activate a prodrug into a drug, make a toxic compound non-toxic, or make a non-toxic compound risky.

The tool has been well received by the community, complementing other approaches. The reason why I blog about these papers now is that the tool uses the CDK for the cheminformatics parts, which I find really cool. In fact, the project has resulted in good feedback on the CDK.

I reported earlier how I uploaded the ChemPedia (RIP) data onto Kasabi. But for ChEMBL-RDF I have used the pytassium tool, not just because it has a cool name :) I discovered yesterday, however, that I did not write down in this lab notebook what steps I needed to take to reproduce it. And I just wanted to upload new triples to the ChEMBL-RDF data set on Kasabi.

I just had a conference call on one of the translational cheminformatics projects I am involved in: Bioclipse-OpenTox. A paper about this project has been submitted, and we are writing up a more practice-oriented book chapter (almost done). In writing up a use case, we ran into a recurrent problem: proper cheminformatics handling of input files. Ola suggested that we start writing more extensive documentation on what users of the CDK are supposed to do when reading a file.

I guess readers of my blog already heard about it via other channels (e.g. via Noel's blog post), but our second Blue Obelisk paper is out. In the five-ish years since Peter started this initiative, it has created a solid set of shoulders on which to develop Open Source-based cheminformatics solutions.

I learned a second trick today (see also this first); this one is about the Semantic MediaWiki (SMW). I was using a trick I learned from RDFIO before, setting Equivalent and Original URIs (though I have lost track of the difference between those). But I ran into the problem that these equivalent URIs cannot contain hashes (#), or at least not always, it seems.

After some googling, I had not found an answer, and turned to the SMW IRC channel. Saruman was helping out and pointed me to the Equivalent URI wiki page.

I normally work with fully numerical data, not categorical data. R, when using read.csv(), seems to recognize such categories and marks the column as a factor with levels. This is useful indeed. However, I wanted to make a PCA biplot of this data, so I was looking for ways to convert the categories to class numbers. After some googling, Anna and I ran into as.integer(), which can be used on the factor levels.
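
To make clear what that conversion does, here is a minimal sketch of the same idea in plain Python, not in R. It assumes the behavior R shows by default: factor levels are the sorted unique values, and as.integer() returns the 1-based position of each value in that list. The function name `factor_codes` and the example categories are made up for illustration. (In R 4.0 and later, read.csv() only creates factors when you pass stringsAsFactors = TRUE.)

```python
def factor_codes(values):
    """Mimic as.integer() on an R factor (illustrative helper, not an R API).

    Levels are the sorted unique values; each value is replaced by the
    1-based index of its level.
    """
    levels = sorted(set(values))
    index = {level: i + 1 for i, level in enumerate(levels)}
    return levels, [index[v] for v in values]


# Hypothetical categorical column, as it might come out of a CSV file.
classes = ["low", "high", "medium", "low"]
levels, codes = factor_codes(classes)
print(levels)
print(codes)
```

Keep in mind that these class numbers only reflect the alphabetical order of the levels, so the distances between them carry no meaning of their own in the PCA.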

Just in case you have not run into Henry's blog yet, check it out. His blog makes me so jealous that I did not follow up on my basic quantum chemistry education. Implementing Hartree-Fock in Fortran is not nearly as interesting or useful as the stuff he has been blogging about. A second reason you should is his brilliant use of Jmol (Henry is one of those who have been using Jmol for more than 10 years).

The market is seriously changing now. Another year, and Microsoft Windows is no longer the majority OS. Of course, my blog is very specific, and these statistics do not map well to global market shares.

Where to host chemistry data? This was the question two people asked a few weeks ago:

The Broader Chemical Community’s View of Uploading Data
Putting organic chemistry data on the web

I had these two blog posts open in my browser since about the time they were blogged, intending to reply. But I could not come up with a good answer, despite hoping to do so. For RDF-based data there are a few options now, such as Kasabi and Science 3.0.

QSAR and QSPR are the fields that statistically correlate chemical substance features with (biological) activities (QSAR) or properties (QSPR). The chemical substances can be molecular structures, drugs (which are not uncommonly mixtures), and true mixtures like nanomaterials (NanoQSAR). Readers of this blog know I have been working towards making these kinds of studies more reproducible for many years now.
This blog deals with chemblaics in the broader sense. Chemblaics (pronounced chem-bla-ics) is the science that uses computers to solve problems in chemistry, biochemistry and related fields. The big difference between chemblaics and areas such as chem(o)?informatics, chemometrics, computational chemistry, etc, is that chemblaics only uses open source software, open data, and open standards, making experimental results reproducible and validatable. And this is a big difference!