The data, it turns out, is really hard to come by. While I was adding data to the database for best-selling drugs, it was hard to find publications where a human experiment was done (many experiments use rat microsomes). Not only does that make it hard to identify the specific CYP enzyme, it is also not the human homologue. BTW, since the background of this paper is to create a knowledge base for computational prediction of CYP metabolism, ideally we would even have a specific protein sequence, including any missense SNPs affecting the 3D structure of the enzyme.
However, even for aripiprazole, (at least then) the best-selling drug, literature was really hard to find! A lot of the literature just copies and pastes knowledge from other papers, and those other "papers" may in fact be the information sheet you get when you buy the actual drug. Alternatively, personal communications and conference posters are sometimes cited as primary literature too. All of this only stresses the importance of a database like this.
At this moment the project is stalled. None of the currently involved groups has funding for continued development. I guess collaborations are welcome! ChEMBL 22 now has metabolism data for compounds, but I have not yet explored whether it has all the details of the transformations needed for XMetDB. At the very least, it may serve as a source of primary literature references.
