I don't think I have mentioned this JISC project by David Shotton et al. yet, and perhaps I should have done so earlier. But it is not too late, as Shotton is calling for help in a Nature Comment this week (doi:10.1038/502295a). Now, I have been tracking what cites the CDK literature using CiteULike since 2010, and I just asked the project developers how I can contribute this data.
The visualization from OpenCitations.net is interesting, as it also shows papers citing papers that cite the CDK:
This image shows that the corpus is still small: this CDK paper is cited more than 250 times. In the comment, Shotton writes that "[i]deally, references will come directly from publishers at the time of article publication." I do hope that publishers will soon start providing APIs to extract such data. But I would like to complement the call by inviting everyone to start annotating their old papers with this information, e.g. using CiTO and CiteULike as I did. Importantly, the authors must type their citations, something that will greatly improve the paper itself anyway.
Now, my own use case is to get an idea of how the CDK is used. The reason: people are not paying us, so I am limited to public reports that write up how they use the CDK. Direct citation is important, but I am even more interested in papers that do not cite the CDK but do cite a paper describing a tool that depends on the CDK, like PaDEL (doi:10.1002/jcc.21707), which has already been cited 73 times. Such papers are traditionally not counted as a measure of the impact of the CDK, but they surely are one. This OpenCitations.net work, combined with CiTO, allows just that.
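To make that use case concrete, here is a minimal Python sketch of how such second-order citations could be counted once citation data is openly available. Everything in it is an assumption for illustration: the paper identifiers are made up, the `cites` dictionary stands in for whatever an open citation corpus would provide, and the `tools` set would in practice have to be curated by hand or derived from typed citations. It is not how OpenCitations.net itself works.

```python
from typing import Dict, Set


def direct_citers(cites: Dict[str, Set[str]], target: str) -> Set[str]:
    """Return all papers whose reference list contains `target`."""
    return {paper for paper, refs in cites.items() if target in refs}


def indirect_only_citers(
    cites: Dict[str, Set[str]], target: str, tools: Set[str]
) -> Set[str]:
    """Return papers that cite one of `tools` but do not cite `target` directly.

    `tools` is a hypothetical, hand-curated set of papers describing
    software that depends on the target (e.g. a tool built on a library).
    """
    direct = direct_citers(cites, target)
    via_tools: Set[str] = set()
    for tool in tools:
        via_tools |= direct_citers(cites, tool)
    # Drop papers already counted as direct citers, and the tools themselves.
    return via_tools - direct - tools - {target}


# Toy citation data (all identifiers are invented): paper -> papers it cites.
cites = {
    "tool_paper": {"library_paper"},
    "study_a": {"library_paper"},
    "study_b": {"tool_paper"},
    "study_c": {"tool_paper", "library_paper"},
    "study_d": {"unrelated_paper"},
}

hidden = indirect_only_citers(cites, "library_paper", {"tool_paper"})
print(sorted(hidden))  # ['study_b']
```

In this toy data only `study_b` shows up: it uses the tool without citing the library paper, which is exactly the kind of impact that direct citation counts miss.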
D. Shotton (2013). Publishing: Open citations. Nature, 502 (7471), 295-297. doi:10.1038/502295a