I wanted to know when a set of publications I was aggregating on CiteULike was published. The number of publications per year, for example. I did a quick Google but could not find an R package to client to the CiteULike API, and because I wanted to play with JSON in R anyway, I created a citeuliker package. Because I'm a liker of CiteULike (see these posts). Well, to me that makes sense.

citeuliker uses jsonlite, plyr, and curl (and testthat for testing). The first converts the JSON returned by the API to a R data structure. The package unfolds the "published" field, so that I can more easily plot things by year. I use this code for that:
    data[,"year"] <- laply(data[,"published"], function(x) {
      if (length(x) < 1) return(NA) else return(x[1])
    })
The laply() method comes from the plyr package. For example, if I want to see when the publications were published that I collected in my CiteULike library, I type:
    barplot(table(citeuliker::getData(user="egonw")[,"year"]))
That then looks like the plot in the top-right of this post. And, yes, I have a publication from 1777 in my library :) See the reference at the bottom of this page.

Getting all the DOIs from my library is trivial too now:
    data <- citeuliker::getData(user="egonw")
    doi <- as.vector(na.omit(data[,"doi"]))
I guess the as.vector() to remove attributes can be done more efficiently; suggestions welcome.

Now, this makes it really easy to aggregate #altmetrics, because the rOpenSci people provide the rAltmetric package, and I can simply do (continuing from the above):
    library(rAltmetric) acuna <- altmetrics(doi=dois[6]);
    acuna_data <- altmetric_data(acuna);
    plot(acuna)

And then I get something like this:


Following the tutorial, I can easily get #altmetrics for all my DOIs, and plot a histogram of my Altmetric scores (make sure you have the plyr library loaded):
    raw_metrics <- lapply(dois, function(x) altmetrics(doi = x))
    metric_data <- ldply(raw_metrics, altmetric_data
    hist(metric_data$score, main="Altmetric scores", xlab="score")
That gives me this follow distribution:


The percentile statistics are also useful to me. After all, there is a clear pressure to have impact with your research. Getting your research known is a first step there. That's why we submit abstracts for orals and posters too. Advertisement. Anyway, there is enough to be said about how useful #altmetrics are, and my main interest is in using them to see what people say about that, but I don't have time now to do anything with that (it's about time for dinner and Dr. Who).

But, as a last plot, and happy my online presence is useful for something, here a plot of the percentile of my papers in the journal it was published in and for the full Altmetric.com corpus:
    plot(
      as.vector(metric_data$context.all.pct),
      as.vector(metric_data$context.journal.pct),
      xlab="pct all", ylab="pct journal"
    )
    abline(0,1)
This is the result:


This figure shows that my social campaign puts many of my publications in the top 10. That's a start. Of course, these do not link one-to-one to citations, which are valued more by many, even though it also does not reflect well the true impact. Sadly, scientists here commonly ignore that the citation count also includes cito:disagreesWith and cito:citesAsAuthority.

Anyways... I think I need other R packages for getting citation counts from Google Scholar, Web of Science, and Scopus.

Scheele, C. W., 1777. Chemische Abhandlung von der Luft und dem Feuer.
Mietchen, D., Others, M., Anonymous, Hagedorn, G., Jan. 2015. Enabling open science: Wikidata for research. http://dx.doi.org/10.5281/zenodo.13906
0

Add a comment

Hi all, as posted about a year ago, I moved this blog to a different domain and different platform. Noting that I still have many followers on this domain (and not on my new domain, including over 300 on Feedly.com along).

This is my last post on blogger.com. At least, that is the plan. It has been a great 18 years. I like to thank the owners of blogger.com and Google later for providing this service. I am continuing the chem-bla-ics on a new domain: https://chem-bla-ics.linkedchemistry.info/

I, like so many others, struggle with choosing open infrastructure versus the freebie model. Of course, we know these things come and go. Google Reader, FriendFeed, Twitter/X (see doi:10.1038/d41586-023-02554-0).

Some days ago, I started added boiling points to Wikidata, referenced from Basic Laboratory and Industrial Chemicals (wikidata:Q22236188), David R. Lide's 'a CRC quick reference handbook' from 1993 (well, the edition I have). But Wikidata wants pressure (wikidata:P2077) info at which the boiling point (wikidata:P2102) was measured. Rightfully so. But I had not added those yet, because it slows me and can be automated with QuickStatements.

Just a quick note: I just love the level of detail Wikidata allows us to use. One of the marvels is the practices of 'named as', which can be used in statements for subject and objects. The notion and importance here is that things are referred to in different ways, and these properties allows us to link the interpretation with the source.

I am still an avid user of RSS/Atom feeds. I use Feedly daily, partly because of their easy to use app. My blog is part of Planet RDF, a blog planet. Blog planets aggregate blogs from many people around a certain topic. It's like a forum, but open, free, community driven. It's exactly what the web should be.

This blog is almost 18 years old now. I have long wanted to migrate it to a version control system and at the same time have more control over things. Markdown would be awesome. In the past year, I learned a lot about the power of Jekyll and needed to get more experienced with it to use it for more databases, like we now do for WikiPathways.

So, time to migrate this blog :) This is probably a multiyear project, so feel free to continue reading it hear.
4

The role of a university is manifold. Being a place where people can find knowledge and the track record how that knowledge was reached is often seen as part of that. Over the past decades universities outsources this role, for example to publishers. This is seeing a lot of discussion and I am happy to see that the Dutch Universities are taking back control fast now.

I am pleased to learn that the Dutch Universities start looking at rankings of a more scientific way. It is long overdue that we take scientific peer review of the indicators used in those rankings seriously, instead of hiding beyond fud around the decline of quality of research.

So, what defines the quality of a journal? Or better, of any scholarly dissemination channel? After all, some databases do better peer review than some journals.

A bit over a year ago I got introduced to Qeios when I was asked to review an article by Michie, West, and Hasting: "Creating ontological definitions for use in science" (doi:10.32388/YGIF9B.2). I wrote up my thoughts after reading the paper, and the review was posted openly online and got a DOI. Not the first platform to do this (think F1000), but it is always nice to see some publishers taking publishing seriously. Since then, I reviewed two more papers.
Text
Text
This blog deals with chemblaics in the broader sense. Chemblaics (pronounced chem-bla-ics) is the science that uses computers to solve problems in chemistry, biochemistry and related fields. The big difference between chemblaics and areas such as chem(o)?informatics, chemometrics, computational chemistry, etc, is that chemblaics only uses open source software, open data, and open standards, making experimental results reproducible and validatable. And this is a big difference!
About Me
About Me
Popular Posts
Popular Posts
  • Update 2021-02: this post is still the more read post in my blog. Welcome! Some updates: Ammar Ammar in our BiGCaT group has set up a new SP...
  • Jonathan is here with me to work on his fingerprint project. He asked about CDK modules, which we use to control dependencies, within the ...
  • Earlier this year I gave Mendeley a try, after having been a happy JabRef user, unhappy Connotea user (main problem was that any URI can ...
Pageviews past week
Pageviews past week
5046
Blog Archive
Blog Archive
Labels
Labels
Loading
Dynamic Views theme. Powered by Blogger. Report Abuse.