Nanosafety data for silver nanoparticles in data.enanomapper.net visualized with ambit.js and d3.js
The last three weeks featured two meetings around data infrastructures for the NanoSafety Cluster. The first meeting was on January 25-26 in Brussels, and last week the eNanoMapper project held its second year meeting with a subsequent workshop in Basel (see the program with links to course material). Here are some personal reflections on these meetings, and some source code updates based particularly on the latter workshop.

For the workshop in Basel I extended previous work on JavaScript and R client code for the eNanoMapper API (which I wrote about previously; see also doi:10.3762/bjnano.6.165).

JavaScript
Nothing much changed for ambit.js (see these two posts); I only added a method to search nanomaterials based on chemistry rather than, as before, on names (release 0.0.3 is pending). That is, given a compound URI, you can now list all substances with that compound, using the listForCompound() function:

// callback that handles the list of substances returned by the service
// (the exact response format depends on the ambit.js version)
function processList(substances) {
  console.log(substances);
}

var searcher = new Ambit.Substance(
  "https://apps.ideaconsult.net/enanomapper"
);

var compound =
  "https://apps.ideaconsult.net/enanomapper/compound/71/conformer/71";
searcher.listForCompound(compound, processList);

You may wonder how to get the compound URI used in this code. Indeed, one should look it up rather than hardcode it, as it may differ between eNanoMapper data warehouse instances. This is where another corner of the eNanoMapper API comes in, which is wrapped by the Compound.search() method. However, I have to play more with this method before I encourage you to use it. For example, this method returns a list of compounds matching the search. So, how do we search for fullerene particles?
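
For completeness, here is a minimal, hypothetical sketch of how such a lookup could look. It assumes Compound.search() follows the same service-constructor and (query, callback) pattern as the other ambit.js calls used above; check the ambit.js source for the actual signature before relying on it:

// hypothetical sketch: the constructor and search() signature are assumptions
var compoundSearcher = new Ambit.Compound(
  "https://apps.ideaconsult.net/enanomapper"
);

compoundSearcher.search("fullerene", function(compounds) {
  // "compounds" should list the matching compound URIs; pick the one you
  // need and pass it to Substance.listForCompound() as shown above
  console.log(compounds);
});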

R package
The renm package for R was demonstrated in Basel too; some 25% of the audience uses R in their research. The README.md has some updated examples, including this one to list all nanomaterials from a PNAS paper (doi:10.1073/pnas.0802878105):

library(renm)
substances <- listSubstances(
    service="http://data.enanomapper.net/",
    search="10.1073/pnas.0802878105", type="citation"
)

The 0.0.3 release, made just in time for the workshop, fixed a few minor issues. The above JavaScript example cannot be repeated in R yet, but this is scheduled for the next release.

Data quality
For a few materials I have now created summary pages. These should really be considered demonstrations of what a database with an API has to offer, but it seems that for some materials we are slowly moving towards critical mass. Better, they show nicely what advantages data integration has: the data for the silver materials comes from three different data sources, aggregated in the data.enanomapper.net instance. Moreover, if you look at the code above, it is easy to see how it could pull in data from multiple instances; a sketch follows below the figure. For example, here are LDH release assay results for two of the JRC Representative Materials:

[Figure: LDH release assay results for two of the JRC Representative Materials.]
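
To make the multi-instance point concrete, here is a minimal sketch that issues the same listForCompound() query against the two data warehouse instances used in this post and collects the answers in one array. The second compound URI is a placeholder, since compound identifiers are instance-specific and would need to be resolved against that instance first:

// the two eNanoMapper data warehouse instances used elsewhere in this post;
// the second compound URI is a placeholder, not a real identifier
var queries = [
  { service: "https://apps.ideaconsult.net/enanomapper",
    compound: "https://apps.ideaconsult.net/enanomapper/compound/71/conformer/71" },
  { service: "http://data.enanomapper.net",
    compound: "http://data.enanomapper.net/compound/..." }
];

var aggregated = []; // merged per-instance results

queries.forEach(function(query) {
  var searcher = new Ambit.Substance(query.service);
  searcher.listForCompound(query.compound, function(substances) {
    // collect the results; once all instances have answered, the
    // aggregated array can be handed to the d3.js plotting code
    aggregated.push({ service: query.service, substances: substances });
  });
});
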
This, of course, takes advantage of the common language that the eNanoMapper ontology provides (doi:10.1186/s13326-015-0005-5). This ontology is now available from BioPortal, Aber-OWL, and the Ontology Lookup Service (via their great beta). Huge thanks to these projects for their work on making ontologies accessible!

But there is a long way to go. Many people in Europe and the U.S.A. are working on the many aspects of data quality. I would not say that all data we aggregated so far is of high quality; rather, it somewhat depends on the use case. The NanoWiki data that I have been aggregating (see release 2 on Figshare, doi:10.6084/m9.figshare.2075347.v1) has several goals, and its quality varies depending on the goal. For example, one goal is to index nanosafety research (e.g. give me all bioassays for TiO2), in which case it is left to the user to read the discovered literature. Another goal is to serve NanoQSAR work, where I focused on accurately describing the chemistry, but have varying levels of detail on the bioassays (e.g. is there a size dependency for cytotoxicity?).

There is a lot of discussion on data quality, as there was two years ago. I am personally of the opinion that eNanoMapper cannot solve the question of data quality. That ultimately depends on the projects recording and disseminating the data. Instead, eNanoMapper (like any other database) is just the messenger. In fact, the more people complain about the data quality, the better the system has managed to communicate the lack of detail. Of course, it is critical to compare this to the current situation, publication in journals, and it seems to me we are well on our way to improving over dissemination of data via journal articles.

Basel
Oh, and the view from my room in the Merian Hotel was brilliant!

