Sunday, August 07, 2011


Usability. I am not an expert in Human-Computer Interaction (HCI) at all. Worse, I typically make the crappiest-looking interfaces. So, that's said. Usability. Wikipedia writes that "[U]sability is the ease of use and learnability of a human-made object."

A cheminformatician is, despite doing cool science, per popular demand by peer scientists, also an HCI expert to at least some extent. Scientists want usability. It is merely an extension of any scientist being a Human-Paper Interaction (HPI) expert to some extent (you know, getting the bibliography properly typeset).

Now, what is usability? What does someone mean when they say your system has a 'usability issue'? Having to answer that makes any cheminformatician some sort of HCI expert. I have had usability discussions many more times than I personally care about. Too often these discussions are held without defining who the users actually are. Are they chemists/biologists for whom Excel is the supreme data analysis tool, or statisticians who work with Matlab or R, or are they hackers (like Pierre or Neil perhaps) who just want to get their work done?

Taverna and KNIME primarily target users who think visually and who like to see what happens with their data. Jmol users do not even want to see what happens to their data (file reading, etc), and only care about seeing it in nice colors. The Chemistry Development Kit, on the other hand, is targeted at hackers who know and want to know in detail what they are doing and what is going on.

Importantly, the last paragraph talks about the most visible part of usability: ease of use. In particular, ease of use for humans. However, as readers of my blog know, there are more than humans: there is software too, and programs are users of a system as well. Here the ease of use is defined by the Application Programming Interface, or API.

So, any system is oriented at multiple user types. And each user type will have its own set of requirements. So, in a requirement analysis process, you identify the user types and associate requirements with those. Now, my software engineering book is hidden in some box, and I can therefore not cite good-practice standards right now, but the bottom line is that talking about usability without a set of project-defined user types is difficult, and may in fact result in heated discussion, where people probably want the same thing, but just are not aligned, resulting in confusion of priorities. (This sounds wise but I get fooled each meeting again myself.)

Targeting more than one user type doubles the effort. Yet, in science this is important. Particularly for large projects where a lot of user types are expected to interact anyway: project managers, bench chemists/biologists, statisticians, data warehouses, etc. An agreement on which users are being targeted is core to the analysis. Bioclipse is an example of software where multiple user types are targeted: the visually oriented human (who will use the graphical user interface (GUI), like the Bioclipse-OpenTox one), and people who want full control (and use a scripting language).

Once the user types are defined, we can start thinking about data flow and how to model that. It is important here to find common ground and to make sure the underlying technologies are the same. That requires your design to be expressed in layers that build on top of each other (e.g. as done in the TCP/IP and OSI network stacks). Multiple applications oriented at multiple user types must use the same lower layers. Some initial agreement about what such a layered approach looks like for your project is important too.

Now, we're not done yet. There is the learnability aspect of usability. That is often neglected, and the discussion often only focuses on the ease of use. Bioclipse is based on Eclipse, which has several approaches for learnability, one of which we adopted in Bioclipse: cheat sheets (I think a great Open Standard!). They talk the user through a particular process, but at the same time link tightly to the software, and they can even make things happen in the software by running certain actions. This way, a cheat sheet teaches users their way around the design.

I personally like scripting very much, hacker that I am. Just because of the learnability aspect of HCI. Scripts are not for everyone, but for those who know a bit about programming, scripts are a perfect tool to teach others about how your product works. This is why projects like MyExperiment exist: to share scripts (and workflows of course, but those are just graphical scripts). They are explicit, show what is happening, etc, and thus are the most informative means to get your message across. This is why my Groovy Cheminformatics book is full of scripts too. For GUIs, screencasts serve pretty much the same role, but are much less interactive: you cannot pause a screencast just to see what happens if you hit that other button at that exact moment, limiting the learnability of the solution.

As a final note, I will briefly return to Bioclipse, Jmol and layers. What Bioclipse and Jmol have in common is that they have a two-layer design (well, maybe more, but for the current argument I want to focus on two layers). The lower layer defines an API on top of which two applications are developed, both using the exact same underlying API: a GUI and a scripting language. In both Bioclipse and Jmol, all GUI functionality (or 90% at least) is expressed in terms of API calls. How that technically works is a whole other story, but the developers of Bioclipse and Jmol decided that was a smart thing to do. In fact, neither project started out with this approach; both changed the design later. The point here is that any new project should take advantage of that experience and express from the start:

  1. what are the targeted user types
  2. what is the layered model that is going to be used, to allow targeting all user types
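
To make the two-layer idea a bit more concrete, here is a minimal sketch in Python (not the Java that Bioclipse and Jmol are written in). All names in it (`CoreApi`, `ScriptFrontEnd`, `GuiFrontEnd`, the `load` and `count` commands) are made up for illustration and are not the API of either project. It only shows the shape of the design: one lower layer that does the work, and two front ends for two user types that do nothing but call it.

```python
class CoreApi:
    """Lower layer: the only place where real work happens (hypothetical API)."""

    def __init__(self):
        self.molecules = {}  # molecule name -> list of element symbols

    def load(self, name, atoms):
        self.molecules[name] = list(atoms)
        return name

    def count_atoms(self, name):
        return len(self.molecules[name])


class ScriptFrontEnd:
    """Upper layer for users who want full control: one text command per line."""

    def __init__(self, api):
        self.api = api

    def run(self, line):
        command, *args = line.split()
        if command == "load":
            return self.api.load(args[0], args[1:])
        if command == "count":
            return self.api.count_atoms(args[0])
        raise ValueError(f"unknown command: {command}")


class GuiFrontEnd:
    """Upper layer for visual users: every button handler is just an API call."""

    def __init__(self, api):
        self.api = api
        self.selected = None  # the molecule currently shown in the (imaginary) window

    def click_open(self, name, atoms):
        self.selected = self.api.load(name, atoms)

    def click_count(self):
        return self.api.count_atoms(self.selected)


# Both front ends share one API instance, so they see the same data.
api = CoreApi()
script = ScriptFrontEnd(api)
gui = GuiFrontEnd(api)

gui.click_open("methane", ["C", "H", "H", "H", "H"])
atoms_via_script = script.run("count methane")
atoms_via_gui = gui.click_count()
```

Because the GUI handlers contain no logic of their own, anything a user can click can also be scripted, and a script doubles as documentation of what the button does.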


  1. Have been working for the industry for a while and definitely can say: make TWO interfaces for modeling and analysis:
    1. A simple and neat interface and workflow where the user can paste and click with ease (default parameters needed). It's very complex to make such an interface (workflow), because different users have different experience, but there is always an average user who wants to upload data and receive a simple answer: yes or no (the more picturesque the answer, the better).
    2. There are 20% (Pareto principle) of the users who want to do more complicated analysis, modify parameters, etc. Provide them an interface underneath the simple paste-and-click interface, behind an 'Advanced parameters' button.

  2. Hi Vladimir, thanx for sharing your experience!

    Two interfaces sounds like a perfectly sane approach. As an elaboration, not critique, I observe that interfaces can also support both user types in one go. This is why I like HTML+RDFa so much. In my blog there are a few examples where this is used. The HTML with its full richness (CSS, JavaScript, etc) allows making a simple and neat interface, while the RDFa ensures access to the raw numbers within the same file.

    This is the approach I am using in the Cheminformatics Classics project (which indeed can use some further work :).

    The same kind of approach is visible in the Scholarly HTML:

    Depending on your exact use case, a single interface or separate interfaces may be preferable.