Friday, October 03, 2014

Jenkins-CI: automating lab processes

Our group organizes public Science Cafes where people from Maastricht University can see the research it is involved in. Yesterday it was my turn again, and I gave a presentation showing the BiGCaT and eNanoMapper Jenkins-CI installations (set up by Nuno) which I have been using for a variety of processes which Jenkins conveniently runs based on input it gets.

For example, I have it compile and run test suits for a variety of software projects (like the CDK, NanoJava), but also have it build R packages, and even daily run Andra Waagmeester's code to create RDF for WikiPathways. And the use of Jenkins-CI is not limited to dry lab processes: Ioannis Moutsatsos recently showed nice work at Novartis that uses Jenkins for high-throughput screening and data/image analysis.

Thursday, September 25, 2014

Slides at the Open PHACTS community workshop (June 26)

First MSP graduates.
It seems had not posted my slides yet of the presentation at the 6th Open PHACTS community workshop. At this meeting I gave an overview of the Programming in the Life Sciences course we give to 2nd and 3rd year students of the Maastricht Science Programme (MSP; some participants graduated this summer, see the photo on the right side).

This course will again be given this year, starting in about a month from now, and I am looking forward to all the cool apps the students come up with! Given that the Open PHACTS API has been extended with pathways and disease information, they will likely be even cooler than last year.

OpenTox Europe 2014 presentation: "Open PHACTS: solutions and the foundation"

CC-BY 2.0 by Dmitry Valberg.
Where the OpenTox Europe 2013 presentation focused on the technical layers of Open PHACTS, this presentation addressed a key knowledge management solution to scientific questions and the Open PHACTS Foundation. I stress here too, as in the slides, that the presentation is on behalf of the full consortium!

For the knowledge management, I think Open PHACTS did really interested work in the field of "identity" and am happy to have been involved in this [Brenninkmeijer2012]. The platform implementation is, furthermore, based on the BridgeDb platform, that originated in our group [VanIersel2010]. The slides outline the scientific issues addressed by this solution:

PS, many more Open PHACTS presentations are found here.

Brenninkmeijer, C. et al. Scientific lenses over linked data: An approach to support task specific views of the data. a vision. In Linked Science 2012 - Tackling Big Data (2012). URL

van Iersel, M. et al. The BridgeDb framework: standardized access to gene, protein and metabolite identifier mapping services. BMC Bioinformatics 11, 5+ (2010). URL

Tuesday, September 16, 2014

Do a postdoc with eNanoMapper

CC-BY-SA from Zherebetskyy @ WP.
Details will still have to follow as they are being worked out, but with Cristian Munteanu having accepted an associate professorship, I need a new postdoc to fill his place, and I am reopening the position I had almost a year ago. Do you like to works in a systems biology group (BiGCaT), are pro Open Science, like to work on tools for safe-by-design nanomaterials, and have skills in one or more of bioinformatics, chemoinformatics, statistics, coding, ontologies, then this position may be something for you.

The primary project for this position is eNanoMapper and you will be working within the large European NanoSafety Cluster network, though interactions are not limited to the EU.

If you have interest and cannot wait until the details of the position come out, please send me an email. first.lastname @ General questions about eNanoMapper and our BiGCaT solutions for nanosafety are also welcome in the comments.

Sunday, September 14, 2014

CDK: Element and Isotope information

When reading files the format in one way or another has implicit information you may need for some algorithms. Element and isotope information is a key example. Typically, the element symbol is provided in the file, but not the mass number or isotope implied. You would need to read the format specification what properties are implicitly meant. The idea here is that information about elements and isotopes is pretty standardized by other organizations such as the IUPAC. Such default element and isotope properties are exposed in the CDK by the classes Elements and Isotopes. I am extending my Groovy Cheminformatics with the CDK with these bits.

The Elements class provides information about the element's atomic number, symbol, periodic table group and period, covalent radius and Van der Waals radius, and Pauling electronegativity (Groovy code):

Elements lithium = Elements.Lithium
println "atomic number: " + lithium.number()
println "symbol: " + lithium.symbol()
println "periodic group: " +
println "periodic period: " + lithium.period()
println "covalent radius: " + lithium.covalentRadius()
println "Vanderwaals radius: " + lithium.vdwRadius()
println "electronegativity: " + lithium.electronegativity()

For example, for lithium this gives:

atomic number: 3
symbol: Li
periodic group: 1
periodic period: 2
covalent radius: 1.34
Vanderwaals radius: 2.2
electronegativity: 0.98

Similarly, there is the Isotopes class to help you look up isotope information. For example, you can get all isotopes for an element or just the major isotope:

isofac = Isotopes.getInstance();
isotopes = isofac.getIsotopes("H");
majorIsotope = isofac.getMajorIsotope("H")
for (isotope in isotopes) {
  print "${isotope.massNumber}${isotope.symbol}: " +
    "${isotope.exactMass} ${isotope.naturalAbundance}%"
  if (majorIsotope.massNumber == isotope.massNumber)
    print " (major isotope)"
  println ""

For hydrogen this gives:

1H: 1.007825032 99.9885% (major isotope)
2H: 2.014101778 0.0115%
3H: 3.016049278 0.0%
4H: 4.02781 0.0%
5H: 5.03531 0.0%
6H: 6.04494 0.0%
7H: 7.05275 0.0%

This class is also used by the getMajorIsotopeMass() method in the MolecularFormulaManipulator class to calculate the monoisotopic mass of a molecule:

molFormula = MolecularFormulaManipulator
println "Monoisotopic mass: " +

The output for ethanol looks like:

Monoisotopic mass: 46.041864812