Monday, June 27, 2011
AMBIT's SMILES depict service
We all know Daylight's Depict service, right? But did you also know the AMBIT version (doi:10.1186/1758-2946-3-18, by IdeaConsult Ltd), which also uses the CDK (Open Source) and Xemistry's CACTVS toolkit (free for academic use)?
Try this: c1ccc2c(c1)CC4NCCc3cccc2c34.
git reflog: so, what did just happen?!
Chris Aniszczyk made me aware of git reflog. I am not sure I grasp the full power yet, but I already like the fact that it shows me what I just did. For example, this is my latest output:
98e4cae HEAD@{0}: commit: Added copyright owner line for previous patch
280381d HEAD@{1}: origin/cdk-1.4.x: updating HEAD
bce19a0 HEAD@{2}: am: Fixed potential NPE.
280381d HEAD@{3}: commit (amend): Fixed potential NPE.
534dc09 HEAD@{4}: am: Fixed potential NPE.
45b67e5 HEAD@{5}: origin/cdk-1.4.x: updating HEAD
1fa4bb9 HEAD@{6}: commit: Fixed potential NPE.
cf399c8 HEAD@{7}: commit: Fixed potential NPE.
45b67e5 HEAD@{8}: HEAD~1: updating HEAD
5f8ecf9 HEAD@{9}: am: Fixed potential NPE.
What I did in these steps was to apply part of a bug fix patch by Dmitry (EPO, The Hague). In fact, the patch fixed two separate problems. One part fixed a NullPointerException in code by me, and that code looks fine; the other part is in code not written by me, and I cannot foresee the consequences of that part of the patch. A unit test would help, and so would a review by the original author.
So, the goal was to apply only part of the patch. I downloaded Dmitry's patch from GitHub (see GitHub Tip: download commits as patches, and the hash in the bug report). Then I applied it to my local repository (5f8ecf9, see the above list). I undid the commit with 'git reset HEAD~1' (45b67e5), which leaves the changes from the patch unstaged. Then I committed the two patch parts separately (cf399c8 and 1fa4bb9). I then rebased on origin/cdk-1.4.x to ensure my cdk-1.4.x branch is up to date (45b67e5, and 534dc09 I guess). Then I signed off the part of the patch that I can review (280381d, and bce19a0?), and finally I updated from the upstream repository once more (280381d) and then applied a patch to add Dmitry in the Copyright header as co-author of this class (98e4cae, see also Making patches; Attribution; Copyright and License.).
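The split-a-patch part of these steps can be tried out safely in a throwaway repository. What follows is only a minimal sketch, not the actual CDK history: the file names A.java and B.java, the user name, and the commit messages are made up for illustration, and a plain commit stands in for 'git am' so that no patch file is needed.

```shell
#!/bin/sh
set -e

# Throwaway repository standing in for a real checkout (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example Reviewer"
git config user.email "reviewer@example.org"

# A starting point with two unrelated classes.
echo "class A {}" > A.java
echo "class B {}" > B.java
git add A.java B.java
git commit -q -m "Initial state"

# Stand-in for 'git am': one commit that fixes two separate problems.
echo "// null check" >> A.java
echo "// unrelated change" >> B.java
git commit -q -a -m "Fixed potential NPE."

# Undo that commit, but keep its changes in the working tree, unstaged.
git reset -q HEAD~1

# Commit the two parts separately, so each can be reviewed on its own.
git add A.java
git commit -q -m "Fixed potential NPE (part in A)"
git add B.java
git commit -q -m "Fixed potential NPE (part in B)"

# The reflog lists every step that moved HEAD, newest first.
git reflog
commits=$(git rev-list --count HEAD)
```

When both parts of a patch touch the same file, 'git add -p' lets you stage individual hunks instead of whole files, and 'git commit --amend -s' adds a Signed-off-by line to the latest commit.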
Now, apparently, you can edit this log too... but I am not sure why you would go about doing that, nor what effect that will have on the commit history...
Sunday, June 26, 2011
Recover experimental data from a heat map?
I asked this question on BioStar too, and people were kind enough to point out that, ideally, I could just email the corresponding author and get the data. Ideally, the data would have been available from the supplementary information anyway. It was also pointed out that text-mining is inaccurate, and that I would not know the units and/or transformations. Obviously, my bad for not adding these things to the question; they felt beside the point.
Well, I guess it is a win for science that people think Open Data is the norm already :)
Saturday, June 25, 2011
From the archives: my ICCS 2005 poster
Julio and Gert placed their ICCS 2011 work online, and today I was going through old CDs (see From the archives: Chemical Web, and the CDK in 2004 and Chiral Molecules: how cool is the SEM picture?). I also ran into my ICCS 2005 poster, and because that too was before I started blogging, I never posted it online. So, here it is, based on my thesis:
Chiral Molecules: how cool is the SEM picture?
I just found my student thesis in Organic Chemistry from my Nijmegen education. It's in Dutch, but I'll explore whether I can upload it to Radboud University's DSpace. But I could not resist sharing this nice scanning electron microscope picture :) Look at how those amphiphiles form a nice chiral ribbon!
This disk also has quite a few raw spectra (as TIFF images). I'll try to figure out what to do with those. Uploading them as Open Data to ChemSpider is tempting, but I want to make sure people can easily download the collection too (read: programmatically).
From the archives: Chemical Web, and the CDK in 2004
I am working my way through an enormous pile of CDs, DVDs, both a mix of RO and RW disks, selecting those I will throw away, like old GParted, KNOPPIX and Debian install disks, as well as a few legal copies of old Microsoft software (like a Win98 boot disk). I also found a disk with two presentations I gave in 2004. They are fun to read. The one I gave at ExemplarChem is a bit sad, as I presented stuff there I developed even before 2004, which is still not common ground today :/
Also note the mention of DADML, something I did for the Woordenboek Organische Chemie, to standardize access to remote databases... well, let's hope I can find my Qiwi presentation in Washington in 2000 too. Damn... I keep amazing myself. [/sarcasm].
BTW, I also passed a CD with quite a bit of software that was around in the late nineties. Quite a few interesting things. A shame I cannot share this, because it was not Open Source :/
Sunday, June 19, 2011
CDK 1.4.0 release blockers
OK, after an hour or two of happily browsing through our SourceForge bug tracker and the Nightly reports, I identified 20 release blockers that I would like to see fixed before the CDK 1.4.0 release. Some are easy ones, but I may find one or two more later on. Others may report new problems too. But overall: doable.
CDK 1.3.12: the changes
I have uploaded CDK 1.3.12 to SourceForge. This is an important milestone release, as it contains the last bit of CDK-JChemPaint code to render molecules. Now for real :) It also contains the new volume descriptor.
The milestone bit is that, with the addition of this extra bit of CDK-JChemPaint, the CDK 1.4.x series is now feature complete, moving it into freeze mode. For development this means two things (in short): 1. no API changes are allowed, and 2. new functionality requires double reviewing.
In practice, this freeze means that over the next weeks we'll mostly see small clean up work, partly in the build system, partly in JavaDoc fixes, etc, and hopefully a few bug fixes for the open list of bug reports. It also means that continued development of the CDK 1.2.x series has come to a stop, and that we will likely see a CDK 1.5.0 release soon too, initiating the next development cycle.
What will CDK 1.6 bring? Hopefully, the CDK-JChemPaint editing functionality, perhaps alternative aromaticity models (see the cdk-devel mailing list), and hopefully more code from CDK-based products like AMBIT, PaDEL, ScaffoldHunter, and Craft. The challenge here is to find those patches and then port them back into the main CDK library.
CDK Module dependencies #3
Jonathan is here with me to work on his fingerprint project. He asked about CDK modules, which we use to control dependencies, both within the CDK and from the CDK on third-party libraries. I previously wrote about this in:
- Parallel building the CDK
- Maintaining the JChemPaint-Primary patch
- CDK Module dependencies #2
- UML diagram of CDK module dependencies
The last overview of CDK module dependencies is a bit outdated. It is easy to recreate from the source code repository, using BeanShell and Graphviz with something like:
$ export CLASSPATH=jar/jgrapht-0.6.0.jar
$ bsh tools/deptodot.bsh --cdkLibs > cdk.dot
$ dot -Tpng -O cdk.dot
The current master gives this diagram:
(The sinchi module no longer exists, but clearly is still picked up from somewhere :)
It is also worth noting how this modularization is defined. We use JavaDoc for this, and in particular by adding a @cdk.module tag to the class JavaDoc, which is explained in this CDK News paper.
Friday, June 17, 2011
Fast Calculation of van der Waals Volume as a Sum of Atomic and Bond Contributions
I was recently asked about a volume descriptor in Bioclipse, which is not yet available. Jmol can calculate surfaces, so that was my first thought. However, I then ran into a paper from 2003 by Zhao, called Fast Calculation of van der Waals Volume as a Sum of Atomic and Bond Contributions and Its Application to Drug Compounds (doi:10.1021/jo034808o).
The paper presents a very simple mathematical model, which approximates the molecular volume by a sum of atomic contributions plus three correction terms: one that corrects for atom-atom overlap via the number of bonds, and two based on the number of aromatic and non-aromatic rings. The paper is clearly written, and the mathematics simple.
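Written symbolically, the model described in this paragraph has roughly the following shape. This is only a sketch: the symbol names are mine, the coefficients are deliberately left as symbols rather than numbers, and subtracting all three corrections (with positive coefficients) is an assumption about the sign convention.

```latex
% V_i    : contribution of atom i (depends on its element)
% N_B    : number of bonds
% R_A    : number of aromatic rings
% R_{NR} : number of non-aromatic rings
% c_B, c_A, c_{NR} : fitted correction coefficients (assumed positive here)
V_{\mathrm{vdW}} = \sum_{i \in \text{atoms}} V_i \;-\; c_B\, N_B \;-\; c_A\, R_A \;-\; c_{NR}\, R_{NR}
```

Because every term is a simple count multiplied by a constant, the descriptor needs no 3D coordinates, which is what makes it fast.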
One problem with the publication, though, is the numbers in the main text: they are wrong. I started off using the coefficients of the equations presented in the paper, but very soon ran into problems when I was writing unit tests based on the volumes for compounds given as examples. In fact, the numbers in the main text are internally inconsistent. Not good. I believe it is partly caused by rounding, but that does not fully account for the differences.
Fortunately, the Excel sheet in the supplementary information has the exact numbers, and those are numerically consistent.
The paper has been cited 46 times now, so, a fast volume descriptor seems relevant indeed. I am not sure how fast it will propagate to Bioclipse, as I do not have time soon to update the CDK version of Bioclipse (the major part of which is to ensure the Bioclipse-JChemPaint editor does not get broken, again).
Another thought about this paper is that it uses the evil aromaticity concept, and the authors forgot to mention when they consider a ring to be aromatic.
Zhao, Y., Abraham, M., & Zissimos, A. (2003). Fast Calculation of van der Waals Volume as a Sum of Atomic and Bond Contributions and Its Application to Drug Compounds. The Journal of Organic Chemistry, 68 (19), 7368-7373. DOI: 10.1021/jo034808o
Tuesday, June 14, 2011
Importing Nanotoxicity Data with SPARQL into R for analysis
Not so long ago I wrote about [i]mporting RDF input in R for analysis. I am collecting nanotoxicology data in a Semantic MediaWiki with the RDFIO extension installed (by Samuel), allowing me to SPARQL that data directly from R. There is nothing much structural to visualize at this moment, so I'm skipping the Bioclipse intermediate. I did show some visualization of the data itself in the wiki, earlier this week.
Anyway, release 1.2 of rrdf is on its way, adding a sparql.remote method for running SPARQL queries at remote repositories. It also has a patch by Ryan Kohl, to support CONSTRUCT-like SPARQL queries.
I haven't aligned my wiki with any ontology yet, so the properties have SMW-like resource form, which makes the SPARQL a bit weird looking. Other than that, the code to pull in nanotoxicology data from my data notebook now looks like:
library(rrdf)
endpoint = "http://127.0.0.1/mediawiki/index.php/Special:SPARQLEndpoint"
query = paste("PREFIX w: ",
"SELECT ?min ?max ?zeta WHERE ",
"{ ?inst a w:Category-3AMetalOxides . ",
" OPTIONAL { ?inst w:Property-3AHas_Size_Min ?min . }",
" OPTIONAL { ?inst w:Property-3AHas_Size_Max ?max . }",
" OPTIONAL { ?inst w:Property-3AHas_Zeta_potential ?zeta . }",
"}"
);
data = sparql.remote(endpoint, query)
Which results in a data matrix that looks like (mind you, this matrix is numeric, needing a bit of rrdf 1.3 functionality):
> data
      min max  zeta
 [1,]  15  90    NA
 [2,]  15  90    NA
 [3,]  15  90    NA
 [4,]  15  90    NA
 [5,]  15  90    NA
 [6,]  15  90    NA
 [7,]  15  90    NA
 [8,]  15  90    NA
 [9,]  15  90    NA
[10,]  15  90    NA
[11,]  15  90    NA
[12,]  15  90    NA
[13,]  10 100  34.2
[14,]  30  60 -17.3
[15,]  20  30   1.8
[16,]  15  90    NA
So, now it is time for some PCA.
Sunday, June 12, 2011
CDK-JChemPaint #7: rendering molecules as SVG
A very long time ago, Scalable Vector Graphics promised to revolutionize images on the web. After initial cool work (including CMLSnap: animated chemical reactions by Peter's group!), things cooled down. There was simply a lack of support in browsers. Things have changed. SVG is much better supported now, and people are starting to use SVG again. Like Noel, visualizing 100 molecules in one blog post.
The new CDK-JChemPaint code has been refactored such that the original code for the core functionality is now independent from the drawing toolkit. And we have two well-developed implementations, one for Swing/AWT (used by the JChemPaint applet), and one for SWT (used by Bioclipse). And there is one that generates SVG too, written by Gilleain as a proof of principle.
The code is almost identical to the code for rendering molecules as PNG. We just swap the AWTDrawVisitor for the SVGGenerator:
-renderer.paint(triazole, new AWTDrawVisitor(g2));
+svgGenerator = new SVGGenerator();
+renderer.paint(triazole, svgGenerator);
Additionally, we need to change how we output the results. The code below generates the SVG and the matching HTML snippet:
new File("triazole.svg").append(svgGenerator.getResult())
file = new PrintWriter(new FileWriter(new File("triazole.html")))
file.println("<html>");
file.println("<body>");
file.print("<embed width=\"100\" height=\"100\" src=\"triazole.svg\" />");
file.println("</body>");
file.println("</html>");
file.close()
The result looks like:
Saturday, June 11, 2011
CDK 1.3.11: the changes, the authors, and the reviewers
- Hej, wait you sneaky bastard! You just released 1.3.10! You can overdo release often too, you know!
- CDK-JChemPaint #1: rendering molecules
- CDK-JChemPaint #2: rendering reactions (not part of renderbasic!)
- CDK-JChemPaint #3: rendering parameters (how to change the looks)
- CDK-JChemPaint #4: embedding the renderer in a Swing panel
- CDK-JChemPaint #5: the Groovy-JChemPaint repository (described where to find the latest examples)
- CDK-JChemPaint #6: rendering atom numbers
IMPORTANT! While important functionality got included, not everything is there yet. In particular, for Swing/AWT support, we still need to include the renderawt module. That one, too, needs some further work. It misses a few unit tests, needs a bit more JavaDoc, and bits of code clean up. In short, you cannot yet use all functionality with CDK 1.3.11 alone. I should have communicated that more clearly. My apologies.
The Authors
This is the result of hard work from Niels Out, Stefan Kuhn, Arvid Berg, Mark Rijnbeek, Gilleain Torrance and me (and as such, a joint project between the groups in Uppsala, at the EBI, and myself). There are also occasional contributions by others that are not to be forgotten!
The Reviewer
Many thanx to Rajarshi for reviewing the patch, giving good comments, and approving it in the end, despite a few remaining shortcomings. (Bug reports welcome! :)
CDK 1.3.10: the changes, the authors, and the reviewers
Release 1.3.10 is not much different from 1.3.9, as we are seriously converging towards CDK 1.4.0 now, with only the big CDK-JChemPaint renderbasic patch waiting. I'm hoping to merge that in this weekend, and to release the first Release Candidate then. The 2nd edition of the Groovy Cheminformatics book, in fact, is already based on 1.3.10, which I released a few days ago. This release has the following changes, mostly contained bug fixes in the 1.2 series:
- Fixed dependency on specific molecule impl, so now we use IAtomContainer rather than Molecule fbdb989
- Added a convenience test to see if a parameter has been registered to the model ad04b69
- Updated JavaDoc checking to OpenJavaDocCheck 0.8 7bfb28b
- Copy data files into the right folder of the puredist ae69740
- Added a test to demonstrate the ClassPathException in bug #3305581 d8c4ca6
- Test if the descriptor results are the same for two implementations (unfortunately, the nonotify IMolecule extends the data IMolecule, so it does not catch bug #3305581) 0f3f147
- Factored out a method to test whether two descriptor calculations give the same results 861c304
- Parameterized method to create water to allow alternative implementations 4d7c6b5
- Updated code to avoid dependency on specific implementation of molecule object. Now use IAtomContainer 7cc943a
- Added detection of a Te atom type, found in ChEMBL d114114
- Added unit tests for Te.3 atom type detection aa79a3f
- Added note about required atom type perception 54d5ff9
- Added unit tests to calculate tautomers from a handcrafted IAtomContainer 310cd0c
- Fixed OJDCheck validation 67d76d0
The Authors
13 Egon Willighagen
2 Rajarshi Guha
1 Onkar Shinde
The Reviewers
5 Rajarshi Guha
2 Jonathan Alvarsson
2 Mark Rynbeek
2 Egon Willighagen
Groovy Cheminformatics 2nd edition
Update: the fourth edition is out.
OK, I wrapped up the content, mostly finalized last week, after making a few small changes, created a new cover (rather than using a Lulu template), and uploaded things to Lulu:
New content includes:
- Section 2.3.3: Molecular Formula
- Section 2.6: IRings
- Section 7.3: Graph matrices
- Chapter 10: Molecular properties (mass, TPSA, XLogP)
- Chapter 11: InChI
- Section 15.2: CDK 1.0 to 1.2
- Appendix A: Atom Type Lists
- Appendix B: CDK Authors
The full Table of Contents is available as 'preview' on this Lulu page.
Friday, June 10, 2011
Assessed if I could recommend the Mendeley plugin for OpenOffice
Our institute recently started using a Mendeley group for its publications. A lot of my colleagues are using Word and EndNote. I use neither. My personal workflow includes LaTeX, BibTeX, and since recently BibLaTeX, and CiteULike (all content mirrored to Mendeley). After a recent talk by Benjamin, I decided to give the Mendeley plugin for OpenOffice a go (in LibreOffice, in fact). It does what it needs to do. I am not sure yet how to customize the display (the equivalent of, for example, unsrt), but I am not so worried about that personally. This screenshot shows what my test looked like.
Update: Steve explained in the comments that picking the right CSL style does the job. The list of additions shows a download for BMC Bioinformatics which will also work for other BMC journals like the J. Cheminformatics, I assume. The result after making that CSL the default:
Thursday, June 09, 2011
Plotting RDF data with a Semantic Media Wiki
I was not aware of this earlier, but data that you have in semantic form in MediaWiki can be plotted using jqplot.
The wiki source for the plot in this screenshot looks like:
{{#ask: [[Category:Measurements]] [[Has Study::{{PAGENAME}}]] [[Has Endpoint::PercentageNonViableCells]]
| ?Has Endpoint Value
| format=jqplotbar
| sort=Has Endpoint Value
| order=ascending
| height=250
| width=600
}}
I am wondering if there is a {{#sparql: equivalent that I can use.
Monday, June 06, 2011
Groovy Cheminformatics 2nd edition soon
In February I released the first edition of my book about writing cheminformatics software in the Groovy language using the CDK. The booklet was thinner than I expected for 72 pages (thin paper), which made it look relatively expensive. Then again, it's not particularly making me rich. As explained before, I hope this will become a source of funding for continued CDK development. Anyways, today I worked hard to address some flaws in the first edition, and to make some further tweaks.
The overall experience should be improved: I am now using the geometry package, which should address the lack of whitespace near the top of each page, thanx to Jason Brownlee. I also noted that the bibliography styles abbrev and unsrt cannot be combined, and because this TeX StackExchange answer mentioned biblatex, which I have read about several times now, I decided it was time to dive in. I haven't gone the full biber route yet, and am still using the CiteULike group for this book. I also hacked up automatic wrapping of output from the Groovy scripts in the book, which should further clean up the design. I doubt it is up to Jonathan's standards yet, but it will have to do for now.
On the content side, there are also interesting changes, and in particular the new sections and chapters. Per request, a list of all people who contributed to the CDK is added, as well as an overview of all CDK atom types. New material includes a short discussion on the IRing interface, a chapter on the InChI, words about generating tautomers (thanx to Mark for this new code!), examples on how to calculate various graph matrices, molecular formula, and how to calculate XLogP and TPSA properties.
All in all, the booklet now sums up to 104 pages, whereas the first version had 72. But, it's much too late already, and the alarm goes way too early in the morning, so the new edition will not appear online today.
Oh, and thanx to all who bought a copy of the first edition!
Thursday, June 02, 2011
Productivity Tool: search bar for any JavaDoc HTML
While searching for a way to hide certain Java packages from the standard HTML JavaDoc output, I ran into a nifty tool: an extension for Chrome and Firefox (this one is in fact a userscript, which we use in life sciences too):
You can type a query in the search field, and that content will be filtered accordingly. Once the userscript or extension is installed, it will work on any JavaDoc HTML.
Wednesday, June 01, 2011
Bringing OpenSource to the public: OpenTox in Africa
Barry is in Africa this week to demo OpenTox. He took along a VirtualBox appliance with OpenTox REST and ontology servers and Bioclipse preinstalled, and installed things on several machines of local participants. Bringing open source software to potential users is becoming easier every day! Well done to Roman and Nina for creating the appliance and to Barry for getting scientists in Africa into predictive toxicology. (Yes, Africa is very big, but I have been too lazy to look up where exactly he went :)








