Saturday, August 20, 2011

Cleaner CDK Code #10: clean patches

The CDK code base is not just a regular dump of Java source code; it is an annotated dump of Java source code. You might have heard about git blame, and if you did, this would be a good time to start reading up on git, e.g. using this great book: Git from the bottom up. However, that will not tell you about git blame, but The Git Community Book will, and the man page will give you all the details.

We take advantage of the history of a file, as it helps us understand the full picture, complementing JavaDoc, inline comments, proper variable names, etc, etc. The annotation links each code line to a commit message. And that also explains that CDK reviewers are strong on good commit messages. No useless messages like 'fixed a bug', but a message that actually describes what has been fixed, and how. That's hard to do, but we are all increasingly trained twitterers, so we are trained to say much in 140 chars. Well, some are. So, I always hope to see something like "made the creation of morgan numbers N times faster, where N is the number of atoms in the AtomContainer" (like today).

What I do not like to see, is line changes that do not actually change something, for example, because they 'fix' whitespace. First, they ruin the git line annotation, by linking a random commit to a particular code line. Second, the reviewer does not know if the line has code changes, or just whitespace changes, and has to check the line in detail anyway. Waste of precious time, where code review is already quite a bottleneck in the CDK development process.

So, no stuff like this in your next patch please (it's extracted from a larger patch):

-        mol.addBond(0,1, CDKConstants.BONDORDER_SINGLE);
-        mol.addBond(1,2, CDKConstants.BONDORDER_DOUBLE);
-        mol.addBond(2,3, CDKConstants.BONDORDER_SINGLE);
-        mol.addBond(3,4, CDKConstants.BONDORDER_DOUBLE);
-        mol.addBond(4,0, CDKConstants.BONDORDER_SINGLE);
+        mol.addBond(0, 1, CDKConstants.BONDORDER_SINGLE);
+        mol.addBond(1, 2, CDKConstants.BONDORDER_DOUBLE);
+        mol.addBond(2, 3, CDKConstants.BONDORDER_SINGLE);
+        mol.addBond(3, 4, CDKConstants.BONDORDER_DOUBLE);
+        mol.addBond(4, 0, CDKConstants.BONDORDER_SINGLE);

Update: It's fine to have whitespace changes as separate patches; if one knows only whitespace changes, that requires a different kind of reviewing. Just don't mix it with functional changes that require more in-depth reviewing.