Wednesday, November 09, 2011

The simplest way to make CDK commits

Every now and then I meet people who show interest in working on the CDK. I reply to them with what is involved, and I rarely hear back from them. I know this is common for most open source projects (see also Community development), and for the CDK this is likely caused by the cumbersome process of getting a full development environment set up. Over the next months, I will make an effort to extend my Groovy Cheminformatics book to include detail after detail on how to do this. But what would also be welcome is a VM (OVF) image that has everything set up already.

Anyway, the road to CDK commit fame no longer requires a full-fledged development environment. Instead, we have GitHub. Its web interface makes a lot of things easy, including source code peer review.

But in this post I would like to show how easy it is to fix small things in the CDK, by using the GitHub GUI. Of course, this post can be used for any project hosted on GitHub.

Step 1
Get a free GitHub account. (And log in.)

Step 2
Find a problem in the CDK. Start with something dead easy, like JavaDoc errors. For example, check the Nightly report for OpenJavaDocCheck errors here. These pages will return a lot of errors about missing documentation, but skip those. Pick something really simple, such as a report like this one:

There is no period to end the first sentence: 'Sums up the columns in a 2D int matrix'

JavaDoc gives the first sentence of any JavaDoc comment a special purpose: it serves as a summary. For that first sentence to be detected, it must properly end with a period.

That patch cannot get any easier. It just requires a missing period to be added.
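To see why that single period matters, here is a minimal sketch of first-sentence detection. It is a rough approximation (a period followed by whitespace or the end of the text), not the actual code of the javadoc tool or OpenJavaDocCheck, and the class and method names are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FirstSentence {

    // Assumed rule: a sentence ends at a period followed by whitespace
    // or by the end of the comment text.
    private static final Pattern SENTENCE_END = Pattern.compile("\\.(\\s|$)");

    /**
     * Returns the summary sentence of a comment, including its period.
     * Returns null when no properly terminated sentence is found.
     */
    public static String summaryOf(String comment) {
        Matcher m = SENTENCE_END.matcher(comment);
        return m.find() ? comment.substring(0, m.start() + 1) : null;
    }

    public static void main(String[] args) {
        String reported = "Sums up the columns in a 2D int matrix";
        String fixed = reported + ".";
        System.out.println(summaryOf(reported)); // no terminated sentence
        System.out.println(summaryOf(fixed));    // the full summary sentence
    }
}
```

With the text from the report, the sketch finds no sentence end; with the period added, the whole line is recognized as the summary.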

Step 3
Identify the source file that contains the error. This has the added value that you automatically learn your way around the directory/folder hierarchy of the CDK project source. The above error refers to this class:

org.openscience.cdk.graph.PathTools

Now, all functional CDK code (that is, everything but the unit test suite) can be found in the source distribution under src/main, but we need the GitHub URL for that, and that is here (note that the linked OpenJavaDocCheck report is for the stable cdk-1.4.x branch, so our GitHub page for the PathTools source is too):


Check this URL carefully, and note where it keeps the branch name, the src/main folder, and the path to the PathTools.java source. That makes finding other source code pages later easier. This particular page looks like:


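The mapping from a class name to the tail of such a URL can be sketched in a few lines. This assumes that the package name maps directly onto folders below src/main and that GitHub file pages use a blob/branch/path layout; the repository address itself is deliberately left out, and the helper names are hypothetical.

```java
public class SourcePaths {

    /** Maps a fully qualified class name to its file path below src/main. */
    public static String sourcePath(String className) {
        return "src/main/" + className.replace('.', '/') + ".java";
    }

    /**
     * Builds the part of a GitHub file page URL that follows the
     * repository address (assumed layout: blob/BRANCH/PATH).
     */
    public static String blobPath(String branch, String className) {
        return "blob/" + branch + "/" + sourcePath(className);
    }

    public static void main(String[] args) {
        // Branch and class name as used in this post.
        System.out.println(blobPath("cdk-1.4.x", "org.openscience.cdk.graph.PathTools"));
    }
}
```

Swapping in another class name from the Nightly report should give you the matching page for that class, as long as it lives in the same branch.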

Step 4
Now, this source code page has (when logged in) an 'Edit this file' icon to the right of the file name line. Click this icon, and GitHub will present you with a basic, in-browser editor:


I already scrolled down a bit, to the line with the missing period from this example. Make the modification, and scroll down to the lower part of the page, and read step 5.
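For illustration, this is roughly what a method looks like once its JavaDoc summary ends with a period. The class, method name, and body are hypothetical stand-ins and not the actual PathTools source; only the summary sentence is taken from the report.

```java
import java.util.Arrays;

public class ColumnSums {

    /**
     * Sums up the columns in a 2D int matrix.
     *
     * @param  matrix the 2D int matrix, assumed to be rectangular
     * @return        an array with one sum per column
     */
    public static int[] columnSums(int[][] matrix) {
        if (matrix.length == 0) return new int[0];
        int[] sums = new int[matrix[0].length];
        for (int[] row : matrix) {
            for (int col = 0; col < sums.length; col++) {
                sums[col] += row[col];
            }
        }
        return sums;
    }

    public static void main(String[] args) {
        int[][] matrix = { {1, 2, 3}, {4, 5, 6} };
        System.out.println(Arrays.toString(columnSums(matrix)));
    }
}
```

The only change the patch makes is the period at the end of the first JavaDoc line; the code itself stays untouched.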

Step 5
With the small fix done, it is time to make the actual commit. Below the editor there is a text field to enter a commit message (important: describe what you did, even if this takes more time than the fix itself! Reason: when browsing commits in changelogs, you only see those messages!):


If you have multiple JavaDoc fixes, put them in one commit. But preferably do not mix them with other fixes, so as to keep the commit message as well as the peer review simple. That speeds up the reviewing process, and makes it easier for me and Rajarshi to apply the patch to the main source tree, but more about that in the next steps.

Of course, this online editing can also be used for fixing PMD warnings, as reported by this Nightly report. However, keep in mind that you cannot recompile the code this way, and for code changes, this online approach is discouraged.

When done, press 'Propose File Change' (Rajarshi and I see a 'Commit Changes' button instead). After a new page is opened, the commit has been created, and it is time to inform us of your commit. This is done via a so-called 'pull request', as outlined in the next step.

Step 6
The last step in the process is to send out a pull request. A page to do this is normally the immediate result of hitting that 'Propose File Change' button, and should look something like the following (note that I could not make a screenshot based on the running CDK example, because I have commit rights, and the patch goes directly into the repository; I discovered that in this patch :):


So, while this screenshot is for another GitHub project (Total-Impact is worth checking out), your page should look similar. The top grey bar shows the project name and 'Send a pull request', confirming that this page does what we are expecting. The blue box tells you where your commit is stored, which is in your own fork of the CDK under your own GitHub account, in a branch called patch-x.

Below that blue box, reference is made to your newly made commit, and a bit further below are two text fields: a single-line text box for a message 'subject', prefilled with the commit message, and a text box where you can leave a message to accompany the pull request. This message is used to put the pull request in perspective, and can be used to introduce yourself briefly, refer to a set of patches, or whatever. This message will not end up in the git repository. The more requests you make, the shorter this message will get. "Yeah, another JavaDoc fix."

Hit the green 'Send pull request' button, and you're done.