Archives For November 30, 1999

Making small commits often is usually preferable to making massive once-a-day commits – especially if you want someone to review your changes. For me, my stress levels rise with the number of files that have been touched since the last commit. When commits grow too big I’ve been known to stash the code, and redo the changes, breaking up what would have been a large commit into a number of smaller commits. Yes, sometimes large commits are inevitable. They happen. Things could be worse.

By worse, I mean like the two types of commits that have been bothering me on occasion lately.

The first is the one where the commit strays off topic, delving into multiple features at once. For example, you find a commit with the following description:

Added ATOM parsing to RssFeedReader
Validation of ExpirationDate for user account

OK, yeah, those two changes seem totally related. Or maybe they should have been two separate commits? Imagine that we wanted to merge one of these changes to another branch. The Single Responsibility Principle probably applies to commits as well.

However, there is another type that actually bothers me more: The overzealous drive-by code cleaning.

Imagine that you’re looking at the history of the UserAccount.cs file. One of the changes has the following commit message:

Added ATOM parsing to RssFeedReader

Why would RSS parsing affect the UserAccount class? You’re sitting there, scratching your head, wondering why a completely unrelated file has been touched. Upon closer inspection you discover what changed. One line. One unused using statement was removed.

Indeed, that is quite unrelated to adding ATOM parsing to an RSS feed reader. So not only has the history of the UserAccount.cs file been distorted, it was done for pretty much no business value whatsoever.

Now, maybe I’m nitpicking. By all means, clean up the code. But try to keep each commit focused and coherent, instead of going off fixing “problems” in unrelated files. Small fixes like adjusting spacing and removing unused using statements should really only be done if you already had another reason to fiddle with the file. Alternatively, do all the tiny fixes together in one go without changing anything else. Then you can just check the changes in with a suitable comment like “Removing unused using statements”. Accurate, succint and to the point. Clean code. Even cleaner source control history.

The Single responsibility principle states that each class should have a single reason to change. There are certainly many ways to deduce if a class has too many responsibilities, most of which actually require you to use your brain! However, a quick-and-dirty way to establish if a class has too many responsibilities is simply to look at the list of using statements at the top of the file (imports in Java and Visual Basic.NET). If you open the source file in your IDE of choice and the list of usings/imports fills your entire view, the class is likely to have a lot of reasons to change.

Here is what the list of using-directives in one of our classes looked like a few months ago:

using System;
using System.Collections;
using System.Collections.Generic;
using System.Data;
using System.IO;
using System.Linq;
using System.Net;
using System.Net.Mail;
using System.Net.Mime;
using System.Text;
using System.Text.RegularExpressions;

The likelihood that a class like this was developed using test-driven development seems slim (and indeed, it was not).

I don’t really believe that we can set a strict “maximum” number of using directives to allow, but probably anything more than six or seven should serve as an indication that perhaps the class in question could do with some refactoring. Perhaps we can use the number of usings/imports as a metric describing code complexity, albeit a very blunt one.