Today, I came across a random class at the bottom of a file (you do have one class per file, don’t you?). This class had one static, readonly member on it, which was a regular expression used to determine whether or not an address was a post office box.*
public class SharedRegularExpressions
{
public static readonly Regex PostOfficeBoxRegex = new Regex
(@"^(?!.*P\.? ?O\.? ?Box)[-a-zA-Z\d .,/]*$",
RegexOptions.Compiled | RegexOptions.Singleline | RegexOptions.IgnoreCase);
}
It was pretty bizarre, so I proceeded to look for a better home for this small, surreptitious class… or at least the regular expression that it was exposing. The first thing I did was check all references to it. I discovered that it was used throughout the business and presentation layers of the application. I was rather shocked! Certainly, there was a common store for regular expressions.
However, I should not have been looking for a common place for regular expressions. I should have been looking at how this particular regular expression was being used. Let’s look at a few examples from the “Find All References” list.
Business\ValidatePhysicalAddresses.cs - (17, 71) : if (!string.IsNullOrEmpty(address.Street) && SharedRegularExpressions.PostOfficeBoxRegex.IsMatch(address.Street))
Business\ValidateAddressIsPhysicalAddress.cs - (40, 34) : if (SharedRegularExpressions.PostOfficeBoxRegex.IsMatch(address.Street) ||
Business\ValidateClientAddressIsPhysicalAddress.cs - (45, 46) : if (SharedRegularExpressions.PostOfficeBoxRegex.IsMatch(clientAddress.Street) ||
Presentation\CreateNewAccount.xaml.cs - (114, 30) : if (!SharedRegularExpressions.PostOfficeBoxRegex.IsMatch(address.Street))
Presentation\ManageAccounts.xaml.cs - (54, 52) : if (SharedRegularExpressions.PostOfficeBoxRegex.IsMatch(accountAddress.Street))
Presentation\ViewAccountInformation.xaml.cs - (99, 42) : !SharedRegularExpressions.PostOfficeBoxRegex.IsMatch(address.Street)).FirstOrDefault());
Notice a common theme here. Yes, the regular expression is used to evaluate a string to determine if it represents a street address of a post office box, but being a street, it is always pulled from an Address instance. Since the regular expression is always being used with the Address class, we can safely infer that this behavior actually belongs to the Address class.
public partial class Address
{
private static readonly Regex PostOfficeBoxRegex = new Regex
(@"^(?!.*P\.? ?O\.? ?Box)[-a-zA-Z\d .,/]*$",
RegexOptions.Compiled | RegexOptions.Singleline | RegexOptions.IgnoreCase);
public bool IsPostOfficeBox()
{
return PostOfficeBoxRegex.IsMatch(Street1);
}
}
Remember to strive for functional cohesion within your classes. The term may sound funny, but to reach functional cohesion you merely need to ask yourself, “where does this functionality belong?” It may not always be clear, but the truth is that if it is not clear then you need to decompose your methods (break them apart, they’re too long).
This tiny bit of code may seem rather innocuous, but too many of these can add up to become a maintenance nightmare. Before I refactored this code, a change from regex to another method of determining if an address refers to a post office box would have required changes throughout the system. By moving the behavior where it belonged, the functionality of determining if an address is a post office box is encapsulated and only requires modification in the one place it is implemented.
It is sometimes hard to know which is the correct principle to apply. In this case, there are two principles we are concerned with: strive for functional cohesion and don’t repeat yourself. These two principles are very important for writing clean, maintainable code. If you keep them in mind when determining the right course of action, you will almost certainly make the correct decision.
* Proprietary code has been modified. RegEx is from this post.