In LINQ is Better Than ForEach, I described how to use the reduce chain refactoring to better describe the functionality of the code. By encapsulating what we were doing, the calling code became more declarative, and the the functionality was more reusable.
This didn’t sit well with one reader:
Adding such extension methods to IEnumerable<T> as you suggest is IMHO a clear violation of class responsibility. The IEnumerable is a _collection_ class. It's purpose is to manage a list/set/bag/whatever of some objects, not to access or even worse change the contained objects' state. By polluting the IEnumerable<T> with "T-specific" extension methods (e.g. DeveloperNames()), you effectively break the class contract.
Example: your Employee class changed - Role field was deleted. This means your IEnumerable<Employee> is now also broken. Why - does removing anything from the object conceptually influence anything in managing a set of those objects? No, why should it. And yet the code is broken, only because the collection class was welded to much with the items it contains.
The Cake is a Lie
I added DeveloperNames() to IEnumerable<Employee>. The purpose of IEnumerable<Employee> is to provide an interface for the GetEnumerator() method. It isn’t even to manage a list, set, or bag of employees. The only thing it does is allow you to iterate over employees.
In fact, it’s all a lie. DeveloperNames() exists in a completely separate class.
public static class EnumerableEmployee
{
public static IEnumerable<string> DeveloperNames(this IEnumerable<Employee> employees)
{
return employees.Where(e => e.Role == Role.Developer)
.OrderBy(e => e.LastName)
.Select(e => e.FullName);
}
}
IEnumerable<Employee> is not a class, it’s an interface. I created a helper, or utility, class called EnumerableEmployee. The purpose of this class is to contain utility methods for IEnumerable<Employee>. The problem is that utility classes have logical cohesion. I desired, semantically at least, functional cohesion. By using the “this” keyword, I told the compiler to treat the first parameter as though it actually owned the method.
Principles
Extension methods adhere to the open/closed principle, which states that software entities should be open for extension, but closed for modification.” By their very nature, the classes or interfaces being extended are merely parameters into a static method.
The commenter asserts I am violating “class responsibility.” The single responsibility principle states that “every object should have a single responsibility, and that responsibility should be entirely encapsulated by the class.” As stated above, the only class is a utility class that provides syntactic sugar to objects that implement IEnumerable<Employee>. In adhering with the above open/closed principle, those objects are extended.
IEnumerable<Employee> still provides GetEnumerator(). Employee[] is still an array of Employee objects. List<Employee> is still a list of employees. They never changed. The coupling is semantic and applies where the utility class is in scope.
Where Does DeveloperNames() Belong?
The reduce chain refactoring calls for placing the resulting method (whether extension, static, or instance) in the same chain loop it originally belonged. If the original method chain was correct, this will lead to correct functional cohesion. If there is a problem, then the method should be moved, but there usually is an underlying problem to begin with.
In the case of DeveloperNames(), it started as a reduce chain on a LINQ to Objects query. Encapsulating has the following benefits:
1) makes it more declarative
2) makes it testable
3) makes it reusable with DRY
IEnumerable<Employee> represents any object that implements that interface… from an array to List<Employee> to a custom employee collection. A properly designed utility method would accept the most general interface, IEnumerable<Employee>, so any of those object could be passed in.
Polluting IEnumerable<T> with T-Specific Extension Methods
That’s been done with LINQ to Objects. One should be careful with IEnumerable<T>, since it everything can access it. Only the most general methods belong there, and most have been covered that I’m aware of. IEnumerable<[class]> should be readily extended though. The only caveat is I would try to avoid those extensions that cause side effects, if possible.
No Side Effects
The commenter states the IEnumerable<Employee> is broken, that the collection is too welded to the item. Since the code works fine, I can only assume that it is suggested the original collection is being modified.
The examples I used did not create side effects. Even after I refactored to the foreach version of DeveloperNames(), there were no side effects present, so the employee classes were not changed, and the employees instance was unaffected.
This is important, because one must be careful to only make deliberate side effects. In this example, no side effects were necessary to accomplish any task, so none were used.
var employees = new[]
{
new Employee{FirstName = "Chris", LastName = "Eargle",
Role = Role.Developer},
new Employee{FirstName = "Mr.", LastName = "Scrooge",
Role = Role.Manager},
new Employee{FirstName = "Steve", LastName = "Andrews",
Role = Role.Developer}
};
Console.WriteLine("Developer Names");
Console.WriteLine(String.Empty.PadRight(15, '-'));
foreach (var name in employees.DeveloperNames())
{
Console.WriteLine(name);
}
Console.WriteLine("\nAll Employees");
Console.WriteLine(String.Empty.PadRight(13, '-'));
foreach (var employee in employees)
{
Console.WriteLine(employee.FullName);
}
I ran the code with both versions of the extension method (which you can get from the original article), and the results were the same.
Developer Names
---------------
Steve Andrews
Chris Eargle
All Employees
-------------
Chris Eargle
Mr. Scrooge
Steve Andrews
I never figured out what was meant by the Role being removed. The original LINQ statement returned an IEnumerable<string>, and you would expect something similar from a method named DeveloperNames().
But still, I don’t like DeveloperNames()
My problem with DeveloperNames() isn’t for any of the reasons the commenter posted. I only used it as an example of encapsulation to maintain declarative code whether one was using LINQ or imperative code. If I were truly designing this, I believe I would make the syntax as natural as possible. I would prefer to call employees.Developers().Names(). This would also give me the ability to call employees.Names() to get the full list of names. Let’s try that out.
public static class EnumerableEmployee
{
public static IEnumerable<Employee> Developers(this IEnumerable<Employee> employees)
{
return employees.Where(e => e.Role == Role.Developer);
}
public static IEnumerable<string> Names(this IEnumerable<Employee> employees)
{
return employees.Select(e => e.FullName);
}
}
Here’s the calling code.
Console.WriteLine("Developer Names");
Console.WriteLine(String.Empty.PadRight(15, '-'));
foreach (var name in employees.Developers().Names())
{
Console.WriteLine(name);
}
Console.WriteLine("\nAll Employees");
Console.WriteLine(String.Empty.PadRight(13, '-'));
foreach (var name in employees.Names())
{
Console.WriteLine(name);
}
Lost now is the names in order. This can be remedied by making names always be in order, or developers always be in order. But which is it? I suppose it depends on the rules of the system.
Wait, There’s a Chain
That is right. I suppose after reducing the chain, we’ve now expanded the chain to give more flexibility. I would reduce the chain, or provide the reduced version, if an ordered Developers().Names() was frequently required.
Conclusion
There is nothing wrong with writing extension methods on IEnumberable<[class]>. As with all code be deliberate in your design so that it is semantically viable. Code should be natural to write. Avoid side effects to avoid unnecessary bugs. If you are designing a framework, make it a joy to use.