In the age of LINQ, we often use IEnumerable<T> rather than an array or List<T>. The System.Linq namespace adds extension methods on this interface that allow non-mutating operations on any sequence of values, and the LINQ query operators make your code much cleaner.
For example, this is how I would filter a list of numbers before LINQ:
    var even = new List<int>();
    foreach (int i in nums)
    {
        if (i % 2 == 0)
        {
            even.Add(i);
        }
    }
With LINQ, this is simplified to the following:
    var even = nums.Where(i => i % 2 == 0);
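To check that the two forms really are equivalent, here is a minimal sketch that wraps each one in a method and runs both over the same input. The class name FilterDemo, the method names, and the sample values are my own assumptions for illustration; only the loop and the Where call come from the snippets above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FilterDemo
{
    // Pre-LINQ style: build the result list by hand.
    public static List<int> EvenWithLoop(IEnumerable<int> nums)
    {
        var even = new List<int>();
        foreach (int i in nums)
        {
            if (i % 2 == 0)
            {
                even.Add(i);
            }
        }
        return even;
    }

    // LINQ style: Where returns a lazily evaluated IEnumerable<int>.
    public static IEnumerable<int> EvenWithLinq(IEnumerable<int> nums)
    {
        return nums.Where(i => i % 2 == 0);
    }

    public static void Main()
    {
        var nums = new List<int> { 1, 2, 3, 4, 5, 6 }; // sample data (assumed)
        Console.WriteLine(string.Join(", ", EvenWithLoop(nums))); // 2, 4, 6
        Console.WriteLine(string.Join(", ", EvenWithLinq(nums))); // 2, 4, 6
    }
}
```

Note the difference in return types: the loop hands back a fully built List<int>, while the LINQ version hands back a sequence that is only evaluated when something iterates it.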
Prefer IEnumerable<T>
A best practice when programming is to code to an interface, not an implementation. Many people use the return type List<T> when the caller is only expected to iterate over the returned value. Some people try to “program to an interface” by putting an I in front of List<T>, but this misses the point, since the purpose of IList<T> is to allow indexed access to, and modification of, its elements. That’s where IEnumerable<T> comes in: that interface exists solely to allow iteration over its elements. This should be your default interface if you wish to avoid unintended side effects.
Here is an example of an unintended side effect, using the even number example from above.
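    public static IList<int> EvenNumbers(this IList<int> nums)
    {
        // Walk backwards so removals do not shift the indices still to be visited.
        for (int i = nums.Count - 1; i >= 0; i--)
        {
            if (nums[i] % 2 != 0)
            {
                nums.RemoveAt(i); // removes the odd value from the caller's list
            }
        }
        return nums;
    }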
Unbeknownst to the caller, the original list is being modified. This violates command-query separation: a method that looks like a query also acts as a command. There are a couple of other issues as well: it won’t work on collections that don’t implement IList<int>, and it will throw a NotSupportedException on arrays and other fixed-size or read-only types that do implement IList<int>.
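Both problems are easy to reproduce. The sketch below uses a hypothetical in-place filter, KeepEvenInPlace, written in the spirit of the IList<int> method above; the class name, the helper ThrowsNotSupported, and the sample values are assumptions made for the demonstration.

```csharp
using System;
using System.Collections.Generic;

public static class SideEffectDemo
{
    // Hypothetical in-place filter: removes odd values from the list it is given.
    public static IList<int> KeepEvenInPlace(IList<int> nums)
    {
        for (int i = nums.Count - 1; i >= 0; i--)
        {
            if (nums[i] % 2 != 0)
            {
                nums.RemoveAt(i); // mutates the caller's collection
            }
        }
        return nums;
    }

    // Returns true when the in-place filter cannot modify the collection.
    public static bool ThrowsNotSupported(IList<int> nums)
    {
        try
        {
            KeepEvenInPlace(nums);
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }

    public static void Main()
    {
        var original = new List<int> { 1, 2, 3, 4 };
        var even = KeepEvenInPlace(original);

        Console.WriteLine(original.Count);                        // 2: the caller's list shrank
        Console.WriteLine(ReferenceEquals(original, even));       // True: same object came back
        Console.WriteLine(ThrowsNotSupported(new[] { 1, 2, 3 })); // True: arrays are fixed-size
    }
}
```

The array case compiles without complaint because int[] implements IList<int>; the failure only shows up at run time.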
This can be made more flexible and less error prone by writing it for IEnumerable<int> rather than IList<int>:
    public static IEnumerable<int> EvenNumbers(this IEnumerable<int> nums)
    {
        foreach (var i in nums)
        {
            if (i % 2 == 0)
            {
                yield return i;
            }
        }
    }
This changes the behavior of the method in two ways: it’s lazy, and it returns a read-only sequence instead of the caller’s own list. Both follow from the compiler-generated enumerator, which is C#’s implementation of what everyone else calls the iterator pattern: the body runs only as the caller iterates, and the result exposes no way to add or remove elements. In many situations, this is the behavior you want. Of course, it is possible that your system expects the side effects. I would recommend refactoring to the non-mutating form unless there is a compelling reason otherwise; the mutating form is more prone to bugs, the consequence of unintended side effects. One thing to keep in mind: when returning an IEnumerable<T>, prefer the yield keyword. This practice will help you avoid those unintentional side effects.
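The laziness is easy to observe. In this sketch the iterator is the same as the one above, except for a static counter, Inspected, that I added purely for the demonstration; the class name and sample values are likewise assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDemo
{
    // Counts the elements the iterator has examined (demo instrumentation only).
    public static int Inspected;

    public static IEnumerable<int> EvenNumbers(this IEnumerable<int> nums)
    {
        foreach (var i in nums)
        {
            Inspected++;
            if (i % 2 == 0)
            {
                yield return i;
            }
        }
    }

    public static void Main()
    {
        var nums = new List<int> { 1, 2, 3, 4 };
        var even = nums.EvenNumbers();

        Console.WriteLine(Inspected);          // 0: the method body has not started yet
        nums.Add(6);                           // added after the call, before enumeration

        var result = even.ToList();            // enumeration happens here
        Console.WriteLine(string.Join(", ", result)); // 2, 4, 6
        Console.WriteLine(nums.Count);         // 5: the filter removed nothing from the source
        Console.WriteLine(even is IList<int>); // False: the result cannot be indexed or edited
    }
}
```

The value added after the call still shows up in the result, which is the flip side of laziness: if the caller needs a snapshot, it has to ask for one with ToList() or ToArray().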
The Case for Seqs
The one issue I might have with this is the terminology: IEnumerable<T>. Certainly, that’s what you’re working with in .NET, but the name is rather long and it requires some explanation to new .NET developers. What I would like to see in C#, and I feel C# needs it to change the mindset of many developers, is a new keyword: seq.
IEnumerable<T> simply represents a sequence of items. It can even be an infinite sequence!
    public static IEnumerable<T> Repeat<T>(this T obj)
    {
        while (true)
        {
            yield return obj;
        }
    }
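An infinite sequence is only usable because it is lazy: the consumer decides how much of it to pull. A small sketch, assuming the Repeat extension above lives in a static class I have called SeqDemo, with sample values of my own choosing:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SeqDemo
{
    // Yields the same value forever; callers must bound the sequence themselves.
    public static IEnumerable<T> Repeat<T>(this T obj)
    {
        while (true)
        {
            yield return obj;
        }
    }

    public static void Main()
    {
        // Take(3) stops pulling after three items, so the loop above never runs away.
        var firstThree = "ab".Repeat().Take(3);
        Console.WriteLine(string.Join(" ", firstThree)); // ab ab ab
    }
}
```

Operators that need the whole sequence, such as Count() or ToList(), would never return here, so an infinite sequence should always be bounded with something like Take or TakeWhile first.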
Imagine instead of writing IEnumerable<T> everywhere, you simply used seq<T>.
    public static seq<int> EvenNumbers(this seq<int> nums)
    {
        foreach (var i in nums)
        {
            if (i % 2 == 0)
            {
                yield return i;
            }
        }
    }
Besides being more concise, I feel this would solidify System.Collections.Generic.IEnumerable<T> as a privileged member of the C# ecosystem, much as the string keyword does for System.String.
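The closest approximation I know of in today's C# is a using alias, and it shows why a real keyword would be nicer. This is only a sketch: the alias name IntSeq and the class name AliasDemo are made up, and because an alias cannot take type parameters, you need one alias per element type, in every file.

```csharp
using System;
using System.Collections.Generic;
// An alias must name a closed type, so there is no way to declare a generic Seq<T> alias.
using IntSeq = System.Collections.Generic.IEnumerable<int>;

public static class AliasDemo
{
    // IntSeq is just another name for IEnumerable<int>, so yield works as usual.
    public static IntSeq EvenNumbers(this IntSeq nums)
    {
        foreach (var i in nums)
        {
            if (i % 2 == 0)
            {
                yield return i;
            }
        }
    }

    public static void Main()
    {
        IntSeq even = new[] { 1, 2, 3, 4 }.EvenNumbers();
        Console.WriteLine(string.Join(", ", even)); // 2, 4
    }
}
```

The alias is purely local to the file that declares it, which is exactly the limitation a language-level seq<T> would remove.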
The Primary Issue
My suggestion would be easy to implement, but the C# team is rightfully reluctant to add keywords. I’ve seen code break between versions due to their introduction, and you can be sure some people will experience the same issue with async in C# 5. Since most of these issues can be corrected with a simple find/replace, I feel the cost is negligible.
One way to alleviate some of the pain is to make seq a keyword only when it is followed by a generic parameter, leaving it a plain identifier everywhere else. This may not be practical for a couple of reasons: it is natural to map the non-generic seq keyword to the non-generic IEnumerable interface, and it would be inconsistent with other keywords. It would prevent breakage, but I prefer a pure language.
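C# already has keywords that are reserved only in certain positions, which is how var and yield were introduced without breaking existing identifiers. The sketch below compiles as-is; the class and method names are assumptions, and it illustrates the general mechanism rather than how seq itself would be specified.

```csharp
using System;

public static class ContextualKeywordDemo
{
    public static int Sum()
    {
        int var = 1;   // legal: here 'var' is an ordinary local variable name
        int yield = 2; // legal: 'yield' is only special before 'return' or 'break'
        return var + yield;
    }

    public static void Main()
    {
        Console.WriteLine(Sum()); // 3
    }
}
```

A seq that is only special when followed by a type argument list would follow the same approach, at the cost of the inconsistency described above.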
What Do You Think?
I will be attending the 2012 MVP Summit in a few short weeks, and I would like to take this feedback to Microsoft. One thing people may not realize is that Microsoft listens to the community. I support adding the seq keyword. Do you?