
Summation Functions

by KodefuGuru 26. July 2010 17:31

Creating a summation function in C# isn’t difficult in itself. After all, it is perfectly reasonable to create functions that consist of more than one line. What is interesting is abstracting it so that any function can become a summation function.

Here is the Madhava-Leibniz function for pi:

Madhava-Leibniz formula for pi: \pi = 4 \sum_{k=0}^{\infty} \frac{(-1)^k}{2k + 1}

Here’s how we can define the inner function:

Func<Int32, Double> f = k =>  4 * (Math.Pow(-1, k) / ((2.0 * k) + 1));

What we want to be able to do is call f.Sum(). This will generate a summation function that calculates pi. Here’s how we accomplish that task.

public static Func<Int32, Double> Sum(this Func<Int32, Double> func, int start = 0)
{
    return end =>
    {
        Double result = default(Double);

        for (int k = start; k <= end; k++)
        {
            result += func(k);
        }

        return result;
    };
}

We need to accumulate the result of every pass through the function. The resulting summation function will still require a parameter representing the upper bound. This is because the summation has to stop at some point so that we can inspect the result.
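To see that the extension is not tied to the pi series, here is a minimal sketch that applies the same Sum shape to the identity function and starts the summation at 1. The names SummationExtensions, SummationDemo, and SumOfIntegers are made up for this illustration and are not part of the downloadable source.

```csharp
using System;

public static class SummationExtensions
{
    // Same shape as the Sum extension above: returns a function of the upper bound.
    public static Func<int, double> Sum(this Func<int, double> func, int start = 0)
    {
        return end =>
        {
            double result = 0.0;

            for (int k = start; k <= end; k++)
            {
                result += func(k);
            }

            return result;
        };
    }
}

public static class SummationDemo
{
    // Hypothetical helper: adds the integers from 1 through n via the extension.
    public static double SumOfIntegers(int n)
    {
        Func<int, double> identity = k => k;
        var sum = identity.Sum(start: 1);
        return sum(n);
    }
}
```

Calling SummationDemo.SumOfIntegers(100) returns 5050, which matches the closed form n(n + 1) / 2. When the upper bound is below the start, the loop never runs and the result is 0.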

Here’s the test to prove this works.

[TestMethod]
public void Pi()
{
    Func<Int32, Double> f = k =>  4 * (Math.Pow(-1, k) / ((2.0 * k) + 1));
    var calculatePi = f.Sum();
    // The third argument is the allowed delta, not a number of digits.
    Assert.AreEqual(3.14159, calculatePi(200000), 0.00001);
}

With 200,000 passes, the result agrees with pi to about five decimal places. Leibniz didn’t rediscover the most efficient way of calculating pi, but it works for a test of the Sum extension method.

If we switch context to VB, we can even use this with a query expression! Two things are needed first: 1) VB 10 has a bug and can’t handle the optional parameter, so change it to a traditional overload, and 2) we need a Cast method.

I’ll let you figure out how to make a traditional overload. The Cast method won’t actually cast anything (doing so would break the function), but we need it because the query expression will try to call it.

public static Func<T, TResult> Cast<TCast, T, TResult>(this Func<T, TResult> func)
{
    return func;
}

Here’s the VB test that uses LINQ!

<TestMethod()>
Public Sub Pi()
    Dim f As Func(Of Integer, Double) = Function(k) 4 * (Math.Pow(-1, k) / ((2.0 * k) + 1))
    Dim calculatePi = Aggregate x In f Into Sum()
    Assert.AreEqual(3.14159, calculatePi(200000), 0.00001)
End Sub

I think it’s easier to just call f.Sum(), but this should give you a taste of where I will be taking things with future articles.

Download the source code from CodePlex!


Monadic Comprehensions Presentation at IEEE ICCSIT 2010

by KodefuGuru 10. July 2010 02:43

Here is the slide deck from my talk at IEEE ICCSIT 2010. The preparation for this conference and my research into monadic comprehensions and functional composition happened to coincide, so I decided I should make that my presentation. China has been an adventure. Enjoy the code I created to go along with it!

Synopsis: Build monads using the C# language with a C# style, then use the appropriate methods to ensure the LINQ query syntax works with this functional design pattern. After describing monads, cut the middle man and apply the same techniques directly to objects and functions to achieve better results with a declarative syntax.

Source code from demo is available at monadic.codeplex.com.


Responsible Extension Methods

by KodefuGuru 18. May 2010 20:47

In LINQ is Better Than ForEach, I described how to use the reduce chain refactoring to better describe the functionality of the code. By encapsulating what we were doing, the calling code became more declarative, and the functionality was more reusable.

This didn’t sit well with one reader:

Adding such extension methods to IEnumerable<T> as you suggest is IMHO a clear violation of class responsibility. The IEnumerable is a _collection_ class. It's purpose is to manage a list/set/bag/whatever of some objects, not to access or even worse change the contained objects' state. By polluting the IEnumerable<T> with "T-specific" extension methods (e.g. DeveloperNames()), you effectively break the class contract.
Example: your Employee class changed - Role field was deleted. This means your IEnumerable<Employee> is now also broken. Why - does removing anything from the object conceptually influence anything in managing a set of those objects? No, why should it. And yet the code is broken, only because the collection class was welded to much with the items it contains.

The Cake is a Lie

I added DeveloperNames() to IEnumerable<Employee>. The purpose of IEnumerable<Employee> is to provide an interface for the GetEnumerator() method. It isn’t even to manage a list, set, or bag of employees. The only thing it does is allow you to iterate over employees.

In fact, it’s all a lie. DeveloperNames() exists in a completely separate class.

public static class EnumerableEmployee
{
    public static IEnumerable<string> DeveloperNames(this IEnumerable<Employee> employees)
    {
        return employees.Where(e => e.Role == Role.Developer)
                        .OrderBy(e => e.LastName)
                        .Select(e => e.FullName);
    }
}

IEnumerable<Employee> is not a class, it’s an interface. I created a helper, or utility, class called EnumerableEmployee. The purpose of this class is to contain utility methods for IEnumerable<Employee>. The problem is that utility classes have logical cohesion. I desired, semantically at least, functional cohesion. By using the “this” keyword, I told the compiler to treat the first parameter as though it actually owned the method.
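The point that the extension is only a static method with friendlier call syntax can be checked directly. The sketch below assumes an Employee class and a Role enum shaped like the ones used in this article (the third Role value and the FullName format are guesses), and ExtensionCallDemo is a name invented for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape; only Developer and Manager appear in the article.
public enum Role { Developer, Manager, Tester }

public class Employee
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public Role Role { get; set; }

    // Assumed format, based on the console output shown in the article.
    public string FullName { get { return FirstName + " " + LastName; } }
}

public static class EnumerableEmployee
{
    public static IEnumerable<string> DeveloperNames(this IEnumerable<Employee> employees)
    {
        return employees.Where(e => e.Role == Role.Developer)
                        .OrderBy(e => e.LastName)
                        .Select(e => e.FullName);
    }
}

public static class ExtensionCallDemo
{
    // Hypothetical check: both call styles bind to the same static method.
    public static bool BothCallStylesAgree()
    {
        var employees = new[]
        {
            new Employee { FirstName = "Chris", LastName = "Eargle", Role = Role.Developer },
            new Employee { FirstName = "Mr.", LastName = "Scrooge", Role = Role.Manager },
            new Employee { FirstName = "Steve", LastName = "Andrews", Role = Role.Developer }
        };

        var viaExtension = employees.DeveloperNames();
        var viaStatic = EnumerableEmployee.DeveloperNames(employees);

        return viaExtension.SequenceEqual(viaStatic);
    }
}
```

Both calls compile to the same invocation of EnumerableEmployee.DeveloperNames, so the array type itself gains no members.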

Principles

Extension methods adhere to the open/closed principle, which states that “software entities should be open for extension, but closed for modification.” By their very nature, the classes or interfaces being extended are merely parameters into a static method.

The commenter asserts I am violating “class responsibility.” The single responsibility principle states that “every object should have a single responsibility, and that responsibility should be entirely encapsulated by the class.” As stated above, the only class is a utility class that provides syntactic sugar to objects that implement IEnumerable<Employee>. In adherence to the above open/closed principle, those objects are extended.

IEnumerable<Employee> still provides GetEnumerator(). Employee[] is still an array of Employee objects. List<Employee> is still a list of employees. They never changed. The coupling is semantic and applies where the utility class is in scope.

Where Does DeveloperNames() Belong?

The reduce chain refactoring calls for placing the resulting method (whether extension, static, or instance) in the same chain loop to which it originally belonged. If the original method chain was correct, this will lead to correct functional cohesion. If there is a problem, then the method should be moved, but there usually is an underlying problem to begin with.

In the case of DeveloperNames(), it started as a reduce chain on a LINQ to Objects query. Encapsulating has the following benefits:

1) makes it more declarative
2) makes it testable
3) makes it reusable with DRY

IEnumerable<Employee> represents any object that implements that interface… from an array to List<Employee> to a custom employee collection. A properly designed utility method would accept the most general interface, IEnumerable<Employee>, so any of those objects could be passed in.

Polluting IEnumerable<T> with T-Specific Extension Methods

That’s been done with LINQ to Objects. One should be careful with IEnumerable<T>, since everything can access it. Only the most general methods belong there, and most, as far as I’m aware, have already been covered. IEnumerable<[class]> should be readily extended though. The only caveat is that I would try to avoid extensions that cause side effects, if possible.

No Side Effects

The commenter states the IEnumerable<Employee> is broken, that the collection is too welded to the item. Since the code works fine, I can only assume that it is suggested the original collection is being modified.

The examples I used did not create side effects. Even after I refactored to the foreach version of DeveloperNames(), there were no side effects present, so the employee classes were not changed, and the employees instance was unaffected.

This is important, because one must be careful to only make deliberate side effects. In this example, no side effects were necessary to accomplish any task, so none were used.

var employees = new[] 
{
    new Employee{FirstName = "Chris", LastName = "Eargle", 
        Role = Role.Developer},
    new Employee{FirstName = "Mr.", LastName = "Scrooge",
        Role = Role.Manager},
    new Employee{FirstName = "Steve", LastName = "Andrews",
        Role = Role.Developer}
};

Console.WriteLine("Developer Names");
Console.WriteLine(String.Empty.PadRight(15, '-'));
foreach (var name in employees.DeveloperNames())
{
    Console.WriteLine(name);
}

Console.WriteLine("\nAll Employees");
Console.WriteLine(String.Empty.PadRight(13, '-'));
foreach (var employee in employees)
{
    Console.WriteLine(employee.FullName);
}

I ran the code with both versions of the extension method (which you can get from the original article), and the results were the same.

Developer Names
---------------
Steve Andrews
Chris Eargle

All Employees
-------------
Chris Eargle
Mr. Scrooge
Steve Andrews

I never figured out what was meant by the Role being removed. The original LINQ statement returned an IEnumerable<string>, and you would expect something similar from a method named DeveloperNames().

But still, I don’t like DeveloperNames()

My problem with DeveloperNames() isn’t for any of the reasons the commenter posted. I only used it as an example of encapsulation to maintain declarative code whether one was using LINQ or imperative code. If I were truly designing this, I believe I would make the syntax as natural as possible. I would prefer to call employees.Developers().Names(). This would also give me the ability to call employees.Names() to get the full list of names. Let’s try that out.

public static class EnumerableEmployee
{
    public static IEnumerable<Employee> Developers(this IEnumerable<Employee> employees)
    {
        return employees.Where(e => e.Role == Role.Developer);
    }

    public static IEnumerable<string> Names(this IEnumerable<Employee> employees)
    {
        return employees.Select(e => e.FullName);
    }
}

Here’s the calling code.

Console.WriteLine("Developer Names");
Console.WriteLine(String.Empty.PadRight(15, '-'));
foreach (var name in employees.Developers().Names())
{
    Console.WriteLine(name);
}

Console.WriteLine("\nAll Employees");
Console.WriteLine(String.Empty.PadRight(13, '-'));
foreach (var name in employees.Names())
{
    Console.WriteLine(name);
}

The names are no longer in order. This can be remedied by making names always be in order, or developers always be in order. But which is it? I suppose it depends on the rules of the system.

Wait, There’s a Chain

That is right. I suppose after reducing the chain, we’ve now expanded the chain to give more flexibility. I would reduce the chain, or provide the reduced version, if an ordered Developers().Names() was frequently required.
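One way the reduced, ordered version could sit alongside the flexible pieces is sketched below. It reuses the Developers() and Names() methods from above and adds DeveloperNames() back as a composition of them. The Employee and Role definitions are assumptions about the sample types, and EmployeeChainDemo is a made-up name for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape; only Developer and Manager appear in the article.
public enum Role { Developer, Manager, Tester }

public class Employee
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public Role Role { get; set; }

    // Assumed format, based on the console output shown in the article.
    public string FullName { get { return FirstName + " " + LastName; } }
}

public static class EnumerableEmployee
{
    public static IEnumerable<Employee> Developers(this IEnumerable<Employee> employees)
    {
        return employees.Where(e => e.Role == Role.Developer);
    }

    public static IEnumerable<string> Names(this IEnumerable<Employee> employees)
    {
        return employees.Select(e => e.FullName);
    }

    // Reduced chain: ordering is applied between the filter and the projection.
    public static IEnumerable<string> DeveloperNames(this IEnumerable<Employee> employees)
    {
        return employees.Developers().OrderBy(e => e.LastName).Names();
    }
}

public static class EmployeeChainDemo
{
    // Hypothetical helper that runs the reduced chain over the sample data.
    public static string[] OrderedDeveloperNames()
    {
        var employees = new[]
        {
            new Employee { FirstName = "Chris", LastName = "Eargle", Role = Role.Developer },
            new Employee { FirstName = "Mr.", LastName = "Scrooge", Role = Role.Manager },
            new Employee { FirstName = "Steve", LastName = "Andrews", Role = Role.Developer }
        };

        return employees.DeveloperNames().ToArray();
    }
}
```

With the sample employees this yields Steve Andrews followed by Chris Eargle, the same order as the original DeveloperNames(), while Developers() and Names() stay available on their own.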

Conclusion

There is nothing wrong with writing extension methods on IEnumerable<[class]>. As with all code, be deliberate in your design so that it is semantically viable. Code should be natural to write. Avoid side effects to avoid unnecessary bugs. If you are designing a framework, make it a joy to use.


Dyadic Map and Higher in .NET 4

by KodefuGuru 6. May 2010 08:10

.NET has included monadic map functionality since the release of LINQ in .NET 3.5. You may know this by the extension method on IEnumerable<T>, Select. This allows you to project from one sequence to another. However, that’s as far as it went. Applying a map on multiple lists was not available in the core framework. That has changed in .NET 4 with the new Zip method.

Zip gives you the ability to map two sequences into one sequence; in effect, zipping it up. Here’s an example:

var ids = new[] { 1, 2, 3 };
var firstNames = new[] { "George", "John", "Thomas" };
var presidents = ids.Zip(firstNames, (i, f) => Tuple.Create(i, f));

I first created an array of numbers, then an array of strings. Then, I zipped them up to create pairs. Here’s the output when the pairs are written to the console:

(1, George)
(2, John)
(3, Thomas)
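One behavior worth knowing before relying on Zip is that it stops as soon as either sequence runs out. The short sketch below pairs four ids with three names; ZipLengthDemo and PairUp are names invented for the illustration.

```csharp
using System;
using System.Linq;

public static class ZipLengthDemo
{
    // Zips a longer sequence with a shorter one; the extra id is dropped.
    public static string[] PairUp()
    {
        var ids = new[] { 1, 2, 3, 4 };
        var firstNames = new[] { "George", "John", "Thomas" };

        return ids.Zip(firstNames, (i, f) => i + ": " + f).ToArray();
    }
}
```

The result has three elements, ending with "3: Thomas". The fourth id never appears, and no exception is thrown.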

One thing I find disappointing is that there are no overloads provided for Zip. What if I need to map more than two sequences? It’s fairly trivial to implement, and I believe it should have been included in the .NET framework. For now, we will have to settle with creating our own.

public static class Enumberable
{
    public static IEnumerable<TResult> Zip<TFirst, TSecond, TThird, TResult>
        (this IEnumerable<TFirst> first, IEnumerable<TSecond> second, 
            IEnumerable<TThird> third, Func<TFirst, TSecond, TThird, TResult> selector)
    {
        if (first == null)
        {
            throw new ArgumentNullException("first");
        }
        if (second == null)
        {
            throw new ArgumentNullException("second");
        }
        if (third == null)
        {
            throw new ArgumentNullException("third");
        }
        if (selector == null)
        {
            throw new ArgumentNullException("selector");
        }

        using (var firstEnumerator = first.GetEnumerator())
        using (var secondEnumerator = second.GetEnumerator())
        using (var thirdEnumerator = third.GetEnumerator())
        {

            while (firstEnumerator.MoveNext() && secondEnumerator.MoveNext()
                && thirdEnumerator.MoveNext())
            {
                yield return selector(firstEnumerator.Current, secondEnumerator.Current,
                    thirdEnumerator.Current);
            }
        }
    }
}

This creates a Zip extension utilizing three sequences. If you need more, just copy the code and add appropriate code for appropriate arguments. **EDIT** I forgot to add using clauses.

With this version, I will create an anonymous type to name the properties instead of using a tuple.

var ids = new[] { 1, 2, 3 };
var firstNames = new[] { "George", "John", "Thomas" };
var lastNames = new[] { "Washington", "Adams", "Jefferson" };
var presidents = ids.Zip(firstNames, lastNames,
    (i, f, l) => new { Id = i, FirstName = f, LastName = l });

Here is the output when the presidents are written to the console.

{ Id = 1, FirstName = George, LastName = Washington }
{ Id = 2, FirstName = John, LastName = Adams }
{ Id = 3, FirstName = Thomas, LastName = Jefferson }

As with Select, you are not limited to anonymous types or tuples. This functionality is useful when reading data from disparate sources to create sequences of objects.
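If adding a three-sequence overload is not an option, a similar result can be had by chaining the built-in two-sequence Zip, at the cost of an intermediate tuple per element. This is only a sketch of that alternative, not the approach taken above; ChainedZipDemo and Presidents are hypothetical names.

```csharp
using System;
using System.Linq;

public static class ChainedZipDemo
{
    // Zips ids with first names, then zips that result with last names.
    public static string[] Presidents()
    {
        var ids = new[] { 1, 2, 3 };
        var firstNames = new[] { "George", "John", "Thomas" };
        var lastNames = new[] { "Washington", "Adams", "Jefferson" };

        return ids.Zip(firstNames, (i, f) => Tuple.Create(i, f))
                  .Zip(lastNames, (pair, l) => pair.Item1 + " " + pair.Item2 + " " + l)
                  .ToArray();
    }
}
```

The first element is "1 George Washington". The custom overload reads better once more than two sequences are involved, since the intermediate Item1 and Item2 names carry no meaning.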


LINQ is Better Than ForEach

by KodefuGuru 4. May 2010 18:37

I had a discussion today with a software architect who disagreed with me that LINQ to Objects should be used instead of foreach loops. My claim is that LINQ is better. He says that I shouldn’t make such a blanket statement, because LINQ is inefficient. I stand by my assertion.

Declarative Code is Easier to Read

From the initial development point of view, writing what you’re doing rather than how you’re doing it is more succinct, easier to read, and easier for others to maintain. Even if you don’t know LINQ, which is easier to decipher?

var developerNames = employees.Where(e => e.Role == Role.Developer)
                              .OrderBy(e => e.LastName)
                              .Select(e => e.FullName)
                              .ToArray();

or the 2.0 List<T> way

public class EmployeeLastNameComparer : IComparer<Employee>
{
    public int Compare(Employee x, Employee y)
    {
        return x.LastName.CompareTo(y.LastName);
    }
}

var employeeList = new List<Employee>();
employeeList.AddRange(employees);
employeeList.Sort(new EmployeeLastNameComparer());
var names = new List<string>();

foreach (var employee in employeeList)
{
    if (employee.Role == Role.Developer)
    {
        names.Add(employee.FullName);
    }
}

var developerNames = names.ToArray();

I much prefer reading declarative code to iterative code. But the architect’s concern about performance is still valid. Let’s check the numbers. Using Rex, I will create an employees array with a thousand members.

string regexName = @"^[A-Z][a-z]+$";
RexSettings nameSettings = new RexSettings(regexName) 
{ 
    k = 1000, 
    encoding = CharacterEncoding.ASCII 
};

var firstNames = RexEngine.GenerateMembers(nameSettings);
var lastNames = RexEngine.GenerateMembers(nameSettings);
Random randomRole = new Random();            
var employees = firstNames.Zip(lastNames, 
                (f, l) => new Employee 
                            { 
                                FirstName = f, 
                                LastName = l, 
                                Role = (Role)randomRole.Next(3) 
                            })
                .ToArray();

I then timed both versions of the code using a Stopwatch instance. Here are the results for the List<T> version.

00:00:00.0027104
00:00:00.0026925
00:00:00.0028171
00:00:00.0027148
00:00:00.0027858

And now the LINQ version.

00:00:00.0019929
00:00:00.0019156
00:00:00.0019871
00:00:00.0018066
00:00:00.0019116

Okay, clearly the LINQ version is optimized because the filter is occurring before the sort. It’s time to make the iterative code even uglier to squeeze some performance out of it (and remember, the LINQ version is functionally complete and rather clean).

var employeeList = new List<Employee>();

foreach (var employee in employees)
{
    if (employee.Role == Role.Developer)
    {
        employeeList.Add(employee);
    }
}

employeeList.Sort(new EmployeeLastNameComparer());
var names = new List<string>();

foreach (var employee in employeeList)
{
    names.Add(employee.FullName);
}

var developerNames = names.ToArray();

Now that I’ve optimized the code, the results are much better.

00:00:00.0014110
00:00:00.0013445
00:00:00.0013484
00:00:00.0016516
00:00:00.0013563

Is it really worth that mess to squeeze out a little bit of time? It really depends on your application. But as you’ll see, that’s still not an excuse. Besides, if you really cared so much about performance, you would use arrays and write your own, optimized sort methods. If you must write those applications, you’re probably using C on an embedded device and this posting is moot. For business applications, readability trumps premature optimization.
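For anyone who wants to reproduce timings like the ones above, a harness along these lines would do. It is a sketch under assumptions: the Timing class and its Measure method are invented here, and the original measurements may have been gathered differently.

```csharp
using System;
using System.Diagnostics;

public static class Timing
{
    // Hypothetical helper: runs the action the given number of times
    // and returns the elapsed time of each run.
    public static TimeSpan[] Measure(Action action, int runs)
    {
        var results = new TimeSpan[runs];

        for (int i = 0; i < runs; i++)
        {
            var stopwatch = Stopwatch.StartNew();
            action();
            stopwatch.Stop();
            results[i] = stopwatch.Elapsed;
        }

        return results;
    }
}
```

Passing a lambda that wraps either version of the code, for example Timing.Measure(() => employees.DeveloperNames().ToArray(), 5), gives five samples to compare. The first run usually includes JIT compilation, so it is reasonable to discard it.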

Take Advantage of the Hardware

It’s much easier to take advantage of the multiple cores so prevalent in today’s computers using LINQ. On .NET 4 (or using the Parallel Extensions with 3.5), it is as simple as adding an extension method or two.

var developerNames = employees.AsParallel().AsOrdered()
                              .Where(e => e.Role == Role.Developer)
                              .OrderBy(e => e.LastName)
                              .Select(e => e.FullName)
                              .ToArray();

Again, I must warn against premature optimization. Due to the speed of the original statement, the overhead is not worth the cost. It will actually make the routine slower. However, if I add a Thread.Sleep(10) to the getter of Employee.FullName, the difference is 5 seconds without the AsParallel() option to 2.5 seconds with it. Needless to say, optimizing the ForEach version to take advantage of the hardware isn’t as elegant. Maintaining order requires using a specific overload of Parallel.ForEach. This situation is easy since we can use an array, but do not doubt that it requires much work in many situations. Here is the piece of the code that needs to be optimized.

string[] names = new string[employeeList.Count];            

Parallel.ForEach(employeeList, (employee, loopState, elementIndex) =>
{
    names[elementIndex] = employee.FullName;
});

What If LINQ Really Doesn’t Work In My Situation?

The important thing isn’t whether or not you use LINQ; the important thing is to have readable code. Stating what you’re doing is more maintainable than how you’re doing it. Encapsulation is key. I feel it’s best to start with the LINQ statement, then optimize if necessary. Here are the steps to squeeze out the milliseconds by going from the LINQ version of the code above to the iterative version while hiding the complexity, thereby maintaining readability.

The first thing that should be done is to use the reduce chain refactoring.

public static class EnumerableEmployee
{
    public static IEnumerable<string> DeveloperNames(this IEnumerable<Employee> employees)
    {
        return employees.Where(e => e.Role == Role.Developer)
                        .OrderBy(e => e.LastName)
                        .Select(e => e.FullName);
    }
}

Then call the method with the following.

var developerNames = employees.DeveloperNames().ToArray();

Of course, with a name like that, why do you even need a variable? I love taking a piece of code and making it express the essence of what it is.

Since the implementation for DeveloperNames() is encapsulated in the extension method, it’s a simple matter to change that implementation for the iterative version of the code we were using earlier.

public static class EnumerableEmployee
{
    public static IEnumerable<string> DeveloperNames(this IEnumerable<Employee> employees)
    {
        var employeeList = new List<Employee>();

        foreach (var employee in employees)
        {
            if (employee.Role == Role.Developer)
            {
                employeeList.Add(employee);
            }
        }

        employeeList.Sort(new EmployeeLastNameComparer());
        var names = new List<string>();

        foreach (var employee in employeeList)
        {
            names.Add(employee.FullName);
        }

        return names;
    }
}

Conclusion

The vast majority of code I come across that either can be written in LINQ or refactored to LINQ has no noticeable, negative performance impact, but it has a positive impact on maintainability. On top of that, LINQ statements can be made to scale with the hardware more easily, and in a more readable manner, than a collection of iterative statements. LINQ statements should still be refactored for even further readability, and by encapsulating the implementation, it can be replaced with iterative code while hiding said code’s complexity.


Refactoring Partition List

by KodefuGuru 20. December 2009 21:24

I came across an article on Visual C# Kicks that describes how to partition a list. The author’s goal is to split a list of elements into an array of smaller lists, and the author succeeds at that. Unfortunately, the code appears to be designed for C# 2.0 instead of C# 3.0. I’m going to examine the second version of the method the author posted, explain the problems I have with it, then refactor it to be fluent. Please note that I made two modifications to the source: renamed the method from Partition2 to Partition and fixed a typo.

public static List<T>[] Partition<T>(List<T> list, int size)
{
    if (list == null)
        throw new ArgumentNullException("list");

    if (size < 1)
        throw new ArgumentOutOfRangeException("totalPartitions");

    int count = (int)Math.Ceiling(list.Count / (double)size);
    List<T>[] partitions = new List<T>[count];

    int k = 0;
    for (int i = 0; i < partitions.Length; i++)
    {
        partitions[i] = new List<T>(size);
        for (int j = k; j < k + size; j++)
        {
            if (j >= list.Count)
                break;
            partitions[i].Add(list[j]);
        }
        k += size;
    }

    return partitions;
}

The most glaring problem is that it returns a List<T>. Okay, actually, it’s returning an array of List<T>, but the problem is still the same. List<T> is much too specific a type. Unfortunately, we as developers became dependent on them between 2005 and 2008 because they were just so useful. If you’re designing a method for reusability, it is better to lean on the most general interface available. That interface is IEnumerable<T>.

The second problem is actually part of the first: it returns an array. We shouldn’t tie this to an array and instead make it IEnumerable<T> as well.

The third problem is that this method is not fluent at all. I assume this method sits in a helper class somewhere, but there is no extension made. So, we’re left calling a method on the helper class.

var partitions = ListHelper.Partition(list, 5);

If I want to partition some list, I want to call the partition method on the list. So, that’s how I’m going to refactor this: move it to IEnumerable<T> and make it an extension method.

The great news is that all that logic can be cleaned up by using LINQ. Here’s how I refactored the code.

public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> source, int size)
{
    if (source == null)
        throw new ArgumentNullException("source");

    if (size < 1)
        throw new ArgumentOutOfRangeException("size");

    int index = 1;
    IEnumerable<T> partition = source.Take(size).AsEnumerable();

    while (partition.Any())
    {
        yield return partition;
        partition = source.Skip(index++ * size).Take(size).AsEnumerable();
    }
}

You can now access the Partition method directly from any IEnumerable<T>. In addition, it is no longer returning an array. If you really desire an array, you can use the ToArray() method like any other IEnumerable<T> with LINQ.

var partitions = list.Partition(5).ToArray();
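One trade-off of the Skip/Take version is that every partition re-enumerates the source from the beginning, and because the method is an iterator, the argument checks do not run until the result is enumerated. A single-pass variant is sketched below as an alternative, not as the refactoring described above. The names PartitionBuffered, PartitionIterator, and PartitionDemo are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PartitionExtensions
{
    // Validates eagerly, then hands off to the iterator so that
    // bad arguments fail at the call site rather than on first enumeration.
    public static IEnumerable<IEnumerable<T>> PartitionBuffered<T>(this IEnumerable<T> source, int size)
    {
        if (source == null)
            throw new ArgumentNullException("source");

        if (size < 1)
            throw new ArgumentOutOfRangeException("size");

        return PartitionIterator(source, size);
    }

    // Walks the source once, buffering one partition at a time.
    private static IEnumerable<IEnumerable<T>> PartitionIterator<T>(IEnumerable<T> source, int size)
    {
        var bucket = new List<T>(size);

        foreach (var item in source)
        {
            bucket.Add(item);

            if (bucket.Count == size)
            {
                yield return bucket;
                bucket = new List<T>(size);
            }
        }

        // Emit the final, shorter partition if anything is left over.
        if (bucket.Count > 0)
        {
            yield return bucket;
        }
    }
}

public static class PartitionDemo
{
    // Hypothetical helper: renders the partitions of 1..7 in groups of 3.
    public static string Describe()
    {
        var partitions = Enumerable.Range(1, 7).PartitionBuffered(3);
        return string.Join("|", partitions.Select(p => string.Join(",", p)));
    }
}
```

PartitionDemo.Describe() returns "1,2,3|4,5,6|7". The cost is that each partition is buffered in a list, so the fully lazy behavior of the Skip/Take version is given up in exchange for a single pass over the source.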

That refactoring was fun. If you have anything you’d like me to take a look at, send me a message through the contact form.

Edit: This is a minor change, but after thinking about it I decided that starting the index at 1 and using index++ was more readable than starting the index at 0 and using ++index.


Why You Should Use the Null Coalescing Operator

by KodefuGuru 20. December 2009 02:49

Introduced in C# 2.0, the null coalescing operator (??) allows one to program in a concise, declarative fashion when performing null checks. This was important in 2005 as it made it easier to utilize another important feature: nullable types.

Nullable types solved the problem of representing a value type that also contained a null value. This representation of value types is common in databases, and oftentimes caused problems when doing relational to object mapping.

Nullable types are generic structs that can be declared as Nullable<T>. However, they are often used with the value type and a question mark, e.g. int?. Here’s an example of how to use a nullable type and the null coalescing operator.

int number = nullable ?? default(int);

The null coalescing operator isn’t necessary, however. As I mentioned above, it makes your code more concise, and declarative. There are two other ways to do this.

The imperative way:

int number = default(int);

if (nullable.HasValue)
{
    number = nullable.Value;
}

The longer, declarative way:

int number = nullable.HasValue ? nullable.Value : default(int);

The latter example is too long. The null coalescing operator achieves the same result with much less code, so it is preferable.

The imperative example will elicit some calls for support. I’ve heard the argument from several people before: “developers don’t understand this ?? thing, everyone understands if statements.” If that’s true in your organization, I would recommend teaching it to them. Start using the operator so they get used to it. If you keep providing them imperative crutches, they will continue to use them. No one learned standard object oriented programming overnight either. To continue pushing imperative code over reasonable declarative alternatives is akin to writing 150 line methods because that’s the way you did it in C.

So why am I writing about the null coalescing operator now, so many years after it came into being? The answer is that it can be combined with newer language features to make your code even more powerful.

The null coalescing operator with object initializers

In this example, I broke out some old school ADO.NET. The AdventureWorks database has a WorkOrder table with an EndDate column that allows nulls.

public static WorkOrder GetWorkOrderById(int id)
{
    using (var connection = 
        new SqlConnection
            (ConfigurationManager.ConnectionStrings["AdventureWorks"].ConnectionString))
    {
        connection.Open();
        var command = new SqlCommand(Queries.GetWorkOrderById, connection);
        command.Parameters.Add(new SqlParameter("@WorkOrderId", id));
        var reader = command.ExecuteReader();

        if (!reader.HasRows)
        {
            throw new ApplicationException(Exceptions.WorkOrderNotFound);
        }

        reader.Read();
        return new WorkOrder
        {
            Id = reader.GetInt32(0),
            Quantity = reader.GetInt32(1),
            Stock = reader.GetInt32(2),
            Scrapped = reader.GetInt16(3),
            StartDate = reader.GetDateTime(4),
            EndDate = reader[5] as DateTime? ?? DateTime.MaxValue,
            DueDate = reader.GetDateTime(6),
            ModifiedDate = reader.GetDateTime(7)
        };
    }
}

Unless the underlying schema changes, we will have a value for most of the columns. EndDate allows nulls in the database, but the WorkOrder object is straight DateTime in my sample application. The WorkOrder object has its values set in an object initializer, which requires a declarative syntax for all property initializations. To solve this in the simplest way possible, I used the null coalescing operator to make the end date the maximum possible value if the database has a null value.

The null coalescing operator with LINQ statements

LINQ is declarative by design, which makes it a good partner with the null coalescing operator where appropriate. Although you may never use it with a predicate (unless it’s a bool?), it is possible to use it with a selector.

int?[] dirtyNumbers = new int?[] { 1, null, 2, null, 3, 4, null, 5 };
var numbers = dirtyNumbers.Select(n => n ?? default(int));

You can extend this further to make null have a meaning, such as in this function, where null becomes the adjacent number in the aggregate function.

var values = new double?[] { 1.01, null, 1.24, 1.53, null, 1.93, 1.12 };
var value = values.Aggregate((x, y) => (x ?? y) * (y ?? x));

A more common scenario is that you want to retrieve an element out of an enumerable. This is typically the first element, or perhaps it’s a single element. In any case, if it’s not found, you must retrieve it from elsewhere.

var workOrder = workOrders.SingleOrDefault(w => w.Id == id) ?? GetWorkOrderById(id);

I still see so many pieces of code where someone uses FirstOrDefault, or SingleOrDefault but leaves off the null coalescing operator. Instead, the programmer turns to imperative logic to handle the variable being null. It is important to point out that the null coalescing operator is useful for more than nullable types; it is useful for all reference types.
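To illustrate the point about reference types, here is a small sketch in which the operator is chained so that the first non-null string wins. DisplayNameDemo and DisplayName are hypothetical names, and the "(unknown)" fallback is just a placeholder chosen for the example.

```csharp
using System;

public static class DisplayNameDemo
{
    // Evaluated left to right: nickname if present, otherwise fullName,
    // otherwise a fixed placeholder.
    public static string DisplayName(string nickname, string fullName)
    {
        return nickname ?? fullName ?? "(unknown)";
    }
}
```

DisplayName(null, "Chris Eargle") returns "Chris Eargle", and DisplayName(null, null) returns "(unknown)". The imperative equivalent needs two nested null checks to say the same thing.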


Any() versus Count()

by KodefuGuru 7. December 2009 18:20

Jimmy Bogard brought to my attention today that I had been doing something wrong all along. I’ve been using Count() on my enumerations when I should have been using Any(). I’ve even done this in a presentation I’ve been giving the past few months. This is from Mash Up.

if (businesses.Count() == 0)
{
    return PartialView("Message", new MessageViewData { Header = "No Results Found", 
        Message = "We didn't find anything near your location." });
}
else
{
    return View(new ResultsViewData { Geolocation = geolocation, Businesses = businesses });
}

The problem, as Jimmy pointed out in his Twitter feed, is that Count() iterates the entire enumeration (unless it is an ICollection<T>, as a commenter confirmed with Reflector). Any(), on the other hand, returns as soon as it finds the first element. This should be rewritten.

if (businesses.Any())
{
    return View(new ResultsViewData { Geolocation = geolocation, Businesses = businesses });
}
else
{
    return PartialView("Message", new MessageViewData { Header = "No Results Found", 
        Message = "We didn't find anything near your location." });
}

The reason I used Count() is quite simple: I was too used to using the Count property on List<T>. I never bothered to look if there was a better way of doing this.
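If you want to see the difference for yourself, here is a small sketch with a lazy sequence that records how many elements it has produced. The class, the counter, and the method names are all invented for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyVersusCount
{
    // Number of elements the lazy sequence has produced so far.
    public static int Produced;

    // An iterator, so it is not an ICollection<T> and must be enumerated.
    public static IEnumerable<int> Numbers(int count)
    {
        for (int i = 0; i < count; i++)
        {
            Produced++;
            yield return i;
        }
    }

    // Checks for emptiness with Count() and reports how many elements were produced.
    public static int ProducedByCount()
    {
        Produced = 0;
        bool isEmpty = Numbers(1000).Count() == 0;
        return Produced;
    }

    // Checks for emptiness with Any() and reports how many elements were produced.
    public static int ProducedByAny()
    {
        Produced = 0;
        bool isEmpty = !Numbers(1000).Any();
        return Produced;
    }

    public static void Main()
    {
        Console.WriteLine("Count() produced {0} elements", ProducedByCount()); // 1000
        Console.WriteLine("Any() produced {0} elements", ProducedByAny());     // 1
    }
}
```

With a List<T> the gap disappears because Count() can read the Count property directly, but for a query or an iterator the whole sequence gets walked.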

Just as Count() can take a predicate as a parameter, so can Any(). This is useful because it avoids chaining an unnecessary Where.

if (primes.Any(p => primes.Contains(p + 4)))
{
    Console.WriteLine("Prime numbers provided contains Cousin primes");
}

The scenario in which you should use Count() is when you need to know the actual count.

var cousins = primes.Count(p => primes.Contains(p + 4));
Console.WriteLine("{0} primes provided have cousins", cousins);
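Here is a self-contained sketch of the cousin primes checks, assuming a small hand-picked array of primes. The class and method names are only for illustration. It also shows the Where chain that the predicate overload of Any() makes unnecessary.

```csharp
using System;
using System.Linq;

public static class CousinPrimes
{
    // Any() with a predicate stops at the first prime that has a cousin.
    public static bool HasCousins(int[] primes)
    {
        return primes.Any(p => primes.Contains(p + 4));
    }

    // Same answer, but with a Where chained in front of Any().
    public static bool HasCousinsViaWhere(int[] primes)
    {
        return primes.Where(p => primes.Contains(p + 4)).Any();
    }

    // Count() with a predicate, for when the actual number matters.
    public static int CountCousins(int[] primes)
    {
        return primes.Count(p => primes.Contains(p + 4));
    }

    public static void Main()
    {
        var primes = new[] { 3, 7, 11, 13, 17, 19, 23 };

        Console.WriteLine(HasCousins(primes));         // prints True
        Console.WriteLine(HasCousinsViaWhere(primes)); // prints True
        // 3, 7, 13 and 19 each have p + 4 in the array.
        Console.WriteLine(CountCousins(primes));       // prints 4
    }
}
```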

Use the method that fits the job. I know I will be paying more attention to Any() from now on.

What is this Microsoft StreamInsight thing?

by KodefuGuru 1. December 2009 18:06

It appears that I skipped over a booth at PDC09, because I had no clue what StreamInsight was when I came across the November CTP today. Of course, I downloaded it.

Microsoft® SQL Server® StreamInsight is a platform for the continuous and incremental processing of unending sequences of events (event streams) from multiple sources with near-zero latency. These requirements, shared by vertical markets such as manufacturing, oil and gas, utilities, financial services, health care, web analytics, and IT and data center monitoring, make traditional store and query techniques impractical for timely and relevant processing of data.
StreamInsight allows software developers to create innovative solutions in the domain of Complex Event Processing that satisfy these needs. It allows to monitor, mine, and develop insights from continuous unbounded data streams and correlate constantly changing events with rich payloads in near real time. Industry specific solution developers (ISVs) and developers of custom applications have the opportunity to innovate on and utilize proven, flexible, and familiar Microsoft technology and rely on existing development skills when using the StreamInsight platform.

The description seemed interesting enough, and it definitely seems useful for my company. But most importantly, does it work with twitter? Twitter… Unbounded data stream… that will be a great demo.

After installing the CTP, I looked for demos to describe how to use it. It installed code samples in %programfiles%\Microsoft StreamInsight November CTP\Samples\StreamInsightSamples.zip. Unzip it for a .NET 3.5 solution with samples in C#. There aren’t scripts as in a training kit, but the samples do show off key parts of the libraries (Microsoft.ComplexEventProcessing) they shipped.

If you’re interested in more information, view the sessions on StreamInsight from PDC 09. Torsten Grabs presents Introduction to Microsoft SQL Server 2008 R2 StreamInsight. Roman Schindlauer and Beysim Sezgin cohost Advanced Microsoft SQL Server 2008 R2 StreamInsight.

IEnumerable<T>.ToDictionary()

by KodefuGuru 30. November 2009 23:37

When retrieving data from web services, it’s sometimes obvious it was meant to be represented as a dictionary. In my coworker’s case earlier today, he was getting back a simple class from a web service written in Java (don’t blame me for the exposed fields).

public class KeyValuePairType
{
    public string Key;
    public string Value;
}

He wanted to access the values as nature (err, C#) intended it: Dictionary<string, string>. Since the data was being returned as an array of KeyValuePairTypes with unique keys, this was quite simple with LINQ.

var dataDictionary = additionalData.ToDictionary(k => k.Key, v => v.Value);

This is a great way to create a dictionary out of a key/value pair scenario, but the syntax is even easier for your keyed business objects. For example, assume you have an enumerable of clients with an Id property. You want to have a dictionary of these clients so you can easily access them without the hassle of tons of LINQ statements.

var clientDictionary = clients.ToDictionary(c => c.Id);

The definition of clientDictionary is Dictionary<int, Client>, allowing you easy access to the client values.
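As a minimal sketch of that overload in use, assume a hypothetical Client class with just an Id and a Name. The class, the sample data, and the ClientLookup helper are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical business object; only the Id property matters here.
public class Client
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class ClientLookup
{
    // The key comes from the selector; the element itself becomes the value.
    public static Dictionary<int, Client> Build(IEnumerable<Client> clients)
    {
        return clients.ToDictionary(c => c.Id);
    }

    public static void Main()
    {
        var clients = new[]
        {
            new Client { Id = 7, Name = "Ada" },
            new Client { Id = 9, Name = "Grace" }
        };

        var clientDictionary = Build(clients);
        Console.WriteLine(clientDictionary[9].Name); // prints Grace
    }
}
```

Keep in mind that ToDictionary throws an ArgumentException if two elements produce the same key, so this only fits data whose keys really are unique.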

The other two overloads of the ToDictionary method take an IEqualityComparer<TKey>, either one you implement yourself or an existing one such as StringComparer.OrdinalIgnoreCase. This comparer is used to determine if the key is already in the dictionary. EqualityComparer<T>.Default is used in the previous overloads.
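Here is a short sketch of the comparer overload, assuming you want the keys from the web service to be case-insensitive. It reuses the KeyValuePairType shape from above and passes the built-in StringComparer.OrdinalIgnoreCase; the sample data and the CaseInsensitiveLookup helper are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class KeyValuePairType
{
    public string Key;
    public string Value;
}

public static class CaseInsensitiveLookup
{
    // The third argument decides when two keys count as the same key.
    public static Dictionary<string, string> Build(IEnumerable<KeyValuePairType> data)
    {
        return data.ToDictionary(k => k.Key, v => v.Value, StringComparer.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        var additionalData = new[]
        {
            new KeyValuePairType { Key = "Color", Value = "red" },
            new KeyValuePairType { Key = "Size", Value = "large" }
        };

        var dataDictionary = Build(additionalData);
        Console.WriteLine(dataDictionary["COLOR"]); // prints red
    }
}
```

The same comparer also governs duplicate detection, so "Color" and "COLOR" in the source data would now collide and throw.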

KodefuGuru.GetInfo()

Chris Eargle
C# MVP, INETA Community Champion


Disclaimer

The opinions expressed herein are my own personal opinions and do not represent my employer's view in any way.

© Copyright 2010