header image
 

Be wary of false LINQ “optimizations”

LINQ in C# is great. It often simplifies things. But at what cost?

Suppose we have a simple task: to iterate over a collection and do something on elements matching a specific predicate/condition:

foreach (var obj in list)
{
    if (!SomeCondition(obj))
        continue;
    DoSomething(obj);
}

It’s tempting to write the above loop in “simplified” form:

foreach (var obj in list.Where(x => SomeCondition(x))
{
    DoSomething(obj);
}

Resharper even suggests such changes by default. It must be good then, right? Not necessary. As always when things involve performance, measurement is the king.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

namespace CsTest
{
    // reference type to simplify field assignment in foreach
    class Num
    {
        public int x;
    }

    class Program
    {
        const int count = 100000;
        const int loops = 10000;

        static void Fun1(List<Num> list)
        {
            for (int loop = 0; loop < loops; loop++)
            {
                foreach (Num num in list)
                {
                    if (num.x < count / 2)
                        continue;
                    num.x = 1;
                }
            }
        }

        static void Fun2(List<Num> list)
        {
            for (int loop = 0; loop < loops; loop++)
            {
                foreach (Num num in list.Where(num => num.x >= count/2))
                {
                    num.x = 1;
                }
            }
        }

        static void Main(string[] args)
        {
            Random rnd = new Random();
            Stopwatch sw = new Stopwatch();
            List<Num> list1 = new List<Num>(count);

            for (int i=0; i<count; i++)
                list1.Add(new Num {x = rnd.Next(count)});
            List<Num> list2 = new List<Num>(list1);

            sw.Restart();
            Fun1(list1);
            sw.Stop();
            Console.WriteLine("Without LINQ: {0}", sw.Elapsed);

            sw.Restart();
            Fun2(list2);
            sw.Stop();
            Console.WriteLine("With LINQ: {0}", sw.Elapsed);
        }
    }
}

On my machine (x86, release) the results are:

Without LINQ: 00:00:06.4973107
With LINQ: 00:00:12.0859563

No matter how big or small loop counters are, LINQ version is consistently 2x slower than non-LINQ one. Surprise, eh?

It’s not that much of a surprise if you think about it. Where() operator essentially creates a temporary collection with elements that satisfy the predicate. So you not only pay the price of increased CPU load, but also increased number of allocated objects and increased GC pressure. Not a good thing, especially if that code runs once per frame or in other tight spots.

As always, take nothing for granted – observe and measure!

~ by omeg on August 19, 2011.

C#, code, performance

Leave a Reply