Optimizing Garbage Collector performance in your .NET application

Garbage collection (GC) statistics are among the main performance metrics for any .NET application. I won’t go into detail on how the GC works; there is plenty of material on that elsewhere. Below are some very basic facts:

  • The managed heap is organized into generations: Gen0, Gen1 and Gen2. Every object on the heap belongs to exactly one of them.
  • The garbage collector can collect generations separately (collecting a generation also collects all younger ones). If an object survives a collection (because it’s still referenced), it is promoted to the next older generation: from Gen0 to Gen1 and from Gen1 to Gen2. This is called “object aging”.
  • Gen0 and Gen1 collections are relatively fast; Gen2 collections are much slower.

This already gives us an idea of how to optimize GC performance: don’t keep references to objects longer than you have to. That way fewer objects survive a collection and get promoted to a slower generation.
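If you want to watch promotion happen, a minimal sketch along these lines should do. It relies only on the standard GC.GetGeneration, GC.Collect and GC.MaxGeneration members; the GenerationDemo class is a made-up name for illustration, and the exact generations you see may vary with the runtime and GC settings.

```csharp
using System;

public static class GenerationDemo
{
    // Records the generation of a live object before and after two forced collections.
    public static int[] Observe()
    {
        var survivor = new object();
        var generations = new int[3];

        generations[0] = GC.GetGeneration(survivor); // freshly allocated
        GC.Collect();                                // survivor is still referenced
        generations[1] = GC.GetGeneration(survivor);
        GC.Collect();
        generations[2] = GC.GetGeneration(survivor);

        GC.KeepAlive(survivor); // make sure the reference stays alive up to here
        return generations;
    }

    public static void Main()
    {
        int[] g = Observe();
        Console.WriteLine("{0} -> {1} -> {2}", g[0], g[1], g[2]);
        Console.WriteLine("Oldest generation: {0}", GC.MaxGeneration);
    }
}
```

Forcing collections with GC.Collect() is fine for an experiment like this, but it is not something to sprinkle over production code.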

Alright, but how do you actually check those GC statistics? There are a few ways. For a quick assessment you can use Process Explorer. Just double-click your process (managed processes have a yellow background), open the “.NET Performance” tab and select the “.NET CLR Memory” option. It’s really handy.


There are a few statistics that interest you:

  • # Gen 0/1/2 Collections – how many times the GC was invoked for a particular heap generation. Ideally you want everything to be collected in Gen0, but that’s rarely possible. A rule of thumb for good performance is 10 Gen0, 1 Gen1 and 0.1 Gen2 collections per second.
  • % Time in GC – how much time the CLR spends in GC code instead of in your own. The lower the better – aim for low single digits here.
  • Finalization Survivors – the number of objects that survived the last garbage collection only because they have a finalizer. Don’t use finalizers unless absolutely necessary! They cause your objects to be promoted at least once.

You can also see the managed heap’s size and other less important things.
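The collection counts can also be read from inside the process. Here is a rough sketch that turns them into per-second rates, so you can compare against the rule of thumb above. GC.CollectionCount and GC.MaxGeneration are standard APIs; the GcRateSampler class and its Sample method are names I made up for this example.

```csharp
using System;
using System.Threading;

public static class GcRateSampler
{
    // Returns collections per second for each generation, measured over 'interval'.
    // Note: a collection of an older generation also counts for the younger ones.
    public static double[] Sample(TimeSpan interval)
    {
        int generations = GC.MaxGeneration + 1;
        var before = new int[generations];
        for (int gen = 0; gen < generations; gen++)
            before[gen] = GC.CollectionCount(gen);

        Thread.Sleep(interval); // blocks the calling thread for the whole interval

        var rates = new double[generations];
        for (int gen = 0; gen < generations; gen++)
            rates[gen] = (GC.CollectionCount(gen) - before[gen]) / interval.TotalSeconds;
        return rates;
    }

    public static void Main()
    {
        // Background thread that produces garbage so there is something to measure.
        var worker = new Thread(() =>
        {
            while (true) GC.KeepAlive(new byte[1024]);
        });
        worker.IsBackground = true;
        worker.Start();

        double[] rates = Sample(TimeSpan.FromSeconds(1));
        for (int gen = 0; gen < rates.Length; gen++)
            Console.WriteLine("Gen{0}: {1:F1} collections/s", gen, rates[gen]);
    }
}
```

Since Sample sleeps on the calling thread, in a real application you would probably call it from a background or diagnostics thread rather than from the render loop.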

So, what can we deduce from the screenshot above? You can’t see it there, but the Gen0 collection rate is around 30 per second, and Gen1 about 5. That’s certainly not optimal. There are also consistently several thousand objects that survive collections – that’s outright BAD. What can we do about that? Meet CLR Profiler.

CLR Profiler is a free download from Microsoft. It can provide detailed statistics about CLR memory management. It also draws pretty graphs.

After extracting the files and launching CLRProfiler.exe you see this:

Choose your CLR version and make sure to tick the “Profile: Allocations” option. Then select Start Application and browse for your executable. Note: you might want to set the working directory in the File – Set Parameters menu, or just copy the profiler into your app’s folder. The profiler will launch your application. When you are done, just close the application. The profiler will need a while to parse the collected data and will then show a window similar to this:

Now we can start the real investigation. The Allocated bytes histogram shows a breakdown of allocated objects by size. But we want to know which objects are being promoted and cause unnecessary GCs. The Relocated objects histogram is a good indication, since objects mostly move around the heap when they are being promoted:

Objects finalized shows which types had their finalizers run (and therefore waited unnecessarily long for collection). Typically these are types with an explicit finalizer, or IDisposable types on which no one called Dispose(), so their finalizer was never suppressed.
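To illustrate the difference, here is a small sketch of a type with a finalizer whose Dispose() calls GC.SuppressFinalize. NativeWrapper and FinalizerDemo are hypothetical names, not SFML types; the point is only that disposed instances never reach the finalizer, while forgotten ones do.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical wrapper around some unmanaged resource.
public sealed class NativeWrapper : IDisposable
{
    public static int FinalizerRuns; // how many instances went through finalization

    public void Dispose()
    {
        // A real type would release its unmanaged resource here.
        GC.SuppressFinalize(this); // the finalizer is no longer needed
    }

    ~NativeWrapper()
    {
        Interlocked.Increment(ref FinalizerRuns);
    }
}

public static class FinalizerDemo
{
    // Kept out of line so no reference to the wrappers outlives this call.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void Allocate(int count, bool dispose)
    {
        for (int i = 0; i < count; i++)
        {
            var wrapper = new NativeWrapper();
            if (dispose) wrapper.Dispose();
        }
    }

    // Returns how many finalizers ran for 'count' freshly allocated wrappers.
    public static int CountFinalizerRuns(int count, bool dispose)
    {
        int before = NativeWrapper.FinalizerRuns;
        Allocate(count, dispose);
        GC.Collect();
        GC.WaitForPendingFinalizers();
        return NativeWrapper.FinalizerRuns - before;
    }

    public static void Main()
    {
        Console.WriteLine("Disposed:  {0} finalizer runs", CountFinalizerRuns(1000, true));
        Console.WriteLine("Forgotten: {0} finalizer runs", CountFinalizerRuns(1000, false));
    }
}
```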

Right. We can clearly see that the biggest offenders come from SFML. Indeed, most SFML types implement IDisposable, since they often wrap unmanaged resources. I use them in the SFML GWEN renderer, so that’s the logical place to look. But we’ll be lazy and open CLRProfiler’s Allocation Graph:

It shows which code is responsible for allocations of a particular type and is the best thing since sliced bread. Let’s see who is responsible for leaking SFML.Graphics.View, for example. The right side of the graph shows allocated types – if the one you are interested in is not there, you can increase the graph detail (upper right). We see that the allocation chain (read backwards) is: SFML.Graphics.View – SFML.Graphics.RenderWindow::GetView – Gwen.Renderer.SFML::StartClip. Voilà, that is in our code:

public override void StartClip()
{
    Rectangle rect = ClipRegion;
    // OpenGL's coords are from the bottom left
    // so we need to translate them here.
    var v = m_Target.GetViewport(m_Target.GetView());
    rect.Y = v.Height - (rect.Y + rect.Height);
            
    Gl.glScissor(Global.Trunc(rect.X*Scale), Global.Trunc(rect.Y*Scale),
                    Global.Trunc(rect.Width*Scale), Global.Trunc(rect.Height*Scale));
    Gl.glEnable(Gl.GL_SCISSOR_TEST);
}

m_Target is a RenderWindow. There is a call to GetView() which produces a new View object:

public View GetView()
{
    return new View(sfRenderWindow_GetView(This));
}

View is IDisposable, but we don’t call Dispose() on it anywhere. Let’s fix that and see how it affects GC performance.

public override void StartClip()
{
    Rectangle rect = ClipRegion;
    // OpenGL's coords are from the bottom left
    // so we need to translate them here.
    var view = m_Target.GetView();
    var v = m_Target.GetViewport(view);
    rect.Y = v.Height - (rect.Y + rect.Height);
    view.Dispose();
            
    Gl.glScissor(Global.Trunc(rect.X*Scale), Global.Trunc(rect.Y*Scale),
                    Global.Trunc(rect.Width*Scale), Global.Trunc(rect.Height*Scale));
    Gl.glEnable(Gl.GL_SCISSOR_TEST);
}
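A using block is another way to write this kind of cleanup, and it also disposes the object if the code in between throws. Below is a self-contained sketch; FakeView is a hypothetical stand-in for a disposable wrapper like SFML’s View (it is not the real SFML type), and UsingDemo is likewise made up for illustration.

```csharp
using System;

// Hypothetical stand-in for a disposable wrapper such as SFML's View.
public sealed class FakeView : IDisposable
{
    public bool IsDisposed { get; private set; }
    public int Height { get { return 600; } }

    public void Dispose()
    {
        IsDisposed = true;
    }
}

public static class UsingDemo
{
    // Obtains a view, reads one value from it and disposes it,
    // even if reading the value throws.
    public static int ReadHeight(Func<FakeView> getView)
    {
        using (FakeView view = getView())
        {
            return view.Height;
        }
    }

    public static void Main()
    {
        var view = new FakeView();
        int height = ReadHeight(() => view);
        Console.WriteLine("Height: {0}, disposed: {1}", height, view.IsDisposed);
    }
}
```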

Already there is a noticeable decrease in Finalization Survivors and Gen1 collections. Let’s check the details with CLR Profiler.

Yep, we got rid of leaking View objects. Good job! Now we only need to eliminate other leaking objects and we’re set. After this is done, we won’t have survivors any more:

There is still room for improvement, but with Gen1 collection frequency below 1/second I’m content.

Whew, that took some time to write. I hope you find this “tutorial” useful. A few last words:

  • Always clean up IDisposable objects. Check which types the libraries you use return.
  • Don’t use finalizers unless your class directly owns some unmanaged handles. It’s not C++.
  • If your class owns IDisposable objects, implement IDisposable as well. And clean up after yourself!
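The last point can look something like the sketch below: a class that owns IDisposable objects, implements IDisposable itself and forwards Dispose() to what it owns. It has no finalizer because it holds no unmanaged handles directly. FrameLogger is a hypothetical class invented for this example; MemoryStream and StreamWriter are just convenient disposable types from the framework.

```csharp
using System;
using System.IO;

// Hypothetical class that owns two disposable objects.
public sealed class FrameLogger : IDisposable
{
    private readonly MemoryStream m_Buffer = new MemoryStream();
    private readonly StreamWriter m_Writer;
    private bool m_Disposed;

    public FrameLogger()
    {
        m_Writer = new StreamWriter(m_Buffer);
    }

    public void Log(string line)
    {
        if (m_Disposed) throw new ObjectDisposedException("FrameLogger");
        m_Writer.WriteLine(line);
    }

    public void Dispose()
    {
        if (m_Disposed) return; // calling Dispose() twice is harmless
        m_Writer.Dispose();     // flushes and closes the underlying stream too
        m_Buffer.Dispose();
        m_Disposed = true;
    }
}

public static class FrameLoggerDemo
{
    public static void Main()
    {
        using (var logger = new FrameLogger())
        {
            logger.Log("frame 1");
        }
        Console.WriteLine("Logger disposed.");
    }
}
```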

~ by omeg on August 17, 2011.

