Twin paths to garbage collector nirvana

Originally posted to Shawn Hargreaves Blog on MSDN, Monday, July 2, 2007

In my previous post I described how to tell if your Xbox garbage collection is taking too long. We saw how the total amount of time spent on garbage collection is the product of the number of collections with the collection latency.

This suggests two radically different approaches for improving overall garbage collector performance. You can either reduce the number of collections, or you can reduce the collection latency. As long as you make one of these values nice and small, it doesn't matter how big the other is: either one alone is enough to achieve good performance.

 

Path 1: right frequency

How often the garbage collector runs is directly proportional to how often you allocate memory. If you never allocate anything, it will never run.

To reduce the frequency of garbage collections, you should allocate everything you are ever going to need up front while loading your levels, and avoid allocations during gameplay.

There are several aspects to avoiding memory allocation:

Don't allocate memory (duh!)

This is simple: do not call new on reference types. It is ok to new value types such as Matrix, Vector3, and Color, however.

Any time you find yourself wanting to new a reference type, use an object pool to reuse existing instances instead. The Particle and Audio 3D samples on creators.xna.com use this technique, and SwampThingTom blogged about a reusable pool collection.

Don't use classes that allocate on your behalf

When you add data to a collection class such as List<T> or Dictionary<K,V>, this may have to allocate memory to expand the collection. You can avoid that by using the collection constructor overloads which have explicit capacity parameters. Specify a capacity to preallocate as much memory as you will ever need for the maximum number of objects you intend to store in the collection.

Beware of string formatting. It is hard to manipulate strings in .NET without causing allocations.

Don't make the CLR runtime allocate

The CLR runtime allocates memory when boxing occurs. Avoid this like the plague! Boxing can happen for many reasons, some obvious, others less so:

Don't make the C# compiler allocate

It can be tricky to use delegates (especially delegates which form lexical closures) without causing the compiler to allocate memory. This is a whole topic in itself, but if in doubt you should avoid using delegates or events.

Generator methods will always require the compiler to allocate memory.

The foreach statement can allocate if used carelessly. But note, this does not mean you should avoid using foreach! It is often the fastest way to iterate over a collection. Learn the rules and use it appropriately.

Everything that does not allocate is ok

Disciples of the frequency path are allowed to have large and complex data structures. They can happily allocate hundreds of thousands of objects while their game is loading, filling the heap with a crazy maze of cross-linked object references. As long as they allocate nothing after loading has completed, the garbage collector will never run, so it makes absolutely no difference how large or complex their heap may be.

 

Path 2: right latency

How long the garbage collector takes to run is directly proportional to how complex your heap is. If the heap is empty, it will have no work to do.

Note that I said "how complex your heap is" and not "how large". Heap complexity is a combination of the number of live objects and the number of object references. It makes no difference how many bytes each object happens to take up: it is the total number of objects (because the garbage collector must examine each one) and references to objects (because the garbage collector must follow each reference to see what it is pointing to) that matter.

When measuring heap complexity, an array of 100,000 integers is very cheap. Even though that is a lot of memory, the garbage collector only has to examine the object once, and does not need to bother looking inside it.

100,000 arrays each containing a single integer would be far more expensive, because the garbage collector would have more objects to examine.

A single array of 100,000 object references is also more expensive, because although the array itself is only a single object, now the garbage collector must also scan the contents of the array to see if any of the objects inside it need to be kept alive. Even if the entire array contains nothing but null values, the garbage collector must still examine each element to make sure.

Here are some things that can reduce heap complexity:

Disciples of the latency path are free to allocate memory while their game is running. They are allowed to call new, to cause boxing, and to use anonymous delegates and generator methods. Who cares if this will cause garbage collections, as long as their heap is so simple that these collections will finish quickly?

 

Compromise will get you nowhere

You can achieve good performance by avoiding allocations, so the garbage collector will never run. This works regardless of how complex your heap may be.

You can also achieve good performance by keeping your heap nice and simple, so the garbage collector will finish quickly. This works regardless of whether you allocate memory during gameplay.

But you cannot achieve good performance by mixing these two approaches! There is little to be gained by allocating only a medium amount of memory and having only a medium complex heap. That will produce a game which has a medium size glitch every medium number of seconds.

If your goal is to avoid garbage collection glitches altogether, you must pick just one path and follow it to the bitter end.

 

Give yourself a head start

Programmers new to the world of garbage collection often think they may be able to improve performance by calling GC.Collect at carefully chosen times.

They are almost always wrong. Explicitly forcing garbage collection is a recipe for confusing the collector and hurting your performance.

But...

On Xbox, the .NET Compact Framework performs a garbage collection after one megabyte has been allocated.

Let's say we are optimizing our game to avoid allocations. After careful use of the CLR Profiler we have reduced our allocations to only 50 bytes per frame, but we simply can't figure out any way to improve beyond this point.

Furthermore, let's say this game runs at 60 frames per second, and that a typical level takes 2 minutes to complete. By the end of the level we will have allocated only 352k: not enough to cause a garbage collection. In fact our gameplay can last 5 minutes without a collection, so the only time the player will see a garbage collection glitch is if they deliberately mess around and waste time.

Seems reasonable, neh? We can probably live with that.

But...

We will be allocating plenty of memory while the level is loading, and thus causing many collections. That is no problem itself: it is fine for the garbage collector to kick in at this point, because the loading screen doesn't care about the resulting framerate glitch.

But what if our loading happens to allocate slightly less than an even number of megabytes, let's say 24,452k in total? After the last collection at the 23 megabyte mark, this load operation allocates too little to trigger another collection, but enough to leave us without any headroom. Now our game can only allocate 124k before triggering a collection, so the player will see a glitch just 42 seconds into the level.

The solution is to manually call GC.Collect at the very end of your load method. That will clean up any leftover loading garbage, resetting the clock so you can last longer before the next collection threshold will be reached.

Blog index   -   Back to my homepage