BasicEffect optimizations in XNA Game Studio 4.0

Originally posted to Shawn Hargreaves Blog on MSDN, Sunday, April 25, 2010

The BasicEffect API and feature set did not change in Game Studio 4.0, but the implementation saw some aggressive optimizations.

In previous versions, BasicEffect was intended as a starting point for beginners. We expected expert programmers to soon move on to writing their own shaders, so as long as BasicEffect performed adequately on both Windows and Xbox, it wasn't worth spending time on further optimization.

Windows Phone changed this priority for two reasons:

I thought it would be interesting to describe the details of how we sped things up.

 

Shader permutations

In spite of its name, BasicEffect is actually quite complex! If you look at the HLSL source code, you will see that the previous versions included 12 different vertex shaders to support all permutations of these options:

Plus 4 different pixel shaders:

BasicEffect has many other adjustable knobs, but we did not include specialized shaders for every possible combination. For instance we always evaluated the fog equation, and implemented the FogEnable property by setting parameter values to make the result come out zero if we did not want fog.

There is a balance between providing many shader permutations (which minimizes GPU instruction counts), versus fewer shaders (which minimizes memory overhead and development/test cost). When we reevaluated this balance in the light of Windows Phone, we decided to add more specialized shaders. As of Game Studio 4.0, BasicEffect now has a total of 32 permutations. There are 20 vertex shaders:

Plus 10 pixel shaders:

Implications:

 
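To make the permutation trade-off concrete, here is a minimal C# sketch of how a handful of feature switches could be packed into an index that selects one of 32 precompiled permutations. The enum, the class and method names, and in particular the split into four lighting modes are assumptions made for illustration. They are not taken from the BasicEffect source.

```csharp
using System;

// Hypothetical lighting modes; the real BasicEffect breakdown may differ.
enum LightingMode { None = 0, VertexLighting = 1, OneLight = 2, PixelLighting = 3 }

static class PermutationTable
{
    // 2 (fog) * 2 (vertex color) * 2 (texture) * 4 (lighting modes) = 32.
    public const int Count = 32;

    // Pack the feature switches into a unique index in the range 0..31.
    public static int GetIndex(bool fog, bool vertexColor, bool texture, LightingMode lighting)
    {
        int index = 0;
        if (fog) index |= 1;
        if (vertexColor) index |= 2;
        if (texture) index |= 4;
        index |= (int)lighting << 3;
        return index;
    }
}
```

With this layout, vertex color plus texture with no lighting or fog maps to index 6. Each index would name one vertex shader and pixel shader pair, so several indices can share the same shader, which is one way 32 permutations could be served by 20 vertex shaders and 10 pixel shaders.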

Preshaders

A common tension in shader programming is that when you design effect parameters to provide a nice clean API, the resulting parameter formats are not always the most efficient for HLSL optimization.

D3D tries to correct any such mismatches through a feature called "preshaders". The HLSL compiler looks for computations that are the same for all vertices or all pixels, and moves these out of the main shader into a special setup pass which runs on the CPU before drawing begins. This is a great feature, but has a couple of fatal flaws:

Game Studio 4.0 adds the ability to implement preshader computations in C#, by overriding this new method, which is called immediately before EffectPass.Apply sets parameter values onto the graphics device:

    protected virtual void Effect.OnApply();

This allows BasicEffect to expose whatever properties the API requires, without needing these to match the underlying HLSL shader parameters. When the programmer changes a managed property, we just set a dirty flag, then recompute derived HLSL parameter values during OnApply. We used this new ability to precompute many things:

 
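The dirty flag pattern described above can be sketched roughly as follows. This is a standalone illustration, not the BasicEffect implementation: SketchEffect is a hypothetical class that does not derive from XNA's Effect, its OnApply is an ordinary public method standing in for the real override, and System.Numerics.Matrix4x4 is used in place of the XNA matrix type so the sample has no XNA dependency.

```csharp
using System;
using System.Numerics;

// Stand-in for an effect class; it does not derive from XNA's Effect.
class SketchEffect
{
    Matrix4x4 world = Matrix4x4.Identity;
    Matrix4x4 view = Matrix4x4.Identity;
    Matrix4x4 projection = Matrix4x4.Identity;
    bool worldViewProjDirty = true;

    // Derived value that a shader would consume (hypothetical parameter).
    public Matrix4x4 WorldViewProj { get; private set; }

    // Counts recomputations so the saving is observable.
    public int RecomputeCount { get; private set; }

    // Setters only record that something changed; no math happens here.
    public Matrix4x4 World
    {
        get { return world; }
        set { world = value; worldViewProjDirty = true; }
    }

    public Matrix4x4 View
    {
        get { return view; }
        set { view = value; worldViewProjDirty = true; }
    }

    public Matrix4x4 Projection
    {
        get { return projection; }
        set { projection = value; worldViewProjDirty = true; }
    }

    // Plays the role of Effect.OnApply: called once before each draw.
    public void OnApply()
    {
        if (!worldViewProjDirty)
            return;

        WorldViewProj = world * view * projection;
        RecomputeCount++;
        worldViewProjDirty = false;
    }
}
```

Setting World, View and Projection in any order between draws costs a single matrix concatenation at apply time, and applying the effect again without changing anything costs none.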

Do less work

We also applied some good ol' algebraic optimizations, using math to find cheaper ways of getting the results we wanted.

We got a nice win from vectorizing the lighting computations, using matrix operations to evaluate all three lights at the same time. The new code is harder to read, but a couple of instructions shorter.
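As a rough illustration of the idea, the C# sketch below computes the diffuse term for three directional lights twice: once with a dot product per light, and once by packing the light vectors and colors into matrices so that each step handles all three lights in one operation. It is not the shipped shader code. LightingSketch and its methods are hypothetical names, the toLight vectors are assumed to be normalized and to point from the surface toward each light, and System.Numerics types stand in for HLSL vectors and matrices.

```csharp
using System;
using System.Numerics;

static class LightingSketch
{
    // Straightforward version: one dot product per light.
    public static Vector3 DiffusePerLight(Vector3 normal, Vector3[] toLight, Vector3[] colors)
    {
        Vector3 sum = Vector3.Zero;
        for (int i = 0; i < 3; i++)
            sum += Math.Max(0f, Vector3.Dot(normal, toLight[i])) * colors[i];
        return sum;
    }

    // Vectorized version: two matrix transforms cover all three lights.
    public static Vector3 DiffuseVectorized(Vector3 normal, Vector3[] toLight, Vector3[] colors)
    {
        // Columns hold the three light vectors, so one transform
        // yields all three N.L terms at once.
        var directions = new Matrix4x4(
            toLight[0].X, toLight[1].X, toLight[2].X, 0f,
            toLight[0].Y, toLight[1].Y, toLight[2].Y, 0f,
            toLight[0].Z, toLight[1].Z, toLight[2].Z, 0f,
            0f, 0f, 0f, 1f);
        Vector3 dots = Vector3.Max(Vector3.TransformNormal(normal, directions), Vector3.Zero);

        // Rows hold the light colors, so a second transform
        // sums the three weighted colors.
        var colorRows = new Matrix4x4(
            colors[0].X, colors[0].Y, colors[0].Z, 0f,
            colors[1].X, colors[1].Y, colors[1].Z, 0f,
            colors[2].X, colors[2].Y, colors[2].Z, 0f,
            0f, 0f, 0f, 1f);
        return Vector3.TransformNormal(dots, colorRows);
    }
}
```

Both methods return the same color. On a GPU the matrix form maps onto a few wide multiply-add instructions instead of three separate dot, clamp and scale sequences, which is where a saving of the kind described above would come from.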

One place we slightly changed the final output is the fog equation. Previous versions used distance fog, which is computed from the distance between camera and vertex. We now use depth fog, which only considers how far in front of the camera each vertex is, ignoring any sideways offset. The visual difference is subtle, but depth fog is much cheaper to evaluate.
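The difference between the two fog styles can be sketched in C# as follows. This is an illustration under stated assumptions rather than the shader source: FogSketch and its methods are hypothetical, fog is assumed to ramp linearly between a start and an end value, and positions are assumed to be in a right-handed view space where the camera sits at the origin looking down the negative Z axis.

```csharp
using System;
using System.Numerics;

static class FogSketch
{
    static float Saturate(float x)
    {
        return Math.Min(1f, Math.Max(0f, x));
    }

    // Distance fog: needs the length of the camera-to-vertex vector,
    // which involves a dot product and a square root.
    public static float DistanceFog(Vector3 viewPosition, float fogStart, float fogEnd)
    {
        float distance = viewPosition.Length();
        return Saturate((distance - fogStart) / (fogEnd - fogStart));
    }

    // Depth fog: only the distance along the view axis matters,
    // so any sideways offset is ignored.
    public static float DepthFog(Vector3 viewPosition, float fogStart, float fogEnd)
    {
        float depth = -viewPosition.Z;
        return Saturate((depth - fogStart) / (fogEnd - fogStart));
    }
}
```

For a vertex straight ahead of the camera the two functions agree. For a vertex at (30, 0, -40) with fog running from 0 to 100, distance fog gives 0.5 while depth fog gives 0.4, which is the kind of subtle difference described above.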

 

Results

To take one example, here are the instruction counts for BasicEffect using vertex color and texture, but no lighting or fog:

                   Vertex Shader   Pixel Shader
Game Studio 3.1    30              6
Game Studio 4.0    6               3

 

Preemptive question: "can we get the source for these optimized shaders?"

I would certainly love to release this, but we haven't worked out the details yet. Stay tuned!
