code4k: November 2011

Thursday, November 24, 2011

Advanced HLSL using closures and function pointers

Shader languages like HLSL, Cg or GLSL are nowadays driving the most powerful processors in the world, but if you are developing with them, you may have been already a little bit frustrated by one of their expressiveness limitations: the common problem of abstraction and code reuse. In order to overcome this problem, solutions so far were mostly using a glue combination of #define/#include preprocessors directives in order to generate combinations of code, permutation of shaders, so called UberShaders. Recently, this problem has been addressed, for HLSL (new in Direct3D11), by providing the concept of Dynamic Linking, and for GLSL, the concept of SubRoutines, For Direct3D11, the new mechanism has been only available for Shader Model 5.0, meaning that even if this could greatly simplified the problem of abstraction, It is unfortunately only available for Direct3D11 class graphics card, which is of course a huge limitation...

But, here is the good news: While the classic usage of dynamic linking is not really possible from earlier version (like SM4.0 or SM3.0), I have found an interesting hack to bring some kind of closures and functions pointers to HLSL(!). This solution doesn't involve any kind of preprocessing directive and is able to work with SM3.0 and SM4.0, so It might be interesting for folks like me that like to abstract and reuse the code as often as possible! But let's see how It can be achieved...

Friday, November 18, 2011

Direct3D11 multithreading micro-benchmark in C# with SharpDX

Multi-threading is an important feature added to Direct3D11 two years ago and has been increasingly used on recent game engine in order to achieve better performance on PC. You can have a look at "DirectX 11 Rendering in Battlefield 3" from Johan Anderson/DICE which gives a great insight about how it was effectively used in practice in their game engine. Usage of the Direct3D11 multithreading API is pretty straightforward, and while we are also using it successfully at our work in our R&D 3D Engine, I didn't take the time to sit down with this feature and check how to get the best of it.

I recently came across a question on the gamedev forum about "[DX11] Command Lists on a Single Threaded Renderer": If command lists are an efficient way to store replayable drawing commands, would it be efficient to use them even in a single threaded scenario where lots of drawing commands are repeatable?

In order to verify this, among other things, I did a simple micro-benchmark using C#/SharpDX, but while the results are somehow expectable, there are a couple of gotchas that deserve a more in-depth look...