But here is the good news: while the classic usage of dynamic linking is not really possible on earlier profiles (like SM4.0 or SM3.0), I have found an interesting hack that brings a kind of closures and function pointers to HLSL(!). This solution doesn't involve any preprocessor directive and works with SM3.0 and SM4.0, so it might be interesting for folks like me who like to abstract and reuse code as often as possible! But let's see how it can be achieved...
A simple problem of abstraction and code reuse in HLSL
I have recently been working on a GPU implementation of a versatile perlin/simplex/fbm/turbulence noise in HLSL. While some of the individual algorithms are pretty simple, it is common to combine several permutations of those functions in order to produce nice noise and turbulence functions (like the worm-lava texture I did for the Ergon 4k intro). Thus, they are ideal candidates to demonstrate the use of closures and function pointers. I won't explain the basic principles of Perlin and fbm noise generation here, in order to focus on the problem of code reuse in HLSL.
Here is a simplified version of a Turbulence Noise implemented in a Pixel Shader:
    float PerlinNoise(float2 pos) { .... }

    float AbsNoise(float2 pos)
    {
        return abs(PerlinNoise(pos));
    }

    float FBMNoise(float2 pos)
    {
        float value = 0.0f;
        float frequency = InitialFrequency;
        float amplitude = 1.0f;

        // Classic FBM loop
        for (int i = 0; i < Octaves; i++)
        {
            // Sample the noise at the current octave frequency
            float noiseValue = AbsNoise(pos * frequency);
            value += amplitude * noiseValue;
            frequency *= Lacunarity;
            amplitude *= Amplitude;
        }
        return value;
    }

    // Turbulence noise:
    // Fbm + Abs + Perlin
    float TurbulenceAbsPerlinNoisePS(float4 pos : SV_POSITION, float2 texPos : TEXCOORD0) : SV_Target
    {
        return FBMNoise(texPos);
    }
The problem with the previous code is that if we want to change the function that FBMNoise calls in place of AbsNoise (for example, to apply cos/sin to the coordinates, or to use simplex noise instead of the old Perlin noise), we have to duplicate the FBMNoise function just to call the other function. Of course, we could use the preprocessor to inline the code, but it would end up less readable, less debuggable, more error prone, etc.
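To make the preprocessor option concrete, here is a minimal sketch of what it could look like. The macro name DEFINE_FBM, the generated function names, the constant values and the hash-based stand-in for PerlinNoise are all assumptions made for illustration; none of them come from the original shader.

```hlsl
// Hypothetical constants; a real shader would likely read them from a constant buffer.
static const int   Octaves          = 4;
static const float InitialFrequency = 1.0f;
static const float Lacunarity       = 2.0f;
static const float Amplitude        = 0.5f;

// Stand-in so the sketch is self-contained: a cheap hash, NOT real Perlin noise.
float PerlinNoise(float2 pos)
{
    return frac(sin(dot(pos, float2(12.9898f, 78.233f))) * 43758.5453f) * 2.0f - 1.0f;
}

float AbsNoise(float2 pos)
{
    return abs(PerlinNoise(pos));
}

// Stamps out one FBM function per source function.
// Comments are deliberately kept out of the macro body, because mixing
// "//" comments with line continuations is fragile.
#define DEFINE_FBM(Name, Source)                              \
    float Name(float2 pos)                                    \
    {                                                         \
        float value = 0.0f;                                   \
        float frequency = InitialFrequency;                   \
        float amplitude = 1.0f;                               \
        for (int i = 0; i < Octaves; i++)                     \
        {                                                     \
            value += amplitude * Source(pos * frequency);     \
            frequency *= Lacunarity;                          \
            amplitude *= Amplitude;                           \
        }                                                     \
        return value;                                         \
    }

// One expansion per combination we want to support.
DEFINE_FBM(FbmPerlinNoise, PerlinNoise)
DEFINE_FBM(FbmAbsNoise, AbsNoise)

float4 TurbulencePS(float4 pos : SV_POSITION, float2 texPos : TEXCOORD0) : SV_Target
{
    float n = FbmAbsNoise(texPos);
    return float4(n, n, n, 1.0f);
}
```

Every new combination needs another expansion, and the loop body is no longer an ordinary function that can be read or stepped through as one, which is the kind of cost the paragraph above alludes to.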
Another example: Ken Perlin introduced some really cool functions to modify the noise, like the famous marble effect:
    static float stripes(float x, float f)
    {
        float t = .5 + .5 * sin(f * 2*PI * x);
        return t * t - .5;
    }

    float MarbleNoise(float2 pos)
    {
        return stripes(pos.x + 2 * FBMNoise(pos), 1.6f);
    }
But wait! The MarbleNoise function could even be used in place of the AbsNoise function, in order to get another noise effect. So we could have a marble function calling an FBM... but we could also have a marble function called by an FBM... or both... ugh. As we can see, it is possible to permute those functions to generate interesting patterns, but unfortunately, the shading language doesn't provide a way to make those functions pluggable... Almost! In fact, there is a small breach in the HLSL language, and we are going to use it!
Introduction to Dynamic Linking in HLSL
As I said in the introduction, Direct3D11 introduced the concept of dynamic linking. I suggest reading the explanation on MSDN, "Interfaces and classes". Basically, the main feature introduced in the HLSL language is a bit of Object Oriented Programming (OOP) in order to address the problem of abstraction: HLSL now has the class and interface keywords. But they were mainly introduced for the dynamic linking of shaders and, as I said, dynamic linking is only available with the SM5.0 profile.
    // An interface describing a light
    interface ILight
    {
        float3 ComputeAmbient(...);
        float3 ComputeDiffuse(...);
        float3 ComputeSpecular(...);
    };

    // A 1st implem of the ILight interface
    class MyModelLight1 : ILight
    {
        float3 ComputeAmbient(...) { ... return color; }
        ...
    };

    // A 2nd implem of the ILight interface
    class MyModelLight2 : ILight
    {
        float3 ComputeAmbient(...) { ... return color; }
        ...
    };

    // The variable through which we are going to access the light model
    ILight abstractLight;

    // We need to declare the two implems in order to get a reference
    // to them from C++ code
    MyModelLight1 modelLight1;
    MyModelLight2 modelLight2;

    float4 PixelShader(PS_INPUT Input) : SV_Target
    {
        // Call the abstractLight that was previously set up from the C++ code
        float3 ambient = abstractLight.ComputeAmbient(Input.Pos);
        float3 diffuse = abstractLight.ComputeDiffuse(Input.Pos);
        float3 specular = abstractLight.ComputeSpecular(Input.Pos);
        return float4(saturate(ambient + diffuse + specular), 1.0);
    }
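To be able to use this shader, we need to set up the abstractLight variable from the C++/C# code: a class linkage object is created with ID3D11Device::CreateClassLinkage and passed to ID3D11Device::CreatePixelShader when the pixel shader is instantiated. The concrete class instance is then selected when the shader is bound with ID3D11DeviceContext::PSSetShader.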
As we can see, we need to declare the interface and class variables globally, so that they can be accessed by the C++ program. This is the standard way to use dynamic linking in HLSL... but what if we want to use it differently?
Hacking function pointers in HLSL
The principle is very simple: instead of using interfaces and classes as global variables, we can in fact use them as function parameters and even as local variables inside functions. Using them is then straightforward:
    // Base class for a calculator
    interface ICalculator
    {
        float Compute(...);
    };

    // 1st implem of the calculator
    class ClassicCalculator : ICalculator
    {
        float Compute(...) { ... return value; }
    };

    // 2nd implem of the calculator
    class ComplexCalculator : ICalculator
    {
        float Compute(...) { ... return value; }
    };

    // A function using the interface ICalculator
    float MyFunctionUsingICalculator(ICalculator calculator, ...)
    {
        ...
        value += calculator.Compute(...);
        ...
        return value;
    }

    // A Pixel shader using the ClassicCalculator
    float PixelShader1(PS_INPUT Input) : SV_Target
    {
        ClassicCalculator classic;
        return MyFunctionUsingICalculator(classic, ...);
    }

    // A Pixel shader using the ComplexCalculator
    float PixelShader2(PS_INPUT Input) : SV_Target
    {
        ComplexCalculator complex;
        return MyFunctionUsingICalculator(complex, ...);
    }
The previous example compiles fine with ps_4_0 (Shader Model 4) or ps_3_0 (with some minor changes to the pixel shader). So basically, the ICalculator interface acts as a function pointer with two implementations, available through the ClassicCalculator and ComplexCalculator classes. MyFunctionUsingICalculator doesn't have to change its signature to adapt to the underlying function, so we have a suitable solution for building function pointers in HLSL.
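For readers who want something they can paste into a file, here is a minimal sketch of the same pattern with the elided parts filled in by placeholders. The names SquareCalculator, SineCalculator, Accumulate, SquarePS and SinePS, as well as the function bodies, are hypothetical; the sketch targets ps_4_0 and assumes the compiler behaves as described above.

```hlsl
// The "function pointer" type: a single-method interface.
interface ICalculator
{
    float Compute(float x);
};

// First hypothetical implementation.
class SquareCalculator : ICalculator
{
    float Compute(float x) { return x * x; }
};

// Second hypothetical implementation.
class SineCalculator : ICalculator
{
    float Compute(float x) { return 0.5f + 0.5f * sin(x); }
};

// Written once against the interface; it never needs to know
// which implementation it receives.
float Accumulate(ICalculator calculator, float x)
{
    float value = 0.0f;
    for (int i = 1; i <= 4; i++)
    {
        value += calculator.Compute(x * i) / 4.0f;
    }
    return value;
}

// Pixel shader bound to the first implementation.
float4 SquarePS(float4 pos : SV_POSITION, float2 texPos : TEXCOORD0) : SV_Target
{
    SquareCalculator calculator;
    float v = Accumulate(calculator, texPos.x);
    return float4(v, v, v, 1.0f);
}

// Pixel shader bound to the second implementation.
float4 SinePS(float4 pos : SV_POSITION, float2 texPos : TEXCOORD0) : SV_Target
{
    SineCalculator calculator;
    float v = Accumulate(calculator, texPos.x);
    return float4(v, v, v, 1.0f);
}
```

The only thing that differs between the two entry points is the type of the local variable handed to Accumulate.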
Now, let's see if we can use this model to build our flexible noise functions. Replace ICalculator with an INoise interface. We quickly see that an implementation would have to call another INoise implementation. In fact, ideally, we would like to write something like this:
    // Base class for a noise function
    interface INoise
    {
        float Compute(...);
    };

    // Perlin noise implem
    class PerlinNoise : INoise
    {
        float Compute(...) { ... return value; }
    };

    // FBM noise implem
    class FBMNoise : INoise
    {
        // Would be ideal to be able to do that
        // We could even make an abstract generic class
        // that could provide a base Source INoise
        // BUT, THIS IS NOT COMPILING!!!
        INoise Source;

        float Compute(...)
        {
            float value = 0.0f;
            float frequency = InitialFrequency;
            float amplitude = 1.0f;

            // Classic FBM loop
            for (int i = 0; i < Octaves; i++)
            {
                // Call the source abstract INoise
                float noiseValue = Source.Compute(pos);
                value += amplitude * noiseValue;
                frequency *= Lacunarity;
                amplitude *= Amplitude;
            }
            return value;
        }
    };

    // A Pixel shader using the FBMNoise combined with PerlinNoise
    float PixelShader1(PS_INPUT Input) : SV_Target
    {
        FBMNoise fbmNoise;
        PerlinNoise perlin;
        // This is not possible, interface variable members are not allowed
        fbmNoise.Source = perlin;
        return fbmNoise.Compute(...);
    }
Unfortunately, HLSL doesn't permit interfaces as member variables! This limitation is quite annoying, as it excludes a whole range of combinations, like aggregation and composition, making these function pointers useful only for a very limited set of cases...
I tried to overcome this problem by using an abstract class instead of an interface, as classes can be declared as member variables of classes... but again, there is a huge limitation: the class member in fact acts as a final or const variable that cannot be changed, which makes it almost useless...
But I knew that HLSL permits lots of unusual constructions, and this is where closures come to the rescue.
Hacking Closures in HLSL
So we know that interfaces can be used as function pointers, but their usage is limited as we cannot use any kind of composition. An interesting fact is that we can declare local variables of class or interface type inside functions... The trick is to use a rather uncommon feature of HLSL: it is possible to declare local classes inside a function, and those classes can access the function's local variables and parameters! Therefore, it is possible to achieve a kind of deferred composition/aggregation with this technique. Let's rewrite our noise functions using this new closure technique:
1. Declare an INoise interface that is able to compute the noise by using a next INoise implementation.
    // It is possible to compile this code under ps_4_0 and ps_3_0

    // Declare our INoise interface
    interface INoise
    {
        // Here an interesting hack: We can declare a method that is returning a INoise
        // interface. This method will be implemented by the pixel shaders.
        INoise Next();

        // The compute method of a Noise
        float Compute(float2 pos);
    };
2. Declare NoiseBase as an abstract implementation of INoise that implements both methods. If HLSL had the abstract keyword, we wouldn't have to implement the methods of this class.
    // We are creating an abstract class from INoise in order
    // to implement both methods
    class NoiseBase : INoise
    {
        INoise Next()
        {
            // This code will never be used. It is only
            // used to declare this class
            NoiseBase base;
            return base;
        }

        float Compute(float2 pos)
        {
            // This code will never be used. It is only
            // used to declare this class
            return Next().Compute(pos);
        }
    };
3. Use NoiseBase to implement the final INoise functions. If you look at AbsNoise, FbmNoise or MarbleNoise, they use the INoise::Next() method to get an instance of the INoise interface they rely on. This is where function pointers are extremely useful.
    // PerlinNoise implem
    class PerlinNoise : NoiseBase
    {
        float Compute(float2 pos)
        {
            // call a standard perlin_noise implemented as a simple external function
            return perlin_noise(pos);
        }
    };

    // AbsNoise implem
    class AbsNoise : NoiseBase
    {
        float Compute(float2 pos)
        {
            // Note: We are using Next to access the next underlying function pointer
            return abs(Next().Compute(pos));
        }
    };

    // FbmNoise implem
    class FbmNoise : NoiseBase
    {
        float Compute(float2 pos)
        {
            float value = 0.0f;
            float amplitude = 1.0f;
            float frequency = InitialFrequency;
            for (int i = 0; i < Octaves; i++)
            {
                // Sample the next noise at the current octave frequency
                float noiseValue = Next().Compute(pos * frequency);
                value += amplitude * noiseValue;
                frequency *= Lacunarity;
                amplitude *= Amplitude;
            }
            return value;
        }
    };

    // MarbleNoise implem
    class MarbleNoise : NoiseBase
    {
        static float stripes(float x, float f)
        {
            float t = .5 + .5 * sin(f * 2*PI * x);
            return t * t - .5;
        }

        float Compute(float2 pos)
        {
            // Same marble formula as before, with Next() in place of FBMNoise
            return stripes(pos.x + 2 * Next().Compute(pos), 1.6f);
        }
    };
4. Implement the pixel shaders with the closure mechanism. We declare local classes that override the INoise::Next() method in order to chain INoise function pointers together.
    // Fbm -> PerlinNoise
    float FbmPerlinNoise2DPS(float4 pos : SV_POSITION, float2 texPos : TEXCOORD0) : SV_Target
    {
        // Look! We are declaring a local class
        class Noise1 : PerlinNoise {} noise1;

        // and this local class can access local variables!
        // For example, Noise2 can access the previous noise1 variable.
        class Noise2 : FbmNoise { INoise Next() { return noise1; } } noise2;

        // Allowing us to cascade the calls and making a kind of deferred composition.
        return noise2.Compute(texPos);
    }

    // Fbm -> Abs -> PerlinNoise
    float FbmAbsPerlinNoise2DPS(float4 pos : SV_POSITION, float2 texPos : TEXCOORD0) : SV_Target
    {
        class Noise1 : PerlinNoise {} noise1;
        class Noise2 : AbsNoise { INoise Next() { return noise1; } } noise2;
        class Noise3 : FbmNoise { INoise Next() { return noise2; } } noise3;

        // FbmNoise is calling indirectly AbsNoise that will call PerlinNoise.
        return noise3.Compute(texPos);
    }

    // Marble -> Fbm -> Abs -> PerlinNoise
    float MarbleFbmAbsPerlinNoise2DPS(float4 pos : SV_POSITION, float2 texPos : TEXCOORD0) : SV_Target
    {
        class Noise1 : PerlinNoise {} noise1;
        class Noise2 : AbsNoise { INoise Next() { return noise1; } } noise2;
        class Noise3 : FbmNoise { INoise Next() { return noise2; } } noise3;
        class Noise4 : MarbleNoise { INoise Next() { return noise3; } } noise4;

        // MarbleNoise is calling FbmNoise that is calling indirectly AbsNoise
        // that will call PerlinNoise.
        return noise4.Compute(texPos);
    }

    // Fbm -> Marble -> Abs -> PerlinNoise
    float FbmMarbleAbsPerlinNoise2DPS(float4 pos : SV_POSITION, float2 texPos : TEXCOORD0) : SV_Target
    {
        class Noise1 : PerlinNoise {} noise1;
        class Noise2 : AbsNoise { INoise Next() { return noise1; } } noise2;
        class Noise3 : MarbleNoise { INoise Next() { return noise2; } } noise3;
        class Noise4 : FbmNoise { INoise Next() { return noise3; } } noise4;

        // FbmNoise is calling MarbleNoise that is calling indirectly AbsNoise
        // that will call PerlinNoise.
        return noise4.Compute(texPos);
    }

Et voila!
As you can see, we are able to declare local classes from a pixel shader that act as closures. It is even possible, for example, to declare local classes that have specific code in their Compute() methods.
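As an illustration of that last point, here is a minimal self-contained sketch of a local class that brings its own Compute() body. The HashNoise class (a cheap hash standing in for a real Perlin implementation), the "ridged" formula and the RidgedNoisePS entry point are assumptions made for this example, and the sketch relies on fxc accepting local classes exactly as described in this article.

```hlsl
// Same interface and base class as in the steps above.
interface INoise
{
    INoise Next();
    float Compute(float2 pos);
};

class NoiseBase : INoise
{
    // Placeholder bodies, only there so that the class can be declared.
    INoise Next() { NoiseBase base; return base; }
    float Compute(float2 pos) { return Next().Compute(pos); }
};

// Hypothetical stand-in source: a cheap hash, NOT real Perlin noise.
class HashNoise : NoiseBase
{
    float Compute(float2 pos)
    {
        return frac(sin(dot(pos, float2(12.9898f, 78.233f))) * 43758.5453f) * 2.0f - 1.0f;
    }
};

float4 RidgedNoisePS(float4 pos : SV_POSITION, float2 texPos : TEXCOORD0) : SV_Target
{
    class Noise1 : HashNoise {} noise1;

    // A one-off local class: it overrides Next() to capture noise1
    // and also provides Compute() code that exists only in this shader.
    class Noise2 : NoiseBase
    {
        INoise Next() { return noise1; }
        float Compute(float2 pos) { return 1.0f - abs(Next().Compute(pos)); }
    } noise2;

    float n = noise2.Compute(texPos);
    return float4(n, n, n, 1.0f);
}
```

This keeps throwaway variations next to the shader that needs them, instead of adding yet another named class to the shared noise library.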
Behind the scenes, when chaining the INoise::Next() methods, the fxc HLSL compiler sees all those classes as "INoise*".
It is then possible to perform a fbm(marble(abs(perlin_noise()))) as well as a marble(fbm(abs(perlin_noise()))).
In the end, it is effectively possible to implement closures in HLSL that can be used in SM4.0 as well as SM3.0!
Improving closures chaining
From the previous example, we can extend the concept by:
1. Adding static local constructors to each noise class:
    // PerlinNoise implem
    class PerlinNoise : NoiseBase
    {
        float Compute(float2 pos)
        {
            // call a standard perlin_noise implemented as a simple external function
            return perlin_noise(pos);
        }

        // Add local "constructor"
        static INoise New()
        {
            PerlinNoise noise;
            return noise;
        }
    };

    // AbsNoise implem
    class AbsNoise : NoiseBase
    {
        float Compute(float2 pos)
        {
            // Note: We are using Next to access the next underlying function pointer
            return abs(Next().Compute(pos));
        }

        // Add local constructor and chain with From INoise
        static INoise New(INoise from)
        {
            class LocalNoise : AbsNoise { INoise Next() { return from; } } noise;
            return noise;
        }
    };

    // Add the same constructors to FbmNoise and MarbleNoise.
    // ....

2. And then we can rewrite the pixel shader functions to chain operators in a shorter form:
    // Fbm -> Marble -> Abs -> PerlinNoise
    float FbmMarbleAbsPerlinNoise2DPS(float4 pos : SV_POSITION, float2 texPos : TEXCOORD0) : SV_Target
    {
        // FbmNoise is calling MarbleNoise that is calling indirectly AbsNoise
        // that will call PerlinNoise.
        return FbmNoise::New(MarbleNoise::New(AbsNoise::New(PerlinNoise::New()))).Compute(texPos);
    }
This allows a syntax that is even more concise and modular!
Further Considerations
This is a very exciting technique that could open up lots of abstraction opportunities when developing in HLSL. However, in order to use this technique, there are a couple of advantages and limitations to take into account:
- An interface cannot inherit from another interface (that would be really interesting)
- An interface can only have method members.
- A class can inherit from another class and from several interfaces.
- Unlike in C/C++, we cannot forward-declare an interface, but we can use a type while it is being declared (see the method INoise::Next, which returns an INoise).
- The compiler has a limitation against the reuse of an implementation in a call chain and will complain about a recursive call (even if there is no recursive call at all). For example, it is not possible to use the same type of class closure twice in a call chain, meaning that it is not possible to build a call chain like this one: Marble => FBM => Marble => Abs => Perlin. The fxc compiler would complain about the second "Marble", as it would see it as a kind of recursive call. In order to reuse a function, we need to duplicate it; that's probably the only really annoying point here.
- The compiled asm output generated from closures is exactly the same as with standard inlining methods.
- Before arriving at local class closures, I tried several techniques that sometimes crashed the fxc compiler.
- As this is a way of hacking HLSL, it is not guaranteed that it will be supported in the future. But at least, since it works for SM5.0, SM4.0 and SM3.0, we can expect to be safe for a while!
- Also, compilation under the vs_3_0/ps_3_0 profiles seems to take more time; I'm not sure whether this is due to the language construction or is the regular behavior of 3.0 profiles.
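To illustrate the duplication workaround for the "recursive call" limitation, here is a minimal sketch. ScaleNoise, ScaleNoiseAgain, HashNoise (a cheap hash, not real Perlin noise) and ScaleScaleNoisePS are hypothetical names invented for this example, and the sketch assumes fxc behaves as reported in the list above.

```hlsl
interface INoise
{
    INoise Next();
    float Compute(float2 pos);
};

class NoiseBase : INoise
{
    // Placeholder bodies, only there so that the class can be declared.
    INoise Next() { NoiseBase base; return base; }
    float Compute(float2 pos) { return Next().Compute(pos); }
};

// Hypothetical stand-in source: a cheap hash, NOT real Perlin noise.
class HashNoise : NoiseBase
{
    float Compute(float2 pos)
    {
        return frac(sin(dot(pos, float2(12.9898f, 78.233f))) * 43758.5453f) * 2.0f - 1.0f;
    }
};

// Doubles the sampling frequency of the next noise in the chain.
class ScaleNoise : NoiseBase
{
    float Compute(float2 pos) { return Next().Compute(pos * 2.0f); }
};

// Verbatim copy under another name, so that the same Compute()
// does not appear twice in one call chain.
class ScaleNoiseAgain : NoiseBase
{
    float Compute(float2 pos) { return Next().Compute(pos * 2.0f); }
};

// Scale -> Scale -> HashNoise
float4 ScaleScaleNoisePS(float4 pos : SV_POSITION, float2 texPos : TEXCOORD0) : SV_Target
{
    class Noise1 : HashNoise {} noise1;
    class Noise2 : ScaleNoise { INoise Next() { return noise1; } } noise2;

    // Deriving from ScaleNoise again here is the case reported above as
    // being rejected as a recursive call; the duplicate class avoids it.
    class Noise3 : ScaleNoiseAgain { INoise Next() { return noise2; } } noise3;

    float n = noise3.Compute(texPos);
    return float4(n, n, n, 1.0f);
}
```

The duplicated body is the price to pay; keeping each copy as small as possible limits the maintenance cost.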
Very clever! Now on to try it on Cg :)
At first I thought: mixing interfaces and abstract classes != closures, then read on and it hit me. I approve.
You can declare a new class within a function!? Very nice indeed! I didn't know HLSL allowed this. I have been working on a different approach for using higher order functions. A library I am working on translates F# to HLSL. Check it out: https://github.com/rookboom/SharpShaders/wiki/Higher-order-functions
Love this. I'm using it in my deferred shading engine, to select the materials based on an index, so i don't have to loop trough each option. Really nice indeed