I'm running math-heavy code on the GPU with ILGPU. I organized it into modular classes, separated by interfaces, so that different algorithms can be plugged into each other.

ILGPU only allows static methods to run as GPU kernels. So I need to take the IL out of those nested classes and inline it manually, outputting a single static method that will be called as the kernel.

For example:

public interface IFunction
{
    public float DoSum(float[] input);
}


public class NormalSum : IFunction
{
    public float DoSum(float[] input)
    {
        float sum = 0.0f;
        for (int i = 0; i < input.Length; i++)
            sum += input[i];
        return sum;
    }
}

public interface IAlgorithm
{
    public void DoAlgorithm(float[] input, float[] output);
}

public class MyAlgorithm : IAlgorithm
{
    IFunction fun;
    public MyAlgorithm(IFunction fun)
    {
        this.fun = fun;
    }
    public void DoAlgorithm(float[] input, float[] output)
    {
        for (int j = 0; j < output.Length; j++)
        {
            output[j] = 2.0f * fun.DoSum(input);
        }
    }
}

There is no stored state: the classes and interfaces are just wrappers around algorithms. So I need code that inlines the methods into something like the following, without bothering to resolve any references to instance members. I only look at "fun" to extract the method implementation; after that, "fun" can be discarded.

var staticMethod = Inline(new MyAlgorithm(new NormalSum()));

// generated IL
public static void GeneratedCode(float[] input, float[] output)
{
    for (int j = 0; j < output.Length; j++)
    {
        float sum = 0.0f;
        for (int i = 0; i < input.Length; i++)
            sum += input[i];
        float result = sum;
        output[j] = 2.0f * result;
    }
}

So far I have only managed to copy a single static method; I was not able to go "inside the implementation" of the methods it calls.

    MethodInfo existingMethod = typeof(MyStaticClass).GetMethod("MethodToCopy", BindingFlags.Static | BindingFlags.Public);

    AssemblyName assemblyName = new AssemblyName("DynamicAssembly");
    AssemblyBuilder assemblyBuilder = AppDomain.CurrentDomain.DefineDynamicAssembly(assemblyName, AssemblyBuilderAccess.Run);

    ModuleBuilder moduleBuilder = assemblyBuilder.DefineDynamicModule("DynamicModule");

    TypeBuilder typeBuilder = moduleBuilder.DefineType("DynamicClass", TypeAttributes.Public);

    MethodBuilder methodBuilder = typeBuilder.DefineMethod("DynamicMethod", MethodAttributes.Public | MethodAttributes.Static, existingMethod.ReturnType, new Type[] { });

    ILGenerator ilGenerator = methodBuilder.GetILGenerator();
    
    // Copy IL code
    foreach (var instruction in existingMethod.GetMethodBody().GetILAsByteArray())
    {
        ilGenerator.Emit(instruction);
    }
    
    Type type = typeBuilder.CreateType();

    type.GetMethod("DynamicMethod").Invoke(null, null);
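One possible first step toward going "inside the implementation" is to decode the IL instruction by instruction instead of copying raw bytes, so that the call sites to be inlined can be located and resolved. The following is a minimal sketch, assuming non-generic methods and a little-endian platform. IlCallScanner and FindCalls are hypothetical names, and the small MyAlgorithm type is repeated only to keep the example self-contained. It only finds the calls; splicing the callee's IL into the caller (remapping locals, arguments, branch targets and the return) is the harder part and is not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Reflection.Emit;

public interface IFunction { float DoSum(float[] input); }

public class MyAlgorithm
{
    private readonly IFunction fun;
    public MyAlgorithm(IFunction fun) { this.fun = fun; }

    public void DoAlgorithm(float[] input, float[] output)
    {
        for (int j = 0; j < output.Length; j++)
            output[j] = 2.0f * fun.DoSum(input);
    }
}

// Hypothetical helper: lists the methods called from a method body.
public static class IlCallScanner
{
    // Lookup from opcode value to OpCode, built from the public fields of OpCodes.
    private static readonly Dictionary<short, OpCode> Table =
        typeof(OpCodes).GetFields(BindingFlags.Public | BindingFlags.Static)
            .Where(f => f.FieldType == typeof(OpCode))
            .Select(f => (OpCode)f.GetValue(null))
            .ToDictionary(op => op.Value);

    public static List<MethodBase> FindCalls(MethodBase method)
    {
        var calls = new List<MethodBase>();
        byte[] il = method.GetMethodBody().GetILAsByteArray();
        int pos = 0;
        while (pos < il.Length)
        {
            // Two-byte opcodes start with the 0xFE prefix byte.
            int first = il[pos++];
            short value = first == 0xFE
                ? unchecked((short)(0xFE00 | il[pos++]))
                : (short)first;
            OpCode op = Table[value];

            // call, callvirt, newobj, ldftn etc. carry a 4-byte method token.
            // Assumption: no generic context, so the single-argument overload is enough.
            if (op.OperandType == OperandType.InlineMethod)
                calls.Add(method.Module.ResolveMethod(BitConverter.ToInt32(il, pos)));

            pos += OperandSize(op, il, pos);
        }
        return calls;
    }

    // Number of operand bytes that follow the opcode.
    private static int OperandSize(OpCode op, byte[] il, int pos)
    {
        switch (op.OperandType)
        {
            case OperandType.InlineNone: return 0;
            case OperandType.ShortInlineBrTarget:
            case OperandType.ShortInlineI:
            case OperandType.ShortInlineVar: return 1;
            case OperandType.InlineVar: return 2;
            case OperandType.InlineI8:
            case OperandType.InlineR: return 8;
            case OperandType.InlineSwitch: return 4 + 4 * BitConverter.ToInt32(il, pos);
            default: return 4; // tokens, 32-bit immediates, long branches, float32
        }
    }
}
```

Calling IlCallScanner.FindCalls(typeof(MyAlgorithm).GetMethod("DoAlgorithm")) should report the interface method IFunction.DoSum. Since that is an interface call, the concrete target still has to be looked up on the runtime type of the "fun" instance (for example via Type.GetInterfaceMap) before its body can be read.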
  • I find the typos input.Lenght and output.Lenght distracting. ouput less so, but still... give your code the same amount of care and attention that you want potential answers to give. Commented Mar 27, 2024 at 17:16
  • @Wyck Not everybody is a native speaker. Things like this can literally never come to your mind as wrong, even with a decent English education and fluency. No fault in pointing it out (I wrote "Extention" for several years before it was pointed out), but don't just assume carelessness or laziness because someone misspelled some words. Commented Mar 27, 2024 at 18:42

2 Answers

If you can use C# 11+, you can try static abstract interface members and drop the reflection completely:

Declare the interfaces with static methods:

public interface IAlgorithm1
{
    public static abstract void DoAlgorithm(float[] input, float[] output);
}

public interface IFunction1
{
    public static abstract float DoSum(float[] input);
}

And then implement them:

public class MyAlgorithm1<T> : IAlgorithm1 where T : IFunction1
{
    public static void DoAlgorithm(float[] input, float[] output)
    {
        for(int j=0;j<output.Length;j++)
        {
            output[j] = 2.0f* T.DoSum(input); // call to the static method of the generic type
        }
    }
}

public class NormalSum1 : IFunction1
{
    public static float DoSum(float[] input)
    {
        float sum = 0.0f;
        for (int i = 0; i < input.Length; i++)
            sum += input[i];
        return sum;
    }
}

And then you can access the static method via MyAlgorithm1<NormalSum1>.DoAlgorithm. For example:

var output = new float[1];
MyAlgorithm1<NormalSum1>.DoAlgorithm(new []{1f}, output);

"but GPU Kernels are not allowed to have a method call inside"

Options I see:

  • Switch to structs instead of classes (plus applying inlining hints such as AggressiveInlining).
  • Use Roslyn source generators to generate the needed inlined code (for example, you can generate code for some static Generator.Call(MyAlgorithm.DoAlgorithm, new NormalSum()) method).
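The first option could look roughly like the sketch below. It assumes the stateless design from the question, so default(TFunction) is enough to reach the implementation; the type names mirror the question's and are otherwise illustrative. On the regular .NET JIT, generic code constrained to a struct is specialized per type, which makes the call a direct one that is eligible for inlining. Whether ILGPU's compiler then accepts the resulting kernel is not confirmed here.

```csharp
using System;
using System.Runtime.CompilerServices;

public interface IFunction
{
    float DoSum(float[] input);
}

// A struct implementation: generic code constrained to it is specialized per type.
public readonly struct NormalSum : IFunction
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public float DoSum(float[] input)
    {
        float sum = 0.0f;
        for (int i = 0; i < input.Length; i++)
            sum += input[i];
        return sum;
    }
}

public static class MyAlgorithm
{
    // Static and generic: no instance is stored, the function is chosen by type argument.
    public static void DoAlgorithm<TFunction>(float[] input, float[] output)
        where TFunction : struct, IFunction
    {
        // Assumption: TFunction carries no state, so its default value is sufficient.
        TFunction fun = default;
        for (int j = 0; j < output.Length; j++)
            output[j] = 2.0f * fun.DoSum(input);
    }
}

public static class Demo
{
    public static float[] Run()
    {
        var output = new float[2];
        MyAlgorithm.DoAlgorithm<NormalSum>(new[] { 1f, 2f, 3f }, output);
        return output; // both elements are 2 * (1 + 2 + 3) = 12
    }
}
```

The inlining attribute is only a hint, so the result should be verified on the actual target rather than assumed.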

5 Comments

That solution would be great and I already tried it, but GPU kernels are not allowed to have a method call inside (except for math functions like Sin, Cos, Sqrt etc.). Theoretically CUDA would allow that, but ILGPU has not reached that point yet.
@CoffeDeveloper It is hard to tell (do not know how this works), but maybe compiler hints like AggressiveInlining and AggressiveOptimization could help. Either way, using Reflection.Emit here would be tedious and brittle. Personally I would look into Roslyn + source generators so you can generate actual code. It will make your life much easier.
Another trick you can try: make structs instead of classes here. Maybe the compiler will be smart enough to inline.
Tried with inlining, nothing. I would like a way that is not reflection, though I'm used to it. I also made a feature request to ILGPU and am waiting for a response.
@CoffeDeveloper see the link to sharplab.io. Try inlining + switching to structs, i.e. class MyAlgorithm1<T> -> struct MyAlgorithm1<T>, and the same for NormalSum1.

You can make each of those interfaces return a static Func rather than implicitly creating that delegate yourself.

public interface IFunction
{
    public Func<float[], float> GetDoSum();
}


public class NormalSum : IFunction
{
    public Func<float[], float> GetDoSum() => DoSum;

    public static float DoSum(float[] input)
    {
        float sum = 0.0f;
        for (int i = 0; i < input.Length; i++)
            sum += input[i];
        return sum;
    }
}

public interface IAlgorithm
{
    public Action<float[], float[]> GetDoAlgorithm();
}

public class MyAlgorithm : IAlgorithm
{
    Func<float[], float> fun;

    public MyAlgorithm(IFunction fun)
    {
        this.fun = fun.GetDoSum();
    }

    public Action<float[], float[]> GetDoAlgorithm() => DoAlgorithm;

    // Instance method: it reads the captured delegate "fun", so it cannot be static.
    public void DoAlgorithm(float[] input, float[] output)
    {
        for (int j = 0; j < output.Length; j++)
        {
            output[j] = 2.0f * fun(input);
        }
    }
}

I must say, I don't know much about ILGPU, but what you are doing looks rather different from the documentation, which seems to indicate you are supposed to work with ArrayView and Kernel functions, not loops.

3 Comments

I would guess that this will have the same problem as my answer (see the comments): "but GPU kernels are not allowed to have a method call inside (except for math functions like Sin, Cos, Sqrt etc.). Theoretically CUDA would allow that, but ILGPU has not reached that point yet"
Like I said, I don't think this whole thing looks like it makes much sense to use in ILGPU anyway, although I admit I have no experience.
Kernels are the bodies of loops, but nothing prevents adding loops inside kernels; it was just an example. Your code won't work with ILGPU.
