I'm still doing my F# performance testing and trying to get stack-based arrays working. For some more background see here: f# NativePtr.stackalloc in Struct Constructor.

As I understand it, each function call should get its own frame on the stack. This memory is then freed upon return by moving the stack pointer back. However, the code below causes a stack overflow error. I'm not sure why, as the stackalloc is performed inside a function.

Interestingly this only happens in Release mode, not Debug mode.

I believe the standard stack size in .NET is 1 MB and I haven't adjusted mine. I would expect an allocation of 8192 ints (32768 bytes) not to blow the stack.

#nowarn "9"

module File1 =

    open Microsoft.FSharp.NativeInterop
    open System
    open System.Diagnostics    

    let test () =
        let stackAlloc x =
            let mutable ints:nativeptr<int> = NativePtr.stackalloc x
            ()

        let size = 8192            
        let reps = 10000
        let clock = Stopwatch()
        clock.Start()
        for i = 1 to reps do            
            stackAlloc size
        let elapsed = clock.Elapsed.TotalMilliseconds
        let description = "NativePtr.stackalloc"
        Console.WriteLine("{0} ({1} ints, {2} reps): {3:#,##0.####}ms", description, size, reps, elapsed)

    [<EntryPoint>]
    let main argv = 
        printfn "%A" argv
        test ()
        Console.ReadKey() |> ignore
        0

UPDATE: After decompiling with ILSpy, as suggested by Fyodor Soikin, we can see that inlining has taken place during optimisation: the stackalloc now sits directly inside the loop in test. Kinda cool, and kinda scary!

using Microsoft.FSharp.Core;
using System;
using System.Diagnostics;
using System.IO;

[CompilationMapping(SourceConstructFlags.Module)]
public static class File1
{
    public unsafe static void test()
    {
        Stopwatch clock = new Stopwatch();
        clock.Start();
        for (int i = 1; i < 10001; i++)
        {
            IntPtr intPtr = stackalloc byte[8192 * sizeof(int)];
        }
        double elapsed = clock.Elapsed.TotalMilliseconds;
        Console.WriteLine("{0} ({1} ints, {2} reps): {3:#,##0.####}ms", "NativePtr.stackalloc", 8192, 10000, elapsed);
    }

    [EntryPoint]
    public static int main(string[] argv)
    {
        PrintfFormat<FSharpFunc<string[], Unit>, TextWriter, Unit, Unit> format = new PrintfFormat<FSharpFunc<string[], Unit>, TextWriter, Unit, Unit, string[]>("%A");
        PrintfModule.PrintFormatLineToTextWriter<FSharpFunc<string[], Unit>>(Console.Out, format).Invoke(argv);
        File1.File1.test();
        ConsoleKeyInfo consoleKeyInfo = Console.ReadKey();
        return 0;
    }
}

Further to this, the following may be of interest:

http://www.hanselman.com/blog/ReleaseISNOTDebug64bitOptimizationsAndCMethodInliningInReleaseBuildCallStacks.aspx

Optimization can also be tweaked using attributes:

https://msdn.microsoft.com/en-us/library/system.runtime.compilerservices.methodimploptions(v=vs.110).aspx?cs-save-lang=1&cs-lang=fsharp#code-snippet-1
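
As a rough sketch of the attribute approach, the allocating function can be moved to module level and marked with MethodImplOptions.NoInlining. This assumes that both the F# compiler and the JIT honour the attribute for a module-level function; it has not been verified against the Release build discussed here. stackAlloc is the name from the code above, while runReps and the read-back of the first element are hypothetical additions for illustration.

```fsharp
#nowarn "9" // suppress the warning about unverifiable native-pointer code

open System.Runtime.CompilerServices
open Microsoft.FSharp.NativeInterop

// Assumption: NoInlining keeps this as a real call, so the stackalloc'd
// memory belongs to this function's frame and is released on return.
[<MethodImpl(MethodImplOptions.NoInlining)>]
let stackAlloc (count: int) : int =
    let ints : nativeptr<int> = NativePtr.stackalloc count
    // Write and read one element so the buffer is actually used.
    NativePtr.set ints 0 count
    NativePtr.get ints 0

// Hypothetical helper: calls stackAlloc `reps` times and counts the calls
// that returned the expected value.
let runReps (size: int) (reps: int) : int =
    let mutable completed = 0
    for _ in 1 .. reps do
        if stackAlloc size = size then
            completed <- completed + 1
    completed
```

If the attribute is respected, runReps 8192 10000 should finish without growing the caller's frame, because each 32768-byte buffer is released when stackAlloc returns. Decompiling the Release build again is the way to check whether the call survived optimisation.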

  • A little odd here: it is the 256th alloc that causes the problem. Commented Feb 18, 2016 at 4:00
  • Weird, I guess the stack isn't 1 MB. Commented Feb 18, 2016 at 5:04

1 Answer

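This would happen if your stackAlloc function was inlined, thus causing stackalloc to happen within test's frame. Stack-allocated memory is only released when the enclosing function returns, so every iteration of the loop would add another 32768 bytes to that one frame until the stack runs out. This also explains why it would only happen in Release: inlining is a kind of optimization that is performed much less aggressively in Debug than in Release.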

To confirm this, I would try looking at your resulting code with ILSpy.

Why do you need to use stack-allocated arrays in the first place? This looks exactly like the kind of thing that Donald Knuth warned us about. :-)

2 Comments

Seriously though, I'm trying to write very low latency socket code. I'm benchmarking various languages, programming paradigms, data structures and memory management techniques to see how their performance levels compare. So far F# is standing up quite well, though I have had to abandon some of the idiomatic programming style, e.g. don't use sequences for iteration.
Thanks very much for your suggestion, the decompiled version using ILSpy shows that inlining has taken place. Stack Overflow is awesome and so are you!
