2

We have something that looks like this:

abstract class TreeNode {
    internal abstract Branches GetChildren();   
}

class Leaf : TreeNode {
    internal override Branches GetChildren() => Branches.Empty();
}

class UnaryTreeNode : TreeNode {
    TreeNode child;
    internal override Branches GetChildren()
    {
    }
}

class BinaryTreeNode: TreeNode {
    TreeNode left;
    TreeNode right;
    internal override Branches GetChildren()
    {
    }
}

class VariadicTreeNode : TreeNode {
    TreeNode[] nodes;
    internal override Branches GetChildren()
    {
    }
}

These are still base classes; literally hundreds of classes inherit from the classes shown here.

So the obvious simple implementation is Branches is IReadOnlyList<TreeNode>; but this imposes an allocation penalty. We suppose it would be possible to reduce the number of allocations on descending the tree to 0 by some means. I chased returning a ReadOnlyMemory<TreeNode> but that isn't possible even by non-public reflection because the anchor of a ReadOnlyMemory<T> isn't actually arbitrary.

I turned around and chased a ReadOnlySpan<T> and found a plausible approach: MemoryMarshal.CreateReadOnlySpan<T>. If I understand the comments correctly; the resulting Span from MemoryMarshal doesn't old a reference to its container, so the trivial implementation does not work.

So I'm left wondering: does the following work:

ref struct Branches {
    internal object Anchor {get;init;}
    internal ReadOnlySpan<TreeNode> Span {get;init;}
}

abstract class TreeNode {
    internal abstract Branches GetChildren();   
}

class Leaf : TreeNode {
    internal override Branches GetChildren() => default;
}

class UnaryTreeNode : TreeNode {
    TreeNode child;
    internal override Branches GetChildren()
    {
        var span = System.Runtime.InteropServices.MemoryMarshal.CreateReadOnlySpan(ref child, 1);
        return new Branches { Anchor = this, Span = span };
    }
}

class BinaryTreeNode: TreeNode {
    TreeNode left;
    TreeNode right;
    internal override Branches GetChildren()
    {
        var span = System.Runtime.InteropServices.MemoryMarshal.CreateReadOnlySpan(ref left, 2);
        return new Branches { Anchor = this, Span = span };
    }
}

class VariadicTreeNode : TreeNode {
    TreeNode[] nodes;
    internal override Branches GetChildren()
    {
        return new Branches { Anchor = nodes, Span = nodes };
    }
}

A simple trial will show it working; however I can't validate if this code is sane directly: what happens when the object is moved by the garbage collector?

I have benchmarked the Span implementation; it's between 10% and 20% faster than the obvious array implementation.

11
  • 1
    As I understand it, the JIT is under no obligation to put left and right in adjacent locations in the memory allocated for the BinaryTreeNode class. Assuming that it will do that means you're relying on an implementation detail that could change without notice to you, and accessing index 1 suddenly gives you something other than right (possibly not even an object reference). Look into [InlineArray]. Commented Oct 20 at 17:34
  • @madreflection: Yeah I was wondering about that; I'm pretty sure it works if I put them into a struct. Commented Oct 20 at 17:54
  • 1
    On its own, as long as the struct's layout is specifically sequential, that might work but I still wouldn't rely on that. Using [InlineArray] will yield reliable adjacency because it's specifically intended to be array-like. Commented Oct 20 at 18:13
  • 1
    what happens when the object is moved by the garbage collector? - take a look at How does a Span survive garbage collection?. And following up on @madreflection's point, making a span from an [InlineArray] of length 2 is probably more sensible than assuming anything about field ordering. See e.g. Workaround on declaring a unsafe fixed custom struct array?. Commented Oct 20 at 20:56
  • 1
    You don't need to use MemoryMarshal.CreateReadOnlySpan() at all. For UnaryTreeNode you can make a 1-element span directly from a ref value like so: Span = new(ref child). And if you use an inline array you can make a span directly from it, see dotnetfiddle.net/MMAzrp. And now, since you're never using unsafe code, MSFT is guaranteeing that GC references are properly tracked. Commented Oct 20 at 21:57

1 Answer 1

4

Fundamentally, you can validate that your code is sane because the .NET runtime guarantees that is is sane. As long as you are not calling any unsafe methods and are exclusively using types and methods that are documented to be memory-safe, your code should be safe whenever the garbage collector compacts memory. And Microsoft specifically documents that Span<T> and ReadOnlySpan<T> are memory-safe:

Span<T> Struct

Provides a type-safe and memory-safe representation of a contiguous region of arbitrary memory.

As to how that was implemented, you can take a look at How does a Span survive garbage collection?.

That being said, you seem to have some uncertainty about the memory safety of some of the methods of the MemoryMarshal type. Luckily you can completely avoid using this type:

  1. In UnaryTreeNode you can use the Span(ref T reference) constructor to directly construct your span from your TreeNode child; like so: Span = new(ref child).

    This constructor is not marked as unsafe, so it's safe.

  2. In BinaryTreeNode you can replace your two fields with an inline array of two values[1]:

    class BinaryTreeNode: TreeNode {
        [System.Runtime.CompilerServices.InlineArray(2)] // Define a struct for an inline array of TreeNode elements of length 2
        struct TwoElementBuffer { private TreeNode _element0; }
        TwoElementBuffer values;
        internal override Branches GetChildren() => new () { Anchor = this, Span = values };
    }
    

    Inline arrays were introduced in C# 12 / .NET 8. As explained in the inline array specification:

    Runtime provides regular GC tracking for all elements in the [inline array] struct.

    Language will provide a type-safe/ref-safe way for accessing elements of inline array types. The access will be span based. This limits support to inline array types with element types that can be used as a type argument.

    Using an inline array also guarantees that the left and right nodes are mapped to the [0] and [1] elements of the Span, which your current implementation probably does but doesn't necessarily do (because the runtime is free to lay out fields in any order).

    Update

    While not marked as unsafe, the method you are using here, MemoryMarshal.CreateReadOnlySpan<T>(scoped ref T reference, int length), does have a documented safety warning:

    This method can be useful if part of a managed object represents a fixed array.

    Warning

    This method should be used with caution. It is dangerous because the length argument is not checked. Even though the ref is annotated as scoped, it will be stored into the returned span, and the lifetime of the returned span will not be validated for safety, even by span-aware languages.

    And in fact I was able to generate an AccessViolationException by returning a Span with a length much larger than 2, see here.

    Inline arrays are specifically designed to handle this situation correctly.

Putting all that together, your code can be simplified and made more "obviously" safe as follows:

ref struct Branches {
    internal object Anchor {get;init;}
    internal ReadOnlySpan<TreeNode> Span {get;init;}
}

abstract class TreeNode {
    internal abstract Branches GetChildren();   
}

class Leaf : TreeNode {
    internal override Branches GetChildren() => default;
}

class UnaryTreeNode : TreeNode {
    TreeNode child;
    internal override Branches GetChildren() => new () { Anchor = this, Span = new(ref child) };
}

class BinaryTreeNode : TreeNode {
    [System.Runtime.CompilerServices.InlineArray(2)] // Define a struct for an inline array of TreeNode elements of length 2
    struct TwoElementBuffer { private TreeNode _element0; }
    TwoElementBuffer values;
    internal override Branches GetChildren() => new () { Anchor = this, Span = values };
}

class VariadicTreeNode : TreeNode {
    internal TreeNode[] Nodes { get; init; }
    internal override Branches GetChildren() => new () { Anchor = Nodes, Span = Nodes };
}

Demo .NET 8 fiddle here.


[1] As suggested by madreflection in comments.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.