0

I'm constructing a "scenegraph", which is a hierarchical data structure of Shape nodes (e.g. sphere, cube, mesh, etc. not shown in example code). A Shape can own zero or more child Shape instances. This forms a tree, which is built fully and then passed immutably to a set of worker threads, that render separate parts of the image based on this scenegraph. There is no requirement for the scenegraph to change once it is built, so the threads only need read access.

Example

My question involves the construction of the scenegrape. Initially, I had Shapes that own sub-Shapes. I.e. A Shape owns all of its children. Therefore I was not using pointers or references. This worked fine until I needed to add parent pointers.

Each child of a Shape must now maintain a pointer to its parent. I went down the path of using Arc (because eventually the scenegraph is shared between threads), and that let's me use a Weak ptr to a parent to avoid the cycle problem. But the implication of this is that the primary owning relationship, where Shapes own sub-Shapes, now has to be an Arc too, and this pointer is traversed many, many more times than the parent pointer.

It is my understanding that the features of Arc are only necessary during construction of this data structure and handing over to the thread pool. During rendering these Arc pointers is unfortunate overhead. The data structure is considered immutable at this point, so the features of Arc are no longer needed. They were only required during the building of the data structure, to support the creation of Weak parent pointers.

Interestingly, I couldn't find a way in safe code to set a child as owned by a Shape and set the parent pointer in the child. It's a kind of doubly-linked list, which I know is difficult in Rust. In the end I had to use a single line of unsafe to set the pointer in Shape::set_parent() below:

use std::sync::{Arc, Weak};

pub trait LocalIntersect {
    fn local_intersect(&self);
}

#[derive(Debug, Default)]
pub struct Shape {
    id: usize,
    members: Vec<Arc<Shape>>,
    parent: Option<Weak<Shape>>,
}

impl Shape {
    fn new(id: usize) -> Self {
        Self { id, members: vec![], parent: None }
    }

    fn add(&mut self, shape: Arc<Shape>) {
        self.members.push(shape);
    }

    fn set_parent(&self, parent: Weak<Shape>) {
        let this = self as *const Shape as *mut Shape;
        unsafe {
            (*this).parent = Some(parent);
        }
    }

    fn set_parent_to_all(&self, parent_weak: Weak<Shape>) {
        for member in &self.members {
            member.set_parent(parent_weak.clone());
        }
    }
}

impl LocalIntersect for Shape {
    fn local_intersect(&self) {
        // For now, check if there is a parent and if so, print its ID
        // (Later, we'll combine parent transforms, but not now)
        if let Some(parent_weak) = &self.parent {
            if let Some(parent) = parent_weak.upgrade() {
                println!("local_intersect {}, parent {}", self.id, parent.id);
            }
        }
    }
}

fn main() {
    let mut parent_shape = Shape::new(0);
    let child_shape = Shape::new(1);

    let child_arc = Arc::new(child_shape);

    // Add child to the parent's group
    parent_shape.add(Arc::clone(&child_arc));

    // Wrap the parent shape in Arc
    let parent_arc = Arc::new(parent_shape);

    // Set the parent pointer of each child in the group
    parent_arc.set_parent_to_all(Arc::downgrade(&parent_arc));

    // Share across threads (example with a thread)
    let shared_node = Arc::clone(&child_arc);
    let thread = std::thread::spawn(move || {
        shared_node.local_intersect();
    });
    thread.join().unwrap();

    // Output is:
    //   local_intersect 1, parent 0
}

Rust Playground

Given performance is a top priority for this application, is there an idiomatic way to handle this in Rust? Is this a good candidate for unsafe code? Should I drop Arc and Weak and just use normal child ownership and a raw pointer for the parent pointer? If so, how would I handle the dropping of this data structure later?

I could use something like id_tree or petgraph however those will also provide an overhead when looking up children or parents. I really just want a pointer that the renderer can directly follow back to the parent, without affecting the performance of the very common "get child" look-up.

I'm not new to Rust or C/C++ but I am new to writing unsafe Rust code.

EDIT: For example, in a C program I’d just add a raw parent pointer to all sub-Shapes prior to rendering, and null them all out afterwards before deallocating any Shapes, and it wouldn’t affect ownership or cause a dangling pointer problem. I’m looking for a similar process in Rust - just a nice, simple, fast parent pointer in an eventually immutable data structure, but without the chicken-and-egg issues.

24
  • 4
    As with any unsafe you should run this under miri (top right on the playground under tools). Your code contains UB! It modifies code which is pointed at by a shared reference, you can't do that. Commented Nov 18, 2023 at 22:25
  • Well, you can if you can guarantee that the modification only ever happens in a single-threaded context, before it is handed off to the thread pool. It’s not UB unless called in a multithreaded context, yeah? Admittedly the unsafe function should document that. Anyway, I’d rather find a safe way to do it, I’m just not sure if there is one. Commented Nov 18, 2023 at 23:59
  • 3
    No, it's still not safe. The only safe mechanism for mutating through a shared reference is UnsafeCell. The compiler is free to optimize as though your writes through the self as *const Shape as *mut Shape pointer never happened. However, you could trivially fix this by using RefCell<Weak<Shape>>. Commented Nov 19, 2023 at 1:42
  • 1
    @davidA Right. There's a few ways you can accomplish that. Let me think on it a bit and then I can write an answer. Commented Nov 19, 2023 at 1:52
  • 1
    Besides that, I think that arenas are particularly suited to your problem if you really need to squeeze performance, because not only they allocate contiguous chunks of memory, which is cache-friendly, but they actually allocate and deallocate through the memory allocator once, and their internal allocating strategy is very well suited for situations in which you start by allocating a lot of objects, then use them, then deallocate them all (as is, for instance, in compiler passes, or in rendering software). Commented Nov 19, 2023 at 8:01

1 Answer 1

1

If you really consider the tree as immutable once built, maybe could you pack all the nodes in a vector and work with indices instead of references (maybe this is called an arena, not sure if there is a subtlety). We have to be extra-careful with the indices when building the tree, but this can be totally encapsulated in the implementation (we never have to assign an id ourselves). In a multithreaded context, we just have to share/pass the vector of nodes as a whole.

Here is your example slightly transformed with this idea in mind. (I didn't understand the local_intersect() function, maybe it does not do what you want).

use std::sync::Arc;

mod shape_tree {

    #[derive(Debug)]
    pub struct Shape {
        id: usize,
        members: Vec<usize>,
        parent_id: usize,
    }

    impl Shape {
        pub fn new(
            shapes: &mut Vec<Self>,
            parent_id: Option<usize>,
        ) -> usize {
            let id = shapes.len();
            shapes.push(Self {
                id,
                members: Vec::new(),
                parent_id: usize::MAX,
            });
            if let Some(parent_id) = parent_id {
                shapes[parent_id].members.push(id);
                shapes[id].parent_id = parent_id;
            };
            id
        }

        pub fn id(&self) -> usize {
            self.id
        }

        pub fn members(&self) -> &[usize] {
            &self.members
        }

        pub fn parent<'a>(
            &self,
            shapes: &'a [Self],
        ) -> Option<&'a Self> {
            if self.parent_id == usize::MAX {
                None
            } else {
                Some(&shapes[self.parent_id])
            }
        }
    }
}

pub trait LocalIntersect {
    fn local_intersect(
        &self,
        shapes: &[shape_tree::Shape],
    );
}

impl LocalIntersect for shape_tree::Shape {
    fn local_intersect(
        &self,
        shapes: &[shape_tree::Shape],
    ) {
        // For now, check if there is a parent and if so, print its ID
        // (Later, we'll combine parent transforms, but not now)
        if let Some(parent) = self.parent(shapes) {
            println!("local_intersect {}, parent {}", self.id(), parent.id());
        }
    }
}

fn main() {
    let mut shapes = Vec::new();
    let parent_id = shape_tree::Shape::new(&mut shapes, None);
    let child_id = shape_tree::Shape::new(&mut shapes, Some(parent_id));
    let shapes = Arc::new(shapes.into_boxed_slice()); // seal the vector

    let thread = std::thread::spawn({
        let shapes = Arc::clone(&shapes);
        move || {
            println!("in thread");
            shapes[child_id].local_intersect(&shapes);
        }
    });
    thread.join().unwrap();

    println!("back to main");
    shapes[child_id].local_intersect(&shapes);
}
/*
in thread
local_intersect 1, parent 0
back to main
local_intersect 1, parent 0
*/
Sign up to request clarification or add additional context in comments.

1 Comment

Thanks for the answer. local_intersect() was just a hold-over from my code, and in my example I just had it demonstrating that the parent pointer is available and usable when running a function on a Shape. I like this arena idea a lot, but I am a bit concerned about the extra indirection that this introduces - doesn't each access into the arena involve both a pointer to the vector, and then pointer arithmetic to the offset? I do wonder in practice how much of an effect that would actually have though, and it might be completely mitigated by proximity keeping the CPU cache warm. Worth a try!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.