I am working on a project that uses Boost.Interprocess to manage shared memory and pybind11 to expose a C++ class to Python. The goal is to access a NumPy array as a view into the shared memory without copying the data. However, when the shared memory manager is destroyed, accessing the array causes a segmentation fault.

Here is a minimal "working" example:

C++ Code:

#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>
#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/allocators/allocator.hpp>
#include <boost/interprocess/containers/vector.hpp>
#include <string>

namespace py = pybind11;
namespace bip = boost::interprocess;

typedef bip::allocator<float, bip::managed_shared_memory::segment_manager> ShmemAllocator;
typedef bip::vector<float, ShmemAllocator> MyVector;

class SharedMemoryManager {
public:
    SharedMemoryManager(const std::string& name, size_t size) 
        : shm_name(name), shm_size(size), shm(bip::open_or_create, name.c_str(), size) {
        
        allocator = new ShmemAllocator(shm.get_segment_manager());
        vector = shm.find_or_construct<MyVector>("MyVector")(*allocator);
        
        if (vector->empty()) {
            vector->resize(100);
            for (size_t i = 0; i < 100; ++i) {
                (*vector)[i] = static_cast<float>(i);
            }
        }
    }

    ~SharedMemoryManager() {
        delete allocator;
        bip::shared_memory_object::remove(shm_name.c_str());
    }

    py::array_t<float> get_array() {
        return py::array_t<float>(
            {static_cast<py::ssize_t>(vector->size())},
            {sizeof(float)},
            vector->data(),
            py::capsule(this, [](void *) { /* Empty deleter */ })
        );
    }

private:
    std::string shm_name;
    size_t shm_size;
    bip::managed_shared_memory shm;
    ShmemAllocator* allocator;
    MyVector* vector;
};

PYBIND11_MODULE(segfault_example, m) {
    py::class_<SharedMemoryManager>(m, "SharedMemoryManager")
        .def(py::init<const std::string&, size_t>())
        .def("get_array", &SharedMemoryManager::get_array);
}

Python Code:

import segfault_example

manager = segfault_example.SharedMemoryManager("MySharedMemory", 65536)  # 64KB shared memory

arr = manager.get_array()
print("Array before destruction:", arr[:5])  # This works fine

del manager

# This causes a segmentation fault
print(arr)

Issue

When the manager is deleted (del manager), its destructor removes the shared memory and the segment is unmapped. Any subsequent access to the NumPy array arr causes a segmentation fault, because the memory it refers to is no longer available.

Question

Obviously this is an unwanted situation. I would like to find out whether any of the following is possible without copying on every read while the shared memory is still available:

  • Is there a way to invalidate the NumPy array when the shared memory gets destroyed? I am able to track the reference counts in Python, so I know which arrays have been assigned to a variable in Python.
  • Based on the reference counts, I could copy the shared memory over to local memory just before the shared memory is destroyed. If I could then change the data pointer of the NumPy array, this would also be a suitable solution.
  • Anything else I have not thought of.

What I have tried

I tried to reassign the array's data pointer by tracking the pointers in a global variable, but that doesn't seem to have any effect, as NumPy (probably?) keeps its own copy of the address. Perhaps this can be done with lower-level Python calls?
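
To make the "copy on destruction, then repoint" idea concrete, here is a minimal library-free sketch. It deliberately uses neither NumPy nor Boost: Handle, DetachableView and DetachingManager are hypothetical names, and a heap buffer stands in for the shared memory. The view reads through a shared handle, so the manager can retarget every view at once when it goes away. This only works because the view type looks up the pointer on every access, which a stock NumPy array is not assumed to do.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical control block shared by the manager and all views.
// Views read through `ptr`, so retargeting it affects every view.
struct Handle {
    const float* ptr = nullptr;
    std::size_t size = 0;
    std::vector<float> local;  // filled only when the source goes away

    void detach() {
        local.assign(ptr, ptr + size);  // the single copy, made at destruction
        ptr = local.data();
    }
};

// Hypothetical zero-copy view that looks up the pointer on each access.
class DetachableView {
public:
    explicit DetachableView(std::shared_ptr<Handle> h) : handle_(std::move(h)) {}
    std::size_t size() const { return handle_->size; }
    float operator[](std::size_t i) const { return handle_->ptr[i]; }
private:
    std::shared_ptr<Handle> handle_;
};

// Stand-in for SharedMemoryManager; a heap buffer plays the shared memory.
class DetachingManager {
public:
    explicit DetachingManager(std::size_t n)
        : storage_(new float[n]), handle_(std::make_shared<Handle>()) {
        for (std::size_t i = 0; i < n; ++i) storage_[i] = static_cast<float>(i);
        handle_->ptr = storage_.get();
        handle_->size = n;
    }
    ~DetachingManager() {
        // Copy only if some view still holds the handle. The body runs
        // before the members are destroyed, so storage_ is still valid.
        if (handle_.use_count() > 1) handle_->detach();
    }
    DetachableView get_array() const { return DetachableView(handle_); }
private:
    std::unique_ptr<float[]> storage_;
    std::shared_ptr<Handle> handle_;
};
```

With this layout, a view obtained from get_array() stays readable after the manager is destroyed, and no copy is made when no view is outstanding. The sketch is single-threaded; concurrent readers would need synchronisation around detach().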

  • Not sure if I understood: you are asking "(...)is possible without copy-on-read when the shared memory is still available", but then you describe the situation where the shared memory is no longer available. If you want to use your data after del manager, then it has to be copied beforehand. Here is a related topic, stackoverflow.com/a/44682603/4165552, not sure if that's what you are looking for. Commented Sep 11, 2024 at 10:18
  • To me it sounds like: I still want to access my data (shared memory) after I said I don't need it anymore. arr = manager.get_array() will just create a "view" on your shared memory, not a copy. So maybe get_array in C++ should return a struct with the Python array plus a reference-counted something to the underlying shared memory. (Maybe something like std::shared_ptr<bip::managed_shared_memory>?) Commented Sep 11, 2024 at 10:35
  • NumPy arrays expect to own the memory they point to, either by direct ownership or by extending the owner's lifetime; they don't expect the memory to ever become invalid. Either you write your own array datatype to do what you want (holding weak references to the buffer), or you tie the manager's lifetime to the array's lifetime by setting the manager as the array's base, thus extending the manager's lifetime. Commented Sep 11, 2024 at 10:41
  • @pptaszni to clarify: the problem would be fixed if I copied the shared memory to local memory before returning, but I only want to do that when the manager is destroyed. Commented Sep 11, 2024 at 12:38
  • @PepijnKramer I believe you mean a pointer to a pointer and change the middleman? I have tried something like that, but couldn’t get it to work. Commented Sep 11, 2024 at 12:39
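
The lifetime-extension idea raised in the comments (a reference-counted owner, or the manager as the array's base) can be illustrated without pybind11 or Boost. The following is a minimal sketch only: Segment, ArrayView and Manager are hypothetical names, and a std::vector stands in for the shared memory segment. The view co-owns the segment through a std::shared_ptr, so dropping the manager does not free the buffer while a view is still alive.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the shared memory segment; its storage is
// released when the last owner goes away.
struct Segment {
    std::vector<float> data;
    explicit Segment(std::size_t n) : data(n) {
        for (std::size_t i = 0; i < n; ++i) data[i] = static_cast<float>(i);
    }
};

// Hypothetical non-copying view. It co-owns the segment, so the buffer
// cannot disappear while the view exists.
class ArrayView {
public:
    explicit ArrayView(std::shared_ptr<Segment> owner) : owner_(std::move(owner)) {}
    const float* data() const { return owner_->data.data(); }
    std::size_t size() const { return owner_->data.size(); }
    float operator[](std::size_t i) const { return owner_->data[i]; }
private:
    std::shared_ptr<Segment> owner_;
};

// Stand-in for SharedMemoryManager: it hands out views that share ownership.
class Manager {
public:
    explicit Manager(std::size_t n) : segment_(std::make_shared<Segment>(n)) {}
    ArrayView get_array() const { return ArrayView(segment_); }
    // Lets callers observe when the segment is finally released.
    std::weak_ptr<Segment> watch() const { return segment_; }
private:
    std::shared_ptr<Segment> segment_;
};
```

Here the segment is released only when both the manager and every view are gone. In pybind11 the analogous step would presumably be to pass an object that keeps the manager (or the mapping) alive as the array's base, instead of a capsule with an empty deleter. The trade-off is that del manager no longer frees the memory immediately.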
