How do I use a file like a memory buffer in Python?

Question

I don't know the correct terminology; maybe it's called a page file, but I'm not sure. I need a way to use an on-disk file as a buffer, like a bytearray. It should support operations like a = buffer[100:200] and buffer[33] = 127 without the code having to be aware that it's reading from and writing to a file in the background.

Basically I need the opposite of BytesIO, which uses memory with a file interface. I need a way to use a file with a memory buffer interface. Ideally it wouldn't write to the file every time the data is changed (but it's OK if it does).

I need this functionality because I use packages that expect data to be in a buffer object, but I only have 4MB of memory available, so it's impossible to load the files into memory. I need an object that acts like a bytearray, for example, but reads and writes data directly to a file instead of memory.

In my use case I need a MicroPython module, but a standard Python module might work as well. Are there any modules that would do what I need?

You might need some low-level file seeking for that.
– jvx8ss, Commented Nov 30, 2022 at 19:36
@jvx8ss "without the code having to be aware that it's reading from and writing to a file in the background" and "The reason I need this functionality is because I use packages that expect data to be in a buffer object".
– uzumaki, Commented Nov 30, 2022 at 20:20
@0x0fba mmap does not exist for MicroPython, so it's not an option. Also, it does not do what I need: it copies the full mapping into memory. It's not possible to map a 100MB file with mmap but only use 1MB of cache memory, and it's OS-specific functionality, not part of CPython.
– uzumaki, Commented Nov 30, 2022 at 20:25
@uzumaki Maybe you could make a class that internally uses file.seek and implements __getitem__ and __setitem__ to support the buffer[100:200] and buffer[33] = 127 that you want?
– jvx8ss, Commented Nov 30, 2022 at 20:42
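
For readers on standard CPython rather than MicroPython, the standard-library mmap module already offers the slice and index interface described in the question, and as far as I know the operating system loads mapped pages on demand rather than all at once. A minimal sketch, assuming a hypothetical scratch file created only for this illustration:

```python
import mmap
import os
import tempfile

# Hypothetical scratch file, created only so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "data.bin")
with open(path, "wb") as f:
    f.write(bytes(4096))

with open(path, "r+b") as f:
    # Length 0 maps the whole file.
    with mmap.mmap(f.fileno(), 0) as buffer:
        buffer[33] = 127                       # single-byte write takes an int
        buffer[100:104] = b"\xde\xad\xbe\xef"  # slice write must keep the length
        a = buffer[100:200]                    # slice read returns bytes
        first = buffer[33]                     # int index returns an int
        buffer.flush()                         # push changes to the file

# Read the file back through the normal file API to confirm the writes.
with open(path, "rb") as f:
    on_disk = f.read()
```

This does not help on MicroPython, where mmap is unavailable, which is why the answer below uses plain seek, read, and write calls.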

jvx8ss · Accepted Answer · 2022-11-30 22:29:21Z

1

Can something like this work for you?

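class Memfile:
    """Give a binary file opened in "r+b" mode a bytearray-like interface."""

    def __init__(self, file):
        self.file = file
        # Seeking to the end (whence=2) returns the file size in bytes.
        self.size = file.seek(0, 2)

    def _span(self, key):
        # Convert an int or slice key into absolute (start, stop) offsets.
        if isinstance(key, int):
            if key < 0:
                key += self.size
            if not 0 <= key < self.size:
                raise IndexError("Memfile index out of range")
            return key, key + 1
        if key.step not in (None, 1):
            raise ValueError("only contiguous slices are supported")
        start = 0 if key.start is None else key.start
        stop = self.size if key.stop is None else key.stop
        # Resolve negative offsets, then clamp them to the file bounds.
        start = min(max(start + self.size if start < 0 else start, 0), self.size)
        stop = min(max(stop + self.size if stop < 0 else stop, start), self.size)
        return start, stop

    def __getitem__(self, key):
        start, stop = self._span(key)
        self.file.seek(start)
        data = self.file.read(stop - start)
        # Like bytearray: an int index returns an int, a slice returns bytes.
        return data[0] if isinstance(key, int) else data

    def __setitem__(self, key, val):
        start, stop = self._span(key)
        if isinstance(key, int) and isinstance(val, int):
            val = bytes([val])  # allows buffer[33] = 127
        if stop - start != len(val):
            # Writing a different length would overwrite neighbouring data.
            raise ValueError("value length must match the indexed range")
        self.file.seek(start)
        self.file.write(val)

    def close(self):
        self.file.close()


if __name__ == "__main__":
    # Assumes the file "data" exists and holds at least 10 bytes.
    mf = Memfile(open("data", "r+b"))
    mf[0:10] = b'\x00' * 10
    print(mf[0:10])  # b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
    mf[0:2] = b'\xff\xff'
    print(mf[0:10])  # b'\xff\xff\x00\x00\x00\x00\x00\x00\x00\x00'
    print(mf[2])  # 0
    print(mf[1])  # 255
    mf[9] = 127
    print(mf[9])  # 127
    mf[0:4] = b'\xde\xad\xbe\xef'
    print(mf[0:4])  # b'\xde\xad\xbe\xef'
    mf.close()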

Note that if this solution fits your needs, you will still need to test it thoroughly.

edited Nov 30, 2022 at 22:29

answered Nov 30, 2022 at 21:24

2 Comments

uzumaki Over a year ago

Yes, this would be the basic implementation of what I need. Is there a caching package that could be patched into the class? One that would read from the cache if possible and flush the cache to disk when it's full. Also shouldn't it be self.file.read(key.stop - key.start) ?

jvx8ss Over a year ago

You're right, it should be self.file.read(key.stop - key.start). I don't know about the caching though, sorry.
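
The caching asked about in the comments above could be sketched roughly as follows. This is a minimal sketch and not an existing package: PagedFile, page_size, and max_pages are hypothetical names, indices are assumed to be non-negative and in range, and io.BytesIO stands in for a real file opened with "r+b". It keeps a few fixed-size pages in memory and writes a dirty page back only when it is evicted or when flush() is called.

```python
import io


class PagedFile:
    """Bytearray-style access to a file through a small write-back page cache."""

    def __init__(self, file, page_size=256, max_pages=4):
        self.file = file
        self.page_size = page_size
        self.max_pages = max_pages
        self.size = file.seek(0, 2)  # whence=2 seeks to the end of the file
        self.pages = {}              # page number -> [bytearray, dirty flag]
        self.order = []              # page numbers, least recently used first

    def _page(self, number):
        # Return the cached page, loading it (and evicting another) if needed.
        if number in self.pages:
            self.order.remove(number)
        else:
            if len(self.pages) >= self.max_pages:
                self._write_back(self.order.pop(0))
            self.file.seek(number * self.page_size)
            self.pages[number] = [bytearray(self.file.read(self.page_size)), False]
        self.order.append(number)
        return self.pages[number]

    def _write_back(self, number):
        # Drop a page from the cache, writing it to the file only if modified.
        data, dirty = self.pages.pop(number)
        if dirty:
            self.file.seek(number * self.page_size)
            self.file.write(data)

    def flush(self):
        # Write every dirty page to the file but keep the pages cached.
        for number, page in self.pages.items():
            if page[1]:
                self.file.seek(number * self.page_size)
                self.file.write(page[0])
                page[1] = False
        self.file.flush()

    def _range(self, key):
        # Sketch only: assumes non-negative, in-range indices and step 1.
        if isinstance(key, slice):
            start = 0 if key.start is None else key.start
            stop = self.size if key.stop is None else key.stop
        else:
            start, stop = key, key + 1
        if not 0 <= start <= stop <= self.size:
            raise IndexError("index out of range")
        return start, stop

    def __getitem__(self, key):
        start, stop = self._range(key)
        out = bytearray()
        pos = start
        while pos < stop:
            # Copy the part of the request that falls inside this page.
            number, offset = divmod(pos, self.page_size)
            count = min(self.page_size - offset, stop - pos)
            out += self._page(number)[0][offset:offset + count]
            pos += count
        return out[0] if isinstance(key, int) else bytes(out)

    def __setitem__(self, key, val):
        start, stop = self._range(key)
        if isinstance(key, int):
            val = bytes([val])
        if stop - start != len(val):
            raise ValueError("value length must match the indexed range")
        pos = start
        while pos < stop:
            number, offset = divmod(pos, self.page_size)
            count = min(self.page_size - offset, stop - pos)
            page = self._page(number)
            page[0][offset:offset + count] = val[pos - start:pos - start + count]
            page[1] = True  # mark dirty so eviction or flush() writes it out
            pos += count


backing = io.BytesIO(bytes(1024))  # stand-in for open("data", "r+b")
buf = PagedFile(backing, page_size=64, max_pages=2)
buf[33] = 127
buf[60:70] = b"0123456789"         # spans two pages
a = buf[60:70]
buf.flush()
```

Memory use is bounded by roughly page_size * max_pages bytes, so both values can be tuned to the RAM available. The list-based LRU bookkeeping is chosen for simplicity, not speed.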
