I want to perform updates on a file via multiple processes in parallel. These processes all open this file for write in parallel.
Abbreviations used:
`f`: file, `p[i]`: process i, `b[i]`: buffer for FD i opened by process i.
Questions:
- When a file is opened and a stream is established, does the path of `f` get translated internally to an inode number? I read that an inode number is unique only within a partition.
- When the same file is opened in parallel, how does Linux manage the writes?
- If `b[1]` is full, it will flush. Does this mean all `p[i]` will start to see the changes in the file? This does not happen. So where are the contents of the buffer flushed? If COW happens, does this mean Linux creates a copy of the dirty page on disk? Or does it do something like MVCC? (I am assuming that only dirty pages get rewritten instead of all pages being copied, since otherwise modifying a huge file would be troublesome.)
- As an experiment, I opened a file in the `vi` editor. I deleted the file from a terminal, then added some text in the editor and saved. The file had been recreated. In another case, when the file was not edited, it no longer existed after I closed the editor. It seems like COW is at work. But since the file was deleted in the second case, did COW use the in-memory pages of the file to recreate it? What if the file was 10 GB in size and unable to fit in memory at once?
- If I have `f` opened in `p[1]` and edit it at location 101 while `p[2]` modifies `f` at location 100, adds 10 bytes of data and flushes before `p[1]`, would my changes at location 101 by `p[1]` still appear at location 101, or would my text have moved to location 111? With unbuffered I/O, changes would be seen more quickly by other processes.
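To make the last two scenarios concrete, here is a minimal sketch of how they could be tested, assuming Python 3 on Linux. Two independent `open` calls in one process stand in for `p[1]` and `p[2]`, and the function names `overlapping_writes_demo` and `unlinked_file_demo` are made up for illustration. It is only an experiment under those assumptions, not a statement of how any particular editor or filesystem behaves.

```python
import os
import tempfile


def overlapping_writes_demo():
    """Write at offsets 101 and 100 through two independent opens of one file."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(b"." * 120)          # 120 placeholder bytes

        p1 = open(path, "r+b")           # stands in for p[1], buffered
        p2 = open(path, "r+b")           # stands in for p[2], buffered

        p1.seek(101)
        p1.write(b"A")                   # stays in p1's user-space buffer for now

        p2.seek(100)
        p2.write(b"0123456789")          # 10 bytes starting at offset 100
        p2.flush()                       # handed to the kernel before p1's data

        p1.flush()                       # now p1's byte goes to offset 101
        p1.close()
        p2.close()

        with open(path, "rb") as f:
            data = f.read()
        # The size is unchanged and bytes are overwritten in place, not inserted.
        return len(data), data[100:111]
    finally:
        os.unlink(path)


def unlinked_file_demo():
    """Delete a file by name while a descriptor to it is still open."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello")
        # The path and the open descriptor refer to the same (device, inode) pair.
        by_path = os.stat(path)
        by_fd = os.fstat(fd)
        same_inode = (by_path.st_dev, by_path.st_ino) == (by_fd.st_dev, by_fd.st_ino)

        os.unlink(path)                  # removes the name, not the open file
        name_exists = os.path.exists(path)

        os.lseek(fd, 0, os.SEEK_SET)
        still_readable = os.read(fd, 5)  # data is still reachable through fd
        return same_inode, name_exists, still_readable
    finally:
        os.close(fd)


if __name__ == "__main__":
    print(overlapping_writes_demo())
    print(unlinked_file_demo())
```

In this sketch the write at offset 101 lands at offset 101 regardless of the earlier 10-byte write at offset 100, because a write replaces bytes at the descriptor's offset and nothing is shifted. The second function shows that after `unlink` the name is gone while the open descriptor can still read the data.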