1

I am running Python with MPI on a supercomputing cluster. I am getting strange nondeterministic behavior that I think is a result of I/O complications that are not present on the single machines I'm used to working with.

One of the things my code does is to create directories using os.makedirs somewhat frequently. I know also that I generally should not write small amounts of data to the filesystem-- this can end up with the data getting stuck in some buffer and not written for a long time. I suspect this may be happening with my directory creation calls, and then later code tries to write to files inside the directory before it exists. Two questions:

  • is creating a new directory effectively the same thing as writing a small amount of data?
  • When forcing data to be written, I use flush and os.fsync. These require a file object. Is there an equivalent to make sure the directory has been created?
1
  • while not os.path.isdir(target): # wait or retry? Commented Jul 29, 2014 at 21:35

1 Answer 1

1

Creating a new directory is effectively the same as writing small amount of data. It adds an inode.

The only way mkdir (or os.mkdirs) should fail is if the directory exists - otherwise the directory will always be created. In terms of the data being buffered - it's unlikely that this would happen - even journaled filesystems will sync out pretty regularly.

If you're having non-deterministic behavior, just wrap your directory creation / writing a file into that directory inside a try / except / finally that makes a few efforts? But really - the need for such code hints at something much more sinister and is likely a bigger issue.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.