I need to find a way for a python process to figure out if it was launched as part of a multiprocessing pool.

I am using dask to parallelize calculations with dask.distributed.LocalCluster. For UX reasons (this is part of a library for a specialized scientific task), I want the dask cluster setup to happen in a module that the user can import.

This means that I cannot use the usual guard:

import dask.distributed as dd

if __name__=='__main__':
    dd.LocalCluster()

to prevent child processes from starting their own cluster, since I need to start the cluster from within a module that is itself imported.

By inspecting the processes with psutil, I found that the child processes are launched with a --multiprocessing-fork command-line option and run the multiprocessing.spawn.spawn_main function. I am thinking of checking for the --multiprocessing-fork flag to determine whether the current process is part of the pool.

Is this the right approach? Is there a better way? I could not find any obvious documentation for the multiprocessing.spawn.spawn_main function.
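For illustration, here is a minimal sketch of the flag check described above, next to a stdlib alternative that compares the multiprocessing process name with the default "MainProcess". The helper names has_fork_flag and in_child_process, and the name parameter, are hypothetical. The sketch assumes the workers are started through multiprocessing and that nothing renames the main process; it has not been verified against dask's own worker startup.

```python
import multiprocessing
from typing import Optional, Sequence

# Flag observed on the child command line (taken from the question).
FORK_FLAG = "--multiprocessing-fork"


def has_fork_flag(cmdline: Sequence[str]) -> bool:
    """Return True if the given command line carries the multiprocessing flag."""
    return FORK_FLAG in cmdline


def in_child_process(name: Optional[str] = None) -> bool:
    """Return True if the process name differs from the default 'MainProcess'.

    `name` is a hypothetical override, present only to make the check testable.
    """
    if name is None:
        name = multiprocessing.current_process().name
    return name != "MainProcess"
```

The command line passed to has_fork_flag could come from psutil.Process().cmdline(), as in the question. The name-based check needs no third-party package, but it would give the wrong answer if a child process were explicitly named "MainProcess".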

Thanks a lot!

1 Answer

The simplest thing I can think of is to check whether distributed.worker.Worker._instances has any entries. Worker subprocesses should always have at least one. This is essentially what distributed.get_worker() does, which raises ValueError if not running on a worker.

2 Comments

Given that _instances isn't part of the public API, I'd suggest calling distributed.get_worker() inside a try block, where except ValueError: handles the "not in a worker" case and a subsequent else: handles the "in a worker" case. That would be the better approach for stability.
Hey! I actually tried that approach first, but it seems that distributed.get_worker() also raises ValueError in the worker, probably because the import happens before dask is fully set up.
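For reference, the try/except/else pattern around distributed.get_worker() discussed in this thread could be sketched as follows. The helper name on_dask_worker is hypothetical, and the sketch assumes it is called once the worker is running (for example from inside a task) rather than at import time, where, as reported above, get_worker() may still raise ValueError.

```python
def on_dask_worker() -> bool:
    """Return True if distributed.get_worker() finds a worker in this process."""
    try:
        from distributed import get_worker
    except ImportError:
        # distributed is not installed, so this cannot be a dask worker.
        return False
    try:
        get_worker()
    except ValueError:
        # get_worker() raises ValueError when no worker is found.
        return False
    else:
        return True
```

Because it relies only on the public get_worker() function, this avoids touching the private Worker._instances attribute.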
