I need a way for a Python process to figure out whether it was launched as part of a multiprocessing pool.
I am using Dask to parallelize calculations, via dask.distributed.LocalCluster. For UX reasons (this is part of a library for a specialized scientific task), I want the Dask cluster setup to happen in a module that the user can import.
This means that I cannot use the usual guard:

    import dask.distributed as dd

    if __name__ == '__main__':
        dd.LocalCluster()

to prevent child processes from starting their own cluster, since I need to start the cluster from within a module that is itself imported.
By digging around with psutil, I found that the child processes are launched with a --multiprocessing-fork command-line option and that they run the multiprocessing.spawn.spawn_main function. I am thinking of checking for the presence of the --multiprocessing-fork flag to determine whether the current process is part of the pool.
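For concreteness, the check I have in mind would look roughly like the sketch below. The helper name has_multiprocessing_fork_flag and the sample command lines are hypothetical and only illustrate the idea. The function takes the command line as an explicit argument because I have not verified that sys.argv still contains the flag by the time my module is imported in the child.

```python
from typing import Sequence

# Flag observed (via psutil) on the command line of the child processes.
MP_FORK_FLAG = "--multiprocessing-fork"


def has_multiprocessing_fork_flag(cmdline: Sequence[str]) -> bool:
    """Return True if the given command line contains the fork flag.

    Hypothetical helper: it only inspects the list it is given and makes
    no claim about how reliably the flag is present across platforms or
    multiprocessing start methods.
    """
    return MP_FORK_FLAG in cmdline


# Illustrative command lines (made up to resemble what I saw, not real logs).
parent_cmdline = ["python", "analysis.py"]
child_cmdline = [
    "python",
    "-c",
    "from multiprocessing.spawn import spawn_main; spawn_main(...)",
    "--multiprocessing-fork",
]

print(has_multiprocessing_fork_flag(parent_cmdline))
print(has_multiprocessing_fork_flag(child_cmdline))
```

In the library, the command line would presumably come from psutil.Process().cmdline(), which is how I spotted the flag in the first place, and the cluster would only be started when the helper returns False.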
Is this the right approach? Is there a better way? I could not find any obvious documentation on the multiprocessing.spawn.spawn_main function.
Thanks a lot!