Sharing Synchronization Objects Through Global Namespace Vs As A Function Argument

If I need to share a multiprocessing.Queue or a multiprocessing.Manager (or any of the other synchronization primitives), is there any difference in doing it by defining them at th

Solution 1:

As mentioned in the multiprocessing programming guidelines:

Explicitly pass resources to child processes

On Unix using the fork start method, a child process can make use of a shared resource created in a parent process using a global resource. However, it is better to pass the object as an argument to the constructor for the child process.

Apart from making the code (potentially) compatible with Windows and the other start methods this also ensures that as long as the child process is still alive the object will not be garbage collected in the parent process. This might be important if some resource is freed when the object is garbage collected in the parent process.
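A minimal sketch of the pattern the guideline recommends, assuming a trivial worker that squares a number (the names `worker`, `q` and `n` are illustrative and not from the original question). The queue is created in the parent and handed to the child through `args`, so nothing depends on module-level state:

```python
import multiprocessing as mp


def worker(q, n):
    """Runs in the child process and uses only the queue it was handed."""
    q.put(n * n)


if __name__ == "__main__":
    # Forcing "spawn" is an illustrative choice: it is the only start method
    # available on Windows, so the pattern is shown not to rely on fork.
    ctx = mp.get_context("spawn")
    q = ctx.Queue()
    p = ctx.Process(target=worker, args=(q, 7))
    p.start()
    print(q.get(timeout=30))  # 49
    p.join()
```

Because the `Process` object holds a reference to `q` in its arguments, the parent also keeps the queue alive for as long as it keeps the process object around.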

The issue is how the spawn and forkserver start methods work under the hood (Windows supports only spawn). Instead of cloning the parent process together with its memory and file descriptors, they do not start the child as a copy of the parent. With spawn, a new process is created from scratch: a fresh Python interpreter is started, told which modules to import, and launched. This means a global variable in the child will be a brand new Queue rather than the parent's one.
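The effect can be observed with a small experiment. This is a sketch under stated assumptions: `GLOBAL_Q`, `peek` and `probe` are hypothetical names, and the child reports what it saw through a second queue that is passed explicitly. The parent puts an item into the module-level queue, then a spawned child tries to read from "the same" global:

```python
import multiprocessing as mp
import queue

# Module-level queue. A spawned child re-imports this module and therefore
# builds its own, unrelated queue object under the same name.
GLOBAL_Q = mp.Queue()


def peek(q, timeout=1.0):
    """Return the next item from q, or None if nothing arrives in time."""
    try:
        return q.get(timeout=timeout)
    except queue.Empty:
        return None


def probe(result_q):
    """Runs in the child: report what the child's GLOBAL_Q contains."""
    result_q.put(peek(GLOBAL_Q))


if __name__ == "__main__":
    GLOBAL_Q.put("hello from parent")

    ctx = mp.get_context("spawn")
    result_q = ctx.Queue()  # passed explicitly, so it really is shared
    p = ctx.Process(target=probe, args=(result_q,))
    p.start()
    # Under spawn the child is expected to report None: its GLOBAL_Q is empty.
    print("child saw:", result_q.get(timeout=30))
    p.join()
    # The item never left the parent's own queue.
    print("parent still has:", peek(GLOBAL_Q))
```

With the fork start method the child would instead inherit the parent's queue, which is exactly why code relying on globals can appear to work on Unix and then fail on Windows.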

Another implication is that the objects you want to pass to the new process must be picklable, as they will be sent to it through a pipe.
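A rough way to check this constraint ahead of time is to try pickling a candidate argument. The helper below, `is_picklable`, is a hypothetical name used only for illustration. It uses the plain `pickle` module, which is only an approximation: multiprocessing uses its own pickler with extra reducers, which is why its own primitives such as `Queue` can still be passed as `Process` arguments.

```python
import math
import pickle
import threading


def is_picklable(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    candidates = {
        "tuple of plain data": ("job", 42),
        "importable function": math.sqrt,
        "lambda": lambda x: x * 2,           # has no importable name
        "threading.Lock": threading.Lock(),  # wraps an OS-level resource
    }
    for label, obj in candidates.items():
        print(f"{label}: {is_picklable(obj)}")
```

Plain data and importable functions pass; lambdas and thread locks do not, so they cannot be handed to a spawned child as arguments.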

Solution 2:

Just summarizing the answer from Davin Potts:

The only portable solution is to share Queue() and Manager().* objects by passing them as arguments, never as global variables. The reason is that on Windows all global variables are re-created (rather than copied) by literally running the module code from the beginning; very little information is actually copied from the parent process to the child process. So a brand new Queue() would be created, and of course (without some undesirable and confusing magic) it can't possibly be connected to the Queue() in the parent process.

My understanding is that there is no disadvantage to passing Queue(), etc. as parameters; I can't find any reason why anyone would want to use a non-portable solution with global variables, but of course I may be wrong.
