Why Does Importing Module In '__main__' Not Allow Multiprocessing To Use Module?
Solution 1:
The situation differs between Unix-like systems and Windows. On Unix-like systems, multiprocessing has traditionally used fork to create child processes that share a copy-on-write view of the parent's memory space. The child sees the parent's imports, including anything imported inside the parent's if __name__ == "__main__": block.
Windows has no fork, so a new process has to be started from scratch. Simply rerunning the parent program doesn't work, because that would run the whole program again. Instead, multiprocessing runs its own Python program that imports the parent's main script and then pickles/unpickles a view of the parent's object space that is, hopefully, sufficient for the child process.
That program is the __main__ for the child process, so the __main__ block of the parent script doesn't run. The main script is just imported like any other module. The reason is simple: running the parent's __main__ code would run the full parent program again, which multiprocessing must avoid.
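Which of the two mechanisms applies on a given machine can be checked at run time. This is a minimal sketch using the standard multiprocessing.get_start_method() and multiprocessing.get_all_start_methods() calls; the helper name describe_start_method is made up for illustration.

```python
import multiprocessing as mp


def describe_start_method():
    """Return the default start method and every method this platform supports."""
    # Illustrative helper, not part of the multiprocessing API.
    # Note: get_start_method() also fixes the default for the rest of the program.
    default = mp.get_start_method()          # e.g. "fork", "spawn" or "forkserver"
    available = mp.get_all_start_methods()   # Windows reports only ["spawn"]
    return default, available


default, available = describe_start_method()
print("default start method:", default)
print("available start methods:", available)
```

With "fork", the child inherits the parent's memory, so imports made under the guard are visible to it. With "spawn" (the only option on Windows), the main script is re-imported and they are not.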
Here is a test to show what is going on: a main module called testmp.py and a second module, test2.py, that is imported by the first.
testmp.py
import os
import multiprocessing as mp

# Module-level code: runs every time this file is executed or imported.
print("importing test2")
import test2

def worker():
    print('worker pid: {}, module name: {}, file name: {}'.format(os.getpid(),
        __name__, __file__))

# Guarded code: runs only when this file is the real __main__.
if __name__ == "__main__":
    print('main pid: {}, module name: {}, file name: {}'.format(os.getpid(),
        __name__, __file__))
    print("running process")
    proc = mp.Process(target=worker)
    proc.start()
    proc.join()
test2.py
import os

print('test2 pid: {}, module name: {}, file name: {}'.format(os.getpid(),
    __name__, __file__))
When run on Linux, test2 is imported once and the worker runs in the main module.
importing test2
test2 pid: 17840, module name: test2, file name: /media/td/USB20FD/tmp/test2.py
main pid: 17840, module name: __main__, file name: testmp.py
running process
worker pid: 17841, module name: __main__, file name: testmp.py
Under Windows, notice that "importing test2" is printed twice: the module-level code of testmp.py was executed two times. But "main pid" was printed only once, because the __main__ block wasn't run in the child. That's because multiprocessing changed the module name to __mp_main__ during import.
E:\tmp>py testmp.py
importing test2
test2 pid: 7536, module name: test2, file name: E:\tmp\test2.py
main pid: 7536, module name: __main__, file name: testmp.py
running process
importing test2
test2 pid: 7544, module name: test2, file name: E:\tmp\test2.py
worker pid: 7544, module name: __mp_main__, file name: E:\tmp\testmp.py
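The effect of the renamed module can be reproduced without starting a process. The sketch below only approximates what the child does on import and is not multiprocessing's own code: it uses runpy.run_path to execute a small, made-up script (guarded_main.py, with a json import placed under the guard) once as __main__ and once as __mp_main__, then compares the names each run defines.

```python
import runpy
import tempfile
from pathlib import Path

# Hypothetical main script: `os` is imported at module level,
# `json` only inside the __main__ guard.
SCRIPT = '''
import os

def worker():
    return json.dumps({"pid": os.getpid()})

if __name__ == "__main__":
    import json
'''


def load_names(run_name):
    """Execute SCRIPT under the given module name and return its global names."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "guarded_main.py"
        path.write_text(SCRIPT)
        return set(runpy.run_path(str(path), run_name=run_name))


as_parent = load_names("__main__")     # how the parent runs the script
as_child = load_names("__mp_main__")   # the name seen by a spawned child

print("json" in as_parent)  # True: the guard ran
print("json" in as_child)   # False: the guard was skipped
print("os" in as_child)     # True: module-level imports always run
```

Because json is bound only when the guard runs, a worker that calls json.dumps would raise NameError in a spawned child. Moving the import to module level (like os here) or into worker itself makes it available under both names.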