
Why Does Importing Module In '__main__' Not Allow Multiprocessing To Use Module?

I've already solved my problem by moving the import to the top declarations, but it left me wondering: Why can't I use a module that was imported in '__main__' in functions that are

Solution 1:

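The situation differs between Unix-like systems and Windows. On Unix-like systems, multiprocessing has traditionally used fork to create child processes, which share a copy-on-write view of the parent's memory space. The child sees everything the parent had imported at the moment of the fork, including anything the parent imported under if __name__ == "__main__":. This holds wherever the fork start method is in use; with the spawn start method (the default on macOS since Python 3.8), the Windows behavior described next applies instead.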
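The fork behavior can be sketched as follows. This is a minimal illustration, assuming a POSIX platform where the fork start method is available; the names worker and run_forked_worker are hypothetical, and the import inside run_forked_worker merely stands in for an import done under the __main__ guard.

```python
import multiprocessing as mp


def worker(queue):
    # `json` is never imported at module level. It exists in this process
    # only because the parent bound the name before forking.
    try:
        queue.put(json.dumps([1, 2]))
    except NameError as exc:
        queue.put("NameError: {}".format(exc))


def run_forked_worker():
    # Stand-in for an import placed under `if __name__ == "__main__":`.
    global json
    import json

    # Explicitly request fork; this raises ValueError on platforms
    # without fork (for example Windows).
    ctx = mp.get_context("fork")
    queue = ctx.Queue()
    proc = ctx.Process(target=worker, args=(queue,))
    proc.start()
    result = queue.get(timeout=30)
    proc.join()
    return result


if __name__ == "__main__":
    print(run_forked_worker())
```

Because the child is a copy of the parent's memory, the late-bound json name is already present when worker runs, so no NameError is raised.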

On Windows there is no fork, so a new process has to be executed. But simply rerunning the parent program doesn't work: it would run the whole program again. Instead, multiprocessing starts its own Python program that imports the parent's main script and then pickles/unpickles a view of the parent object space that is, hopefully, sufficient for the child process.

That program is the __main__ for the child process, so the __main__ block of the parent script doesn't run there. The main script is just imported like any other module. The reason is simple: running the parent's __main__ block would run the full parent program again, which multiprocessing must avoid.
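The effect of importing the main script under a different module name can be reproduced without starting any process. The sketch below is only an approximation of what the child does: it executes a small throwaway script with runpy under the names __main__ and __mp_main__ and lists which modules ended up bound. SCRIPT and guarded_names are hypothetical names chosen for this illustration.

```python
import os
import runpy
import tempfile

# A tiny stand-in for a main script with one guarded import.
SCRIPT = '''
import os                      # top-level import: always executed

if __name__ == "__main__":
    import json                # guarded import: only when run as the main script
'''


def guarded_names(run_name):
    """Run SCRIPT under the given module name and return the sorted
    non-dunder names it defined at module level."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_script.py")
        with open(path, "w") as fh:
            fh.write(SCRIPT)
        namespace = runpy.run_path(path, run_name=run_name)
    return sorted(name for name in namespace if not name.startswith("__"))


if __name__ == "__main__":
    print(guarded_names("__main__"))     # parent's view of the script
    print(guarded_names("__mp_main__"))  # view when imported under another name
```

Under __main__ both os and json are bound; under __mp_main__ only os is, which is why a function in the child cannot find a module that was imported inside the guard.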

Here is a test to show what is going on: a main module called testmp.py and a second module, test2.py, that the first one imports.

testmp.py

import os
import multiprocessing as mp

print("importing test2")
import test2

def worker():
    print('worker pid: {}, module name: {}, file name: {}'.format(os.getpid(), 
        __name__, __file__))

if __name__ == "__main__":
    print('main pid: {}, module name: {}, file name: {}'.format(os.getpid(), 
        __name__, __file__))
    print("running process")
    proc = mp.Process(target=worker)
    proc.start()
    proc.join()

test2.py

import os

print('test2 pid: {}, module name: {}, file name: {}'.format(os.getpid(),
        __name__, __file__))

When run on Linux, test2 is imported once and the worker runs in the forked child with the module name still __main__.

importing test2
test2 pid: 17840, module name: test2, file name: /media/td/USB20FD/tmp/test2.py
main pid: 17840, module name: __main__, file name: testmp.py
running process
worker pid: 17841, module name: __main__, file name: testmp.py

Under Windows, notice that "importing test2" is printed twice: the top-level code of testmp.py was executed two times. But "main pid" was printed only once, because its __main__ block wasn't run in the child. That's because multiprocessing changed the module name to __mp_main__ during the import.

E:\tmp>py testmp.py
importing test2
test2 pid: 7536, module name: test2, file name: E:\tmp\test2.py
main pid: 7536, module name: __main__, file name: testmp.py
running process
importing test2
test2 pid: 7544, module name: test2, file name: E:\tmp\test2.py
worker pid: 7544, module name: __mp_main__, file name: E:\tmp\testmp.py
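In practice this suggests keeping anything a worker needs at module top level (or inside the worker itself) and leaving only process start-up under the guard. A minimal sketch of that layout follows; the worker function and its use of the statistics module are illustrative choices, not taken from the original question.

```python
import multiprocessing as mp
import os  # top-level imports run in the parent and in every re-imported child


def worker(numbers):
    # A function-local import is resolved in whichever process calls the
    # function, so it is available regardless of the start method.
    import statistics
    return os.getpid(), statistics.mean(numbers)


if __name__ == "__main__":
    # Only process start-up lives under the guard, so a child that
    # re-imports this file as __mp_main__ does not start more processes.
    with mp.Pool(processes=2) as pool:
        for pid, mean in pool.map(worker, [[1, 2, 3], [10, 20, 30]]):
            print(pid, mean)
```

With this layout the worker finds os and statistics whether the child was forked or started fresh and handed a re-imported copy of the script.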
