Environmental Variables
CHAINERMN_FORCE_ABORT_ON_EXCEPTIONS
If this variable is set to a non-empty value, ChainerMN installs a global hook on Python’s sys.excepthook that calls MPI_Abort() when an unhandled exception occurs. See "MPI process hangs after an unhandled Python exception" for details. ChainerMN issue #236 may also help in understanding the problem.
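As a rough illustration, the variable can be set by a small Python launcher before the MPI job starts. This is only a sketch: launch_env and run_with_abort_hook are hypothetical helper names, and the command in the usage comment (mpiexec with a train.py script) is an assumed example, not something ChainerMN prescribes.

```python
import os
import subprocess


def launch_env(base_env=None):
    """Return a copy of the environment with the ChainerMN abort hook enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Any non-empty value enables the hook, according to the description above.
    env["CHAINERMN_FORCE_ABORT_ON_EXCEPTIONS"] = "1"
    return env


def run_with_abort_hook(command, base_env=None):
    """Run `command` (a list of arguments) with the variable set.

    Hypothetical usage, assuming mpiexec and a train.py script exist:
        run_with_abort_hook(["mpiexec", "-n", "4", "python", "train.py"])
    """
    return subprocess.run(command, env=launch_env(base_env), check=True)
```

Depending on the MPI implementation, environment variables may need to be forwarded explicitly to processes on remote nodes, so it is worth checking that the variable is actually visible inside each rank.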
Execution Control
- chainermn.global_except_hook.add_hook()
Add a global hook function that captures all unhandled exceptions.
The hook calls MPI_Abort() to force all processes to abort, so that one failing process does not leave the others hanging. This is useful when you run your training script on a cloud platform.
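The mechanism can be sketched as follows. This is a minimal illustration of the idea, not ChainerMN's actual implementation: make_abort_hook, mpi_abort, and add_hook_sketch are hypothetical names, and the sketch assumes mpi4py is installed when the abort is triggered.

```python
import sys
import traceback


def make_abort_hook(abort, stream=sys.stderr):
    """Build an excepthook that reports the exception, then calls `abort`."""
    def hook(exc_type, exc_value, exc_tb):
        # Print the traceback first so the failure is visible in the job log.
        traceback.print_exception(exc_type, exc_value, exc_tb, file=stream)
        stream.flush()
        # Then terminate the whole job instead of only this process.
        abort(1)
    return hook


def mpi_abort(errorcode=1):
    """Abort every process in MPI_COMM_WORLD (assumes mpi4py is installed)."""
    from mpi4py import MPI  # imported lazily so the sketch loads without MPI
    MPI.COMM_WORLD.Abort(errorcode)


def add_hook_sketch():
    """Install the hook globally, in the spirit of add_hook()."""
    sys.excepthook = make_abort_hook(mpi_abort)
```

Passing the abort function in as a parameter keeps the hook easy to exercise without an MPI runtime. In a real training script you would simply call chainermn.global_except_hook.add_hook() near the top of the script.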