The code currently does not call `MPI_ABORT` when one of the processes hits an error, and there have been a few instances where MPI jobs kept running after one process terminated. I think the solution is to call `MPI_ABORT` whenever an assertion fails or an exception is thrown.
1. Create a new `@assert` macro and new exception types that internally call `MPI_ABORT`.
2. Run the entire solver inside a try-catch block that calls `MPI_ABORT` if an error is thrown.
The problem with option 1 is that exceptions thrown by code not written by us (for example, the `DomainError` thrown by the `sqrt` function in Base) won't call `MPI_ABORT`.
I don't like option 2 because it has the potential to swallow errors that might be recoverable.
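
For discussion, here is a rough sketch of what the two approaches could look like, assuming the code uses the MPI.jl package. The names `@mpi_assert` and `run_with_abort` are hypothetical and do not exist in the code base. `MPI.Abort(comm, errorcode)` and `MPI.Comm_rank` are the MPI.jl wrappers, and the error code `1` is an arbitrary choice.

```julia
using MPI

# Hypothetical replacement for @assert (approach 1):
# if the condition is false, report it and abort every rank.
macro mpi_assert(cond, msg="assertion failed")
    return quote
        if !($(esc(cond)))
            rank = MPI.Comm_rank(MPI.COMM_WORLD)
            println(stderr, "rank ", rank, ": ", $(esc(msg)), ": ", $(string(cond)))
            MPI.Abort(MPI.COMM_WORLD, 1)
        end
    end
end

# Hypothetical top-level wrapper (approach 2):
# run `solver` and abort every rank if an exception escapes it.
function run_with_abort(solver::Function)
    try
        return solver()
    catch err
        rank = MPI.Comm_rank(MPI.COMM_WORLD)
        # Print the error and its backtrace before aborting,
        # so the original failure stays visible in the job output.
        showerror(stderr, err, catch_backtrace())
        println(stderr, "\nrank ", rank, ": aborting all MPI processes")
        MPI.Abort(MPI.COMM_WORLD, 1)
    end
end

MPI.Init()

run_with_abort() do
    n = 4
    @mpi_assert n > 0 "n must be positive"   # passes, nothing happens
    sqrt(-1.0)                               # DomainError from Base, caught by the wrapper
end
```

In this sketch the wrapper only sees exceptions that nothing deeper in the call stack has caught, so an error that inner code catches and handles never reaches it. Whether that is enough to address the concern about recoverable errors depends on where such recovery would happen.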