MPI_ABORT Usage #137

Open
JaredCrean2 opened this issue Apr 19, 2018 · 1 comment

Comments

@JaredCrean2
Contributor

The code currently does not call MPI_ABORT when one of the processes hits an error, and there have been a few instances where an MPI job kept running after one process terminated. I think the solution is to call MPI_ABORT whenever an assertion fails or an exception is thrown.

@JaredCrean2
Contributor Author

I can see two ways of doing this:

  1. Create a new @assert macro and new exception types that internally call MPI_ABORT
  2. Run the entire solver inside a try/catch block that calls MPI_ABORT if an error is thrown

The problem with option 1 is that exceptions thrown by code not written by us (for example, the DomainError thrown by the sqrt function in Base) won't call MPI_ABORT.
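A minimal sketch of that gap, assuming a hypothetical `@mpi_assert` macro and a hypothetical `fake_abort` function that stands in for the real abort call (with MPI.jl this would presumably be `MPI.Abort(MPI.COMM_WORLD, 1)`). Neither name exists in the code base. The stand-in only counts calls so that the example runs without MPI:

```julia
# Hypothetical stand-in for the abort call; a real run would use something
# like MPI.Abort(MPI.COMM_WORLD, 1) from MPI.jl, which does not return.
const abort_calls = Ref(0)
fake_abort() = (abort_calls[] += 1; nothing)

# Hypothetical macro for option 1: take the abort path when the condition fails.
macro mpi_assert(cond)
    msg = string(cond)
    return quote
        if !($(esc(cond)))
            fake_abort()
            throw(AssertionError($msg))  # only reached because the stand-in returns
        end
    end
end

# A failed @mpi_assert goes through the abort path...
try
    @mpi_assert 1 + 1 == 3
catch err
    @assert err isa AssertionError
end

# ...but an exception raised inside Base never touches it.
try
    sqrt(-1.0)
catch err
    @assert err isa DomainError
end
```

After both blocks run, the stand-in has been called once: by the failed assertion, not by the `DomainError`.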

I don't like option 2 because it has the potential to swallow errors that might be recoverable.
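For comparison, a rough sketch of option 2. The wrapper name `run_with_abort` and its `abort` argument are assumptions made for illustration. In a real run `abort` would presumably be `() -> MPI.Abort(MPI.COMM_WORLD, 1)` from MPI.jl, but passing it in as a function keeps the sketch runnable without MPI:

```julia
# Hypothetical wrapper for option 2. `solver` is the top-level entry point and
# `abort` stands in for () -> MPI.Abort(MPI.COMM_WORLD, 1).
function run_with_abort(solver::Function, abort::Function)
    try
        return solver()
    catch err
        # Report the error before aborting so the message is not lost.
        showerror(stderr, err, catch_backtrace())
        println(stderr)
        abort()
        rethrow()  # only reached when `abort` returns, as the stand-in does
    end
end

aborted = Ref(false)
caught = Ref{Any}(nothing)

# A solver that finishes normally passes its result through untouched.
result = run_with_abort(() -> 42, () -> (aborted[] = true))

# A solver that throws from Base code still triggers the abort path.
try
    run_with_abort(() -> sqrt(-1.0), () -> (aborted[] = true))
catch err
    caught[] = err
end
```

Errors handled by a try/catch closer to where they are raised never reach this wrapper, so it only sees whatever propagates all the way out of `solver`.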
