Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a friendly panic hook #45

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Add a friendly panic hook #45

wants to merge 3 commits into from

Conversation

Quantumplation
Copy link

A block producing node, and Amaru, is considered "mission critical" software, for the following reasons:

  • The network is relying on the availability and correctness of each alternative implementation in proportion to its stake
  • The stake pool operators are (i.e. eventually will be) relying on the continuing functioning of the block producer as a form of income

For that reason, the node should never just hard "crash"; even in error cases, the node should respond to these bad conditions in a more graceful way, perhaps eventually even including raising an alarm via any metrics / alerting systems built into the node.

Therefore, we install a panic hook; if a crash does arise in the wild, we'd love a bug report, so we can address it more appropriately. For that reason, we want to make the process to open the issue as welcoming as possible.

Here's what it looks like for an explicit panic:
image

Here's what it looks like for an implicit panic, like divide by zero:
image

Here's what it looks like for a contextual panic, like .expect():
image

I included a commit in the history with an explicit panic; if you want to test it yourself, you can checkout e9bfdf5931ff6b093e830bf0252c6be29eaa7372 and run cargo run daemon --peer-address 127.0.0.1.

Note that I borrowed a lot of code from Aiken; For this PR, I just included the needed utility functions in the panic.rs file, but we might want to pull some of these into their own files / modules, as they are in Aiken.

Given that a block producing node is considered mission critical, a panic should almost certainly be considered a bug, as the node should catch and respond to any errors in a more appropriate way. If we ever *do* panic, we show a nice message, and ask the user to open a ticket
For code reviewers to test if they'd like, just `cargo run daemon --peer-address 127.0.0.1`; will remove in the next commit
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant