Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle signals on separate thread #876

Closed
heifner opened this issue Oct 2, 2024 · 6 comments · Fixed by #993
Closed

Handle signals on separate thread #876

heifner opened this issue Oct 2, 2024 · 6 comments · Fixed by #993
Assignees
Labels
discussion enhancement New feature or request OCI Work exclusive to OCI team

Comments

@heifner
Copy link
Member

heifner commented Oct 2, 2024

Currently during application startup application_base::startup, startup_thread is used to handle signals until all plugin->startup() has completed. This allows for example, ctrl-c to interrupt replay of the blocklog during chain_plugin startup.

Once startup is complete, the signal handling is moved to the main thread. This means that a ctrl-c or SIGTERM is not processed until the application_base::exec() calls io_serv.poll() on the main thread. If the main thread is busy, for example applying a large range of blocks during syncing, then shutdown has to wait until the current task completes. It requires the main thread can call poll() to process the signal to set the is_quitting atomic flag.

Instead we could continue to use a seperate thread for signal processing (except for SIGHUP which we want on the main thread since it resets all the loggers). This thread then can immediately set the is_quitting flag allowing the application to immediately exit.

@enf-ci-bot enf-ci-bot moved this to Todo in Team Backlog Oct 2, 2024
@bhazzard bhazzard added enhancement New feature or request 👍 lgtm and removed triage labels Oct 3, 2024
@bhazzard bhazzard added this to the Spring v1.1.0 Cusp milestone Oct 3, 2024
@bhazzard
Copy link

bhazzard commented Oct 3, 2024

This introduces another way to signal an event outside of the existing priority queue mechanism for a special case (shutting down). We may want to consider a more unified approach.

@heifner
Copy link
Member Author

heifner commented Oct 7, 2024

I missed the discussion on this, so maybe I'm not understanding the issue. What I'm suggesting we do for this issue is to have a seperate thread (just use the thread we are already using on startup) who's sole job is to call app().quit(). This approach is outside the priority queue. All extra threads are outside the existing priority queue as it is used for the main thread (and read-only threads). I don't see any reason to coordinate this with the main thread.

@heifner
Copy link
Member Author

heifner commented Oct 7, 2024

In addition, the user experience is so bad with the current implementation, I would argue this is a bug. Leap 5.0 does not require 10 minutes to shutdown when a user hits ctrl-c.

@greg7mdp
Copy link
Contributor

greg7mdp commented Oct 8, 2024

There were multiple main discussion topics that I recall:

  • whether this should be done separately or as as part of cleanup and define consistent pattern for handling nodeos plugin shutdown #842.
    I suggested that and mentioned that the fact that we do currently create the separate thread to catch signals during the plugin initialization, but then switch to catching signals in the main thread possibly had a rationale that we forgot about, warranting maybe some extra thought before reverting to use the thread throughout.
  • maybe catching signals on the main thread was on purpose, because we never wanted the main thread to be stuck on a task for a long time. It made sense when calls such as get_info or get_table_rows depended on the main thread being responsive. If this is not the case anymore, maybe it is acceptable to have the main thread busy for some time and just check for quit, but not process other messages.
  • however having the main thread busy for a long time and just checking for quit prevents it from processing other messages from the queue, including some which may be of a higher priority. So maybe we should have a mechanism which has the main thread checking the queue occasionally, instead of just checking for the quit boolean.

After a 1/2 hour discussion, it was decided that we needed to talk about this more with you (@heifner) present. Might very well be that the fix you suggest solves the immediate issue and should be implemented now, even if a more comprehensive solution is implemented later.

PS: Anyone, feel free to correct or add to what I recollected above.

@greg7mdp greg7mdp changed the title Handle signals on seperate thread Handle signals on separate thread Oct 18, 2024
@greg7mdp
Copy link
Contributor

@heifner should we close this issue?

@heifner
Copy link
Member Author

heifner commented Oct 30, 2024

@heifner should we close this issue?

No, needed for #985

@heifner heifner self-assigned this Nov 1, 2024
@heifner heifner added the OCI Work exclusive to OCI team label Nov 1, 2024
@github-project-automation github-project-automation bot moved this from Todo to Done in Team Backlog Nov 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion enhancement New feature or request OCI Work exclusive to OCI team
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

4 participants