Ruler does not consistently restore for state #6465

Open
rajagopalanand opened this issue Dec 29, 2024 · 0 comments
Labels
component/rules Bits & bobs todo with rules and alerts: the ruler, config service etc.

Comments

@rajagopalanand
Contributor

rajagopalanand commented Dec 29, 2024

Description

Currently, the Prometheus rule manager restores the "for" state of rule groups only after a restart. This is fine for Prometheus. In Cortex, however, a rule group can move from one ruler instance (r1) to another (r2) due to resharding. If r2 is already evaluating rule groups for that tenant, its manager will not restore the "for" state of the moved group, and alerts can end up in an incorrect state. For example, an alert can go from FIRING back to PENDING.
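
To make the failure mode concrete, here is a minimal toy model of the handover described above. It is only a sketch under stated assumptions: `ToyRuler`, `handover`, `HOLD_SECONDS`, the rule names, and the timestamps are all hypothetical and are not Cortex or Prometheus code. It only assumes that an alert fires once its condition has been true for the rule's "for" duration, counted from the time the evaluating instance first considers it active.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

HOLD_SECONDS = 300  # hypothetical "for: 5m" on the alerting rule


@dataclass
class ToyRuler:
    """Hypothetical stand-in for one ruler instance (not real Cortex code)."""
    restore_for_new_groups: bool
    active_at: Dict[str, float] = field(default_factory=dict)

    def evaluate(self, rule: str, now: float,
                 stored_active_at: Optional[float] = None) -> str:
        """Evaluate a rule whose condition is true and return the alert state."""
        if rule not in self.active_at:
            # First evaluation of this rule on this instance.
            if self.restore_for_new_groups and stored_active_at is not None:
                # Carry over the previously recorded activation time.
                self.active_at[rule] = stored_active_at
            else:
                # No restore: the "for" timer starts again from now.
                self.active_at[rule] = now
        elapsed = now - self.active_at[rule]
        return "FIRING" if elapsed >= HOLD_SECONDS else "PENDING"


def handover(restore_for_new_groups: bool) -> Tuple[str, str]:
    """Return the alert state on r1 before the move and on r2 after it."""
    r1 = ToyRuler(restore_for_new_groups)
    r2 = ToyRuler(restore_for_new_groups)
    r2.evaluate("other_rule", now=0)    # r2 already runs rules for the tenant
    r1.evaluate("my_alert", now=0)      # condition becomes true on r1
    before = r1.evaluate("my_alert", now=600)
    # r1 restarts; resharding moves the group to r2, which evaluates it once.
    after = r2.evaluate("my_alert", now=660,
                        stored_active_at=r1.active_at["my_alert"])
    return before, after


print(handover(restore_for_new_groups=False))  # ('FIRING', 'PENDING')
print(handover(restore_for_new_groups=True))   # ('FIRING', 'FIRING')
```

In this toy model, the only difference between the two runs is whether the receiving instance restores the activation time for a group it has not evaluated before, which mirrors the behavior this issue asks for.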

To Reproduce

  1. Create rules for a tenant with shard size > 1. For ease of testing, all ruler instances were running rules for the tenant.
  2. Wait for the alerting rule to go into FIRING.
  3. Restart the instance that was evaluating the alerting rule. The assumption here is that the ruler takes a while to restart, giving another ruler a chance to evaluate the alerting rule at least once.
  4. The alerting rule will go to PENDING.

Expected behavior

  • The alerting rule should stay in the FIRING state.

Additional Context

There is an open Prometheus PR that addresses this issue. Until that PR is approved, it is difficult to fix this issue.

rajagopalanand changed the title from "Ruler do not consistently restore for state" to "Ruler does not consistently restore for state" on Dec 29, 2024
dosubot bot added the component/rules label on Dec 29, 2024