Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added UserReplicaGroupMetrics #6463

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

7h3-3mp7y-m4n
Copy link

What this PR does:
The main purpose of this pr is to add a goroutine to update tenant HA replica groups to metrics periodically.
Which issue(s) this PR fixes:
Fixes #6397

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: 7h3-3mp7y-m4n <[email protected]>
Copy link
Contributor

@yeya24 yeya24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. It looks great so far. Some suggestions

userReplicaGroupCount: promauto.With(reg).NewGaugeVec(prometheus.GaugeOpts{
Name: "ha_tracker_user_replica_group_count",
Help: "Number of HA replica groups tracked for each user.",
}, []string{"user"}),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to clean up the metric in CleanupHATrackerMetricsForUser so that when a user is not active anymore the corresponding user dimension is removed from the metric

time.Sleep(10 * time.Second)
t.updateUserReplicaGroupCount()
}
}()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this to the loop method?
You can follow cleanupOldReplicasLoop and do it similarly.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay :)

@@ -212,6 +218,12 @@ func NewHATracker(cfg HATrackerConfig, limits HATrackerLimits, trackerStatusConf
}
t.client = client
}
go func() {
for {
time.Sleep(10 * time.Second)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do 30s interval and make the interval a const

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for sure! :)

`
require.NoError(t, testutil.GatherAndCompare(reg, strings.NewReader(expectedMetrics), "ha_tracker_user_replica_group_count"))
delete(tracker.replicaGroups, "user1")
tracker.userReplicaGroupCount.WithLabelValues("user1").Set(0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need set this manually.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay!

ha_tracker_user_replica_group_count{user="user2"} 3
ha_tracker_user_replica_group_count{user="user1"} 0
`
require.NoError(t, testutil.GatherAndCompare(reg, strings.NewReader(expectedMetricsAfterCleanup), "ha_tracker_user_replica_group_count"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add CleanupHATrackerMetricsForUser for one test user and make sure ha_tracker_user_replica_group_count is cleaned up?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah why not :)

@7h3-3mp7y-m4n
Copy link
Author

hey @yeya24 , I’m currently working on the tests and ran into an issue. Whenever I use CleanupHATrackerMetricsForUse for cleanup, I get a nil pointer error. This might take me some time to resolve, so I’d really appreciate any thoughts or insights you might have on this!

@yeya24
Copy link
Contributor

yeya24 commented Jan 3, 2025

Hey @7h3-3mp7y-m4n, can you please share the error you got? And your test code if possible.

I checked out your code locally and tried to update the metrics clean up test. yeya24@9a35c7d
I didn't see any null pointer issue so maybe I missed something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HA Tracker: Expose per tenant metric of number of HA replica groups
2 participants