Expose info on Probe status #36
Comments
I'm not sure if it's possible to do what you want on just Only "probe" metric I can find from my servers is
I use this to provide the I don't think the threshold and window can be known from varnishstat. Do you know where I could query those values in a programmatic way? From this Varnish code, it looks like the window of 8 is hardcoded and can't be changed. https://github.com/varnishcache/varnish-cache/blob/master/bin/varnishstat/varnishstat_curses.c#L680 (code that renders the varnishstat stdout visualization). We could then easily do a loop of 8 and provide at least
No, sadly the info isn't exposed by
(Which I realise is a horrible format to try and parse, because it's meant to be human-readable – and parsing it would be quite fragile, because it could break with any Varnish update that changes the format of this text)
An illustration of the graph that I would love to be able to generate would be something like the following: Which would indicate a short-lived error (lasting for 4 scrape cycles). It would be a combination of two queries, maybe like
Hmm, looks like it's not hardcoded to 8. I guess that is the max window and the most that particular tool will render. I get this for my prod Varnish.
It's quite horrible to parse, but one could do a simple regexp for that. But if it can change per server, then it will require more robust logic to parse it for each server. Doing that with indentation or something is quite fragile and could change between Varnish versions.
It would be very error-prone and fragile to parse the human-readable text, yes. I wonder what would be the most resilient way, though. Maybe assuming that the good, threshold, and window counts are always on line 1 of each block, and that the avg. response time is always on line 2. If that's the assumption, then a regex like

As for the avg. response line, it would probably just be a matter of looking at the tail end of the second line

As for identifying the backends that have probes (not all backends do – see the first line in my example output in the issue), it would be a question of finding a block of "line starting at char 0, then indented lines and then a blank line" or something along those lines.

For the sake of having some more test data, here's the full output of
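The parsing approach discussed in this thread could be sketched roughly as follows. This is only an illustration under stated assumptions, not the exporter's code: `ProbeStatus`, `parseBackendList`, and both regular expressions are hypothetical names made up here, the sample text is adapted from the example output in the issue, and the two-space indentation of the detail lines is a guess because the original whitespace is not preserved. Instead of relying strictly on "line 1" and "line 2" of each block, the sketch matches the two detail lines by content, and treats any non-indented line as the start of a new backend.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// ProbeStatus is a hypothetical container for the probe fields of one backend.
type ProbeStatus struct {
	Backend     string
	HasProbe    bool // false for backends listed as "(no probe)"
	Good        int
	Threshold   int
	Window      int
	AvgResponse float64
}

var (
	// Matches e.g. "Current states good: 8 threshold: 3 window: 8".
	statesRe = regexp.MustCompile(`good:\s*(\d+)\s+threshold:\s*(\d+)\s+window:\s*(\d+)`)
	// Matches e.g. "Average response time of good probes: 0.073743".
	avgRe = regexp.MustCompile(`Average response time of good probes:\s*([0-9.]+)`)
)

// parseBackendList walks the human-readable text line by line. A line that
// starts at column 0 opens a new backend; indented lines belong to the most
// recently opened backend. The layout is an assumption and may differ
// between Varnish versions.
func parseBackendList(out string) []ProbeStatus {
	var result []ProbeStatus
	sc := bufio.NewScanner(strings.NewReader(out))
	for sc.Scan() {
		line := sc.Text()
		if strings.TrimSpace(line) == "" || strings.HasPrefix(line, "Backend name") {
			continue // skip blank lines and the column header
		}
		if line[0] != ' ' && line[0] != '\t' {
			// The first whitespace-separated field is taken as the backend name.
			result = append(result, ProbeStatus{Backend: strings.Fields(line)[0]})
			continue
		}
		if len(result) == 0 {
			continue // indented line before any backend: ignore
		}
		cur := &result[len(result)-1]
		if m := statesRe.FindStringSubmatch(line); m != nil {
			cur.Good, _ = strconv.Atoi(m[1])
			cur.Threshold, _ = strconv.Atoi(m[2])
			cur.Window, _ = strconv.Atoi(m[3])
			cur.HasProbe = true
		} else if m := avgRe.FindStringSubmatch(line); m != nil {
			cur.AvgResponse, _ = strconv.ParseFloat(m[1], 64)
		}
	}
	return result
}

// sampleOutput is adapted from the issue text; the indentation is assumed.
const sampleOutput = `Backend name Admin Probe
boot.dummy probe Healthy (no probe)
boot.goto.00000000.(XX.YY.ZZ.WW).(http://service.name:80) probe Healthy 8/8
  Current states good: 8 threshold: 3 window: 8
  Average response time of good probes: 0.073743
  Oldest ================================================== Newest
  4444444444444444444444444444444444444444444444444444444444444444 Good IPv4
  HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Happy
`

func main() {
	for _, b := range parseBackendList(sampleOutput) {
		fmt.Printf("%+v\n", b)
	}
}
```

Matching on content rather than on line position keeps the sketch working if extra detail lines are added to a block, but it would still break if the field labels themselves change.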
Varnish+ 4.1
1.4.1
I've been building up a dashboard based on the Prometheus metrics exposed by this exporter. One of my goals was to try and make the dashboard that the Varnish Agent provides redundant. So far I've been able to replicate everything but one: Healthcheck probe status
That is, for each backend have the number of successful probes, the threshold for when the service will be marked unhealthy, and the total window size.
In the dashboard from Varnish Agent, you can see the status for each backend, e.g. "Healthy 8/8".
The reason why it isn't possible to gather this using this exporter is that the info isn't exposed by varnishstat – instead that info needs to be retrieved using varnishadm.

Some example output from running varnishadm backend.list -p (backends defined by the goto-director):

    varnishadm -n [ident] backend.list -p
    Backend name Admin Probe
    boot.dummy probe Healthy (no probe)
    boot.goto.00000000.(XX.YY.ZZ.WW).(http://service.name:80) probe Healthy 8/8
      Current states good: 8 threshold: 3 window: 8
      Average response time of good probes: 0.073743
      Oldest ================================================== Newest
      4444444444444444444444444444444444444444444444444444444444444444 Good IPv4
      XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX Good Xmit
      RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR Good Recv
      HHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHH Happy
Ideally, this could be exposed in a handful of new metrics, like
I've never touched Go before, but if this type of information is desired by others, I wouldn't mind trying to create a PR for scraping varnishadm to expose these metrics...

Kind regards
Morten