Metrics refactor
This document describes the fabio metrics layer and documents the transition from the go-metrics based layer to a more flexible approach. Once that transition has been completed the documentation for the transition will be removed.
Fabio metrics started out with an implementation of the go-metrics library, mostly for Graphite, since that's what we were using at eCG. This became somewhat more flexible over time, but the design doesn't make it easy to add providers like DogStatsD, Prometheus and others which support tagged metrics.
Also, the go-metrics library aggregates histograms internally which does not work well with providers like statsd and Circonus which do the histogram aggregation on the server. Fabio does not support multiple metrics providers simultaneously which makes migration between metrics systems difficult. And last but not least the go-metrics library hasn't seen significant updates in over a year. The last commit is from 28 Nov 2016.
Fabio currently supports Graphite, statsd, Circonus and stdout for debugging.
The metric names fall into two groups: service metrics and internal metrics.
Service metric names are generated with the template defined in `metrics.names`, which is by default `<service>.<domain>.<path>.<host:port>`. Internal metric names are hard-coded, like `http.status.code.200` or `notfound`.
All metric names can have a prefix, which is configured through a template defined in `metrics.prefix` and defaults to `<hostname>.<exec name>`.
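The name template described above can be pictured as a simple placeholder substitution. The following is a minimal sketch, not fabio's implementation: the `target` type, the `expand` and `clean` functions, and the rule of replacing `.`, `:` and `/` with `_` are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// target holds the attributes that may appear in a service metric name.
// The type and its fields are hypothetical and exist only for this sketch.
type target struct {
	Service, Domain, Path, HostPort string
}

// clean replaces characters that would add extra levels to a
// dot-separated namespace. The exact rule is an assumption.
func clean(s string) string {
	return strings.NewReplacer(".", "_", ":", "_", "/", "_").Replace(s)
}

// expand substitutes the placeholders of a name template with the
// cleaned attribute values of the target.
func expand(tmpl string, t target) string {
	r := strings.NewReplacer(
		"<service>", clean(t.Service),
		"<domain>", clean(t.Domain),
		"<path>", clean(t.Path),
		"<host:port>", clean(t.HostPort),
	)
	return r.Replace(tmpl)
}

func main() {
	t := target{
		Service:  "svc-a",
		Domain:   "example.com",
		Path:     "/foo",
		HostPort: "10.0.0.1:8080",
	}
	// prints "svc-a.example_com._foo.10_0_0_1_8080"
	fmt.Println(expand("<service>.<domain>.<path>.<host:port>", t))
}
```

A prefix template could be expanded the same way and prepended to the result.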
The Graphite and statsd providers provide aggregated histograms whereas the Circonus provider sends events to the server.
There are several issues open for additional providers:
- Riemann
- DogStatsD with tag support
- Prometheus
- InfluxDB
- CloudWatch
- StatsD without histogram aggregation
Fabio currently provides the following metrics:
Depending on the metrics provider, the timer aggregation happens either in the metrics library (go-metrics: statsd, graphite) or in the system of the metrics provider (Circonus).
Name | Type | Description |
---|---|---|
`http.status.code.${status_code}` | timer | aggregation over all http requests per status code |
`notfound` | counter | counts all http route lookup failures |
`requests` | timer | aggregation of all http requests |
`ws.conn` | gauge | current number of open web socket connections |
`tcp.conn` | counter | counts the number of successful TCP proxy connections |
`tcp.connfail` | counter | counts the number of failed TCP connections |
`tcp.noroute` | counter | counts the number of TCP route lookup failures |
`tcp_sni.conn` | counter | counts the number of successful TCP+SNI proxy connections |
`tcp_sni.connfail` | counter | counts the number of failed TCP+SNI connections |
`tcp_sni.noroute` | counter | counts the number of TCP+SNI route lookup failures |
`{{ metrics.name }}` | timer | |
`{{ metrics.name }}.rx` | counter | |
`{{ metrics.name }}.tx` | counter | |
- timer - counts events and provides an average throughput and latency number
- counter - counts events and provides a monotonically increasing value
- gauge - current value
A new metrics layer must be flexible enough to support aggregation in-process or on the server. It needs to support flat namespaces and tags, and it needs to be compatible with existing fabio installations.
These metrics libraries are in use by other projects:
- `armon/go-metrics` supports circonus, graphite, statsd, statsite, datadog and prometheus.
- `go-kit/kit/metrics` supports cloudwatch, dogstatd, expvar, graphite, influx, pcp, prometheus, statsd. Circonus was supported but later removed because of flaky tests.

`go-kit/kit/metrics` is the best fit for what fabio provides today and what users want. Existing go-metrics implementations could be written as legacy drivers, if necessary.
The problem that `go-kit` does not solve, however, is the name generation for the different metrics providers. Providers like Graphite and statsd, which do not support tags, need a flat namespace with the tag values coded into the name of the metric. Tagged providers can have more generic names and provide additional names as tags. Then we also need to support the existing legacy metric names.
Fabio could make these names configurable with sensible defaults for each provider. However, this would add quite a number of config options which would almost never be changed. Also, we need to decide which attributes should be tagged and which should be part of the name and whether those attributes should be configurable at all or even for each provider.
Metric names could be evaluated at runtime, e.g. through the Go template engine. However, we would need to determine the allocation overhead for this evaluation since this code is in the hot path and is executed frequently.
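One way to get a first impression of that overhead is to render a name with `text/template` and compare it with plain string concatenation using `testing.AllocsPerRun`. This is a rough sketch under stated assumptions: the `route` type, the template text and both helper functions are hypothetical, and the numbers printed depend on the Go version, so none are quoted here.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
	"text/template"
)

// route is a hypothetical stand-in for the data a name template would see.
type route struct {
	Service string
	Host    string
}

// nameTmpl is parsed once; only Execute runs in the hot path.
var nameTmpl = template.Must(template.New("name").Parse("{{.Service}}.{{.Host}}"))

// render evaluates the template for one route.
func render(r route) string {
	var b strings.Builder
	if err := nameTmpl.Execute(&b, r); err != nil {
		return ""
	}
	return b.String()
}

// concat builds the same name without the template engine.
func concat(r route) string {
	return r.Service + "." + r.Host
}

func main() {
	r := route{Service: "svc-a", Host: "example_com"}
	fmt.Println(render(r)) // prints "svc-a.example_com"

	// Average number of allocations per call over 1000 runs.
	fmt.Println("template allocs/op:", testing.AllocsPerRun(1000, func() { _ = render(r) }))
	fmt.Println("concat allocs/op:", testing.AllocsPerRun(1000, func() { _ = concat(r) }))
}
```

A proper measurement would use a benchmark with `-benchmem` against the real template and route data.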
Since providers are either tagged or not tagged, we could provide two names for each metric and use one or the other depending on the provider.
Legacy Name | Flat name | Tagged name |
---|---|---|
`http.status.code.${status_code}` | `http.status.code.${status_code}` | `http.status code:${status_code}` |
`notfound` | `http.noroute` | `http.noroute` |
`requests` | `http.requests` | `http.requests` |
`ws.conn` | `ws.conn` | `ws.conn` |
`tcp.conn` | `tcp.conn` | `tcp.conn` |
`tcp.connfail` | `tcp.connfail` | `tcp.connfail` |
`tcp.noroute` | `tcp.noroute` | `tcp.noroute` |
`tcp_sni.conn` | `tcp_sni.conn` | `tcp_sni.conn` |
`tcp_sni.connfail` | `tcp_sni.connfail` | `tcp_sni.connfail` |
`tcp_sni.noroute` | `tcp_sni.noroute` | `tcp_sni.noroute` |
`{{ metrics.name }}` | `{{ metrics.name }}` | `{{ metrics.tagged_name }} service:<svc> host:<host:port>` |
`{{ metrics.name }}.rx` | `{{ metrics.name }}.rx` | `{{ metrics.tagged_name }}.rx service:<svc> host:<host:port>` |
`{{ metrics.name }}.tx` | `{{ metrics.name }}.tx` | `{{ metrics.tagged_name }}.tx service:<svc> host:<host:port>` |
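The two-names-per-metric idea could look roughly like the following sketch. The `names` type, its fields and the `render` method are hypothetical. The example only mirrors the status code row of the table and prints the tagged form as `name key:value` for readability, whereas a real tagged provider would receive name and tags separately.

```go
package main

import (
	"fmt"
	"strings"
)

// names carries both spellings of one metric. Hypothetical type.
type names struct {
	Flat   string   // for providers without tag support, e.g. Graphite, statsd
	Tagged string   // for providers with tag support
	Tags   []string // "key:value" pairs sent alongside the tagged name
}

// statusCode returns the names for the per-status-code http timer.
func statusCode(code int) names {
	return names{
		Flat:   fmt.Sprintf("http.status.code.%d", code),
		Tagged: "http.status",
		Tags:   []string{fmt.Sprintf("code:%d", code)},
	}
}

// render picks the spelling that matches the provider's capabilities.
func (n names) render(tagged bool) string {
	if !tagged {
		return n.Flat
	}
	if len(n.Tags) == 0 {
		return n.Tagged
	}
	return n.Tagged + " " + strings.Join(n.Tags, " ")
}

func main() {
	n := statusCode(200)
	fmt.Println(n.render(false)) // prints "http.status.code.200"
	fmt.Println(n.render(true))  // prints "http.status code:200"
}
```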