Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client observations creating too many uri tag values #4290

Open
sunruh opened this issue Jul 8, 2024 · 11 comments · May be fixed by #4296
Open

Client observations creating too many uri tag values #4290

sunruh opened this issue Jul 8, 2024 · 11 comments · May be fixed by #4296
Assignees

Comments

@sunruh
Copy link

sunruh commented Jul 8, 2024

After observability support was added to RestTemplateEurekaHttpClient in #4255 by using the RestTemplateBuilder metrics are being generated for Eureka client requests.

The ClientRequestObservationConvention adds the URI template of the requests as a low cardinality observability/metric key-value/tag.

The current implementations of EurekaHttpClient based on RestTemplate and WebClient assemble the full URL of the requsts and pass it to the client implementation. This causes a high number of values for the URI tag in the metrics, especially if the lastDirtyTimestamp changes often (due to e.g. flaky health of some component). By default only 100 values are allowed for the URI tag, additional values would be dropped and the metrics using these values are ignored.

Could be mitigated by passing at least the lastDirtyTimestamp as a uriVariable to the RestTemplate calls.

Version: spring-cloud-netflix-eureka-client 4.1.2

Sample of generated (prometheus) metrics

The following metrics where available after making the Eureka server unavailable twice and thus causing a status change to UNKNOWNand dirying of the InstanceInfo.

# HELP http_client_requests_seconds_max  
# TYPE http_client_requests_seconds_max gauge
http_client_requests_seconds_max{client_name="localhost",error="ResourceAccessException",exception="ResourceAccessException",method="GET",outcome="UNKNOWN",status="CLIENT_ERROR",uri="/eureka/apps/delta"} 0.0410048
http_client_requests_seconds_max{client_name="localhost",error="ResourceAccessException",exception="ResourceAccessException",method="PUT",outcome="UNKNOWN",status="CLIENT_ERROR",uri="/eureka/apps/<appName>/<instanceId>?status=UP&lastDirtyTimestamp=1720440338928"} 0.0
http_client_requests_seconds_max{client_name="localhost",error="ResourceAccessException",exception="ResourceAccessException",method="PUT",outcome="UNKNOWN",status="CLIENT_ERROR",uri="/eureka/apps/<appName>/<instanceId>?status=UP&lastDirtyTimestamp=1720441059188"} 0.0409643
http_client_requests_seconds_max{client_name="localhost",error="none",exception="none",method="GET",outcome="SUCCESS",status="200",uri="/eureka/apps/"} 0.0054987
http_client_requests_seconds_max{client_name="localhost",error="none",exception="none",method="GET",outcome="SUCCESS",status="200",uri="/eureka/apps/delta"} 0.0476062
http_client_requests_seconds_max{client_name="localhost",error="none",exception="none",method="POST",outcome="SUCCESS",status="204",uri="/eureka/apps/<appName>"} 0.0078605
http_client_requests_seconds_max{client_name="localhost",error="none",exception="none",method="PUT",outcome="CLIENT_ERROR",status="404",uri="/eureka/apps/<appName>/<instanceId>?status=UP&lastDirtyTimestamp=1720440338928"} 0.0
http_client_requests_seconds_max{client_name="localhost",error="none",exception="none",method="PUT",outcome="CLIENT_ERROR",status="404",uri="/eureka/apps/<appName>/<instanceId>?status=UP&lastDirtyTimestamp=1720441059188"} 0.0194635
http_client_requests_seconds_max{client_name="localhost",error="none",exception="none",method="PUT",outcome="SUCCESS",status="200",uri="/eureka/apps/<appName>/<instanceId>?status=DOWN&lastDirtyTimestamp=1720440308847"} 0.0
http_client_requests_seconds_max{client_name="localhost",error="none",exception="none",method="PUT",outcome="SUCCESS",status="200",uri="/eureka/apps/<appName>/<instanceId>?status=UP&lastDirtyTimestamp=1720440338928"} 0.0
http_client_requests_seconds_max{client_name="localhost",error="none",exception="none",method="PUT",outcome="SUCCESS",status="200",uri="/eureka/apps/<appName>/<instanceId>?status=UP&lastDirtyTimestamp=1720441059188"} 0.0066927
http_client_requests_seconds_max{client_name="localhost",error="none",exception="none",method="PUT",outcome="SUCCESS",status="200",uri="/eureka/apps/<appName>/<instanceId>?status=UP&lastDirtyTimestamp=1720442649723"} 0.0058572
@OlgaMaciaszek
Copy link
Collaborator

Hello @sunruh, thanks for reporting the issue. Makes sense. I don't think we want to modify Eureka's REST API at this point, but we may want to create our own implementation of ClientRequestObservationConvention that will disregard this query param and set it on the RestTemplate once it's been built.
@jonatan-ivanov wdyt?

@jonatan-ivanov
Copy link
Member

@sunruh
As a quick workaround till this is fixed and released, you can create a MeterFilter or an ObservationFilter @Bean in your application and remove the high cardinality part of the uri. The uri should be templated and low cardinality as you stated above.
I think if you can inject to your own RestTemplate to the Eureka client, then you should also be able to inject your own convention where you can modify this behavior (but I think the simplest thing to do is creating a MeterFilter).

@OlgaMaciaszek
I think creating a ~EurekaClientRequestObservationConvention that extends ClientRequestObservationConvention and normalizes the value of the uri so it will be low cardinality is the proper solution. I think I would also make this convention extendable (non-final) and injectable by the users so that users can define their own convention if they want to.

@OlgaMaciaszek OlgaMaciaszek added this to the 4.1.4 milestone Jul 17, 2024
@OlgaMaciaszek OlgaMaciaszek moved this to In Progress in 2023.0.4 Jul 17, 2024
@OlgaMaciaszek
Copy link
Collaborator

Thanks, @jonatan-ivanov, @sunruh. I've started working on it just to realise that this issue should already be resolved as a side effect of a different fix that I just merged yesterday. It switches from using exchange(String url, ...) to exchange(URI url, ...) in RestTemplateEurekaClient , hence removing the use of uriTemplate and the low cardinality uri key in favour of the high cardinality http.url key. Could you please verify against the latest snapshots?

@OlgaMaciaszek OlgaMaciaszek removed this from 2023.0.4 Jul 17, 2024
@OlgaMaciaszek OlgaMaciaszek removed this from the 4.1.4 milestone Jul 17, 2024
@sunruh
Copy link
Author

sunruh commented Jul 17, 2024

That fixes the issue. Of course all requests now end up with uri=UNKNOWN but it would be impossible to have an uri tag with any RestTemplate method that takes a URI as those are never templates:

# HELP http_client_requests_seconds_max  
# TYPE http_client_requests_seconds_max gauge
http_client_requests_seconds_max{client_name="localhost",error="none",exception="none",method="GET",outcome="SUCCESS",status="200",uri="none"} 0.0037342
http_client_requests_seconds_max{client_name="localhost",error="none",exception="none",method="POST",outcome="SUCCESS",status="204",uri="none"} 0.0
http_client_requests_seconds_max{client_name="localhost",error="none",exception="none",method="PUT",outcome="SUCCESS",status="200",uri="none"} 0.0073595

Both issues (this one and the one regarding special characters in e.g. the instance id) could have been solved by using templating in the RestTemplateEurekaHttpClient, not changing the API but using e.g. in getApplication(String appName), restTemplate.exchange(serviceUrl + "apps/{appName}", HttpMethod.GET, null, Application.class, appName) but I would consider that an improvement.

@OlgaMaciaszek
Copy link
Collaborator

@sunruh would you like to create a PR with the improvement?

@spring-cloud-issues
Copy link

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.

sunruh pushed a commit to sunruh/spring-cloud-netflix that referenced this issue Jul 31, 2024
sunruh pushed a commit to sunruh/spring-cloud-netflix that referenced this issue Jul 31, 2024
@sunruh sunruh linked a pull request Jul 31, 2024 that will close this issue
@spring-cloud-issues
Copy link

Closing due to lack of requested feedback. If you would like us to look at this issue, please provide the requested information and we will re-open the issue.

@OlgaMaciaszek
Copy link
Collaborator

Reopening as enhancement.

@OlgaMaciaszek OlgaMaciaszek reopened this Aug 1, 2024
@OlgaMaciaszek OlgaMaciaszek removed their assignment Aug 1, 2024
@sunruh
Copy link
Author

sunruh commented Aug 13, 2024

Was my PR maybe missed as the status is now "help wanted"?

@OlgaMaciaszek
Copy link
Collaborator

Thanks @sunruh, I think the status was changed before your PR or roughly at the same time. Will look at it next week.

@Chessray
Copy link

Chessray commented Aug 16, 2024

Hi all, I stumbled on this issue just now; and I'm sorry if it is already resolved (quite frankly, it's a late Friday night for me, and I'm not fully able to follow the conversation anymore). In any case, while searching for solutions I stumbled on this article which presents a way of using WebClient that will still create a tag with the uriTemplate, but respect the parameters as such: https://medium.com/@nickjarzembowski_15160/controlling-webclient-metrics-cardinality-b96669dcc88d

Just thought it might be helpful in this situation. If this issue is already been taken care of or obsolete, please disregard this comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants