Cortex 1.13.0 Ingester readiness probe failed: context deadline exceeded (Client.Timeout exceeded while awaiting headers) #4842

Closed
vk0125 opened this issue Aug 24, 2022 · 1 comment

vk0125 commented Aug 24, 2022

Describe the bug
Cortex 1.13.0 Ingester readiness probe failed: context deadline exceeded (Client.Timeout exceeded while awaiting headers)

To Reproduce
Steps to reproduce the behavior:

  1. Use cortex-helm-chart
  2. Disable all the components except Nginx, Distributor, and Ingester
  3. Give minimal resources/limits for K8s pods
  4. Deploy the Ingester as either a Deployment or a StatefulSet

Expected behavior

  • The Ingester should register itself with the memberlist ring.
  • The Ingester should write data to long-term storage, which in my case is an Azure Blob Storage account.

Environment:

  • Infrastructure: Kubernetes [1.23.5] AKS
  • Deployment tool: helm
  • Source: cortex-helm-chart

Additional Context

Config Passed:

alertmanager:
  enabled: false
ruler:
  enabled: false
query_scheduler:
  enabled: false
querier:
  enabled: false
query_frontend:
  enabled: false
overrides_exporter:
  enabled: false
store_gateway:
  enabled: false
compactor:
  enabled: false

config:
  auth_enabled: false
  storage:
    engine: blocks
  blocks_storage:
    backend: azure
    azure:
      account_name: "****"
      account_key: "****"
      container_name: "cortex"
      endpoint_suffix: "core.windows.net"
    tsdb:
      dir: /tmp/cortex/tsdb
    bucket_store:
      sync_dir: /tmp/cortex/tsdb-sync
  server:
    http_listen_port: 8080
    grpc_server_max_recv_msg_size: 104857600
    grpc_server_max_send_msg_size: 104857600
    grpc_server_max_concurrent_streams: 1000
  ingester:
    lifecycler:
      join_after: 0s
      min_ready_duration: 0s
      final_sleep: 0s
  ingester_client:
    grpc_client_config:
      max_recv_msg_size: 104857600
      max_send_msg_size: 104857600
      grpc_compression: gzip
  distributor:
    ring:
      kvstore:
        store: memberlist
nginx:
  service:
    type: LoadBalancer
    annotations:
      service.beta.kubernetes.io/azure-load-balancer-internal: "true"
  autoscaling:
    enabled: true
    minReplicas: 1
    minreplicas: 2
  resources:
    limits:
      memory: 512M
      cpu: 0.5
    requests:
      memory: 256M
      cpu: 0.2

ingester:
  statefulSet:
    enabled: true
  resources:
    requests:
      memory: 2G
      cpu: 1
    limits:
      memory: 2G
      cpu: 1
      
distributor:
  resources:
    limits:
      memory: 512M
      cpu: 0.5
    requests:
      memory: 256M
      cpu: 0.2

Distributor Logs:

cortexuser@cortexvm:~/cortex-helm-chart-master$ kubectl logs cortex-distributor-7d64cc7447-g2zpz   -n cortex
level=info caller=main.go:193 msg="Starting Cortex" version="(version=1.13.0, branch=HEAD, revision=69139ac)"
level=info caller=server.go:260 http=[::]:8080 grpc=[::]:9095 msg="server listening on addresses"
level=info caller=memberlist_client.go:395 msg="Using memberlist cluster node name" name=cortex-distributor-7d64c
level=info caller=module_service.go:64 msg=initialising module=server
level=info caller=module_service.go:64 msg=initialising module=memberlist-kv
level=info caller=module_service.go:64 msg=initialising module=runtime-config
level=info caller=module_service.go:64 msg=initialising module=ring
level=info caller=ring.go:269 msg="ring doesn't exist in KV store yet"
level=info caller=module_service.go:64 msg=initialising module=distributor-service
level=info caller=cortex.go:436 msg="Cortex started"
level=info caller=memberlist_client.go:523 msg="joined memberlist cluster" reached_nodes=1
caller=memberlist_logger.go:74 level=warn msg="Failed to resolve cortex-memberlist: lookup cortex-memberlist on 10.251.0.10:53: no such host"

Ingester Logs:

level=info caller=main.go:193 msg="Starting Cortex" version="(version=1.13.0, branch=HEAD, revision=69139ac)"
level=info caller=server.go:260 http=[::]:8080 grpc=[::]:8085 msg="server listening on addresses"

Ingester Describe pod details:

  Type     Reason     Age                      From               Message
  ----     ------     ----                     ----               -------
  Normal   Scheduled  27m                      default-scheduler  Successfully assigned cortex/cortex-ingester-0
  Warning  BackOff    26m (x10 over 27m)       kubelet            Back-off restarting failed container
  Normal   Pulled     26m (x5 over 27m)        kubelet            Container image "quay.io/cortexproject/cortex:v1.13.0" already present on machine
  Normal   Created    26m (x5 over 27m)        kubelet            Created container ingester
  Normal   Started    26m (x5 over 27m)        kubelet            Started container ingester
  Warning  Unhealthy  2m40s (x151 over 24m)    kubelet            Readiness probe failed: Get "http://10.251.128.25:8080/ready": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

I tried some debugging:

cortexuser@cortexvm:~/cortex-helm-chart-master$ curl https://10.251.128.25:9009/ready
curl: (7) Failed to connect to 10.251.128.25 port 8080: No route to host

With the default config in place, I cannot get the Ingester running. Despite various config changes, it never passes the readiness probe. I temporarily removed the readiness probe from the Ingester, but that only keeps the pod running; the Ingester still cannot register itself with the ring.

I encountered this issue during the Cortex installation phase.

See #391 for more logs and details.

vk0125 commented Aug 24, 2022

With debug mode on, I found that the Ingester component could not connect to Azure Blob Storage.
The cause was the endpoint_suffix value under the blocks_storage config.
What I passed:
endpoint_suffix: "core.windows.net"
What is expected:
endpoint_suffix: "blob.core.windows.net"

With the correct value, the Ingester completed its connection to Azure and passed the readiness probe.
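
For anyone hitting the same symptom, here is a minimal sketch of the relevant part of the Helm values with the fix applied. It reuses the layout and masked values from the config posted above; the `log_level: debug` entry is an assumption about how debug mode was turned on and is not taken from the original config.

```yaml
config:
  server:
    # Assumption: "debug mode" means raising the Cortex server log level.
    log_level: debug
  blocks_storage:
    backend: azure
    azure:
      account_name: "****"
      account_key: "****"
      container_name: "cortex"
      # The account name is prefixed to this suffix to form the storage
      # host name, so the suffix has to include the "blob." service part.
      endpoint_suffix: "blob.core.windows.net"
```

With the shorter suffix, the resulting host name does not point at the Blob service, which would be consistent with the Ingester hanging at startup and never answering /ready.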

vk0125 closed this as completed Aug 24, 2022