Health check failing on default installation #2

Open
IronhandedLayman opened this issue Apr 4, 2017 · 3 comments

IronhandedLayman commented Apr 4, 2017

After following the given documentation I am seeing the following issue:

Name:		keepalived-cloud-provider-1765620686-d9nkt
Namespace:	kube-system
Node:		<elided/>
Start Time:	Tue, 04 Apr 2017 16:25:17 -0400
Labels:		app=keepalived-cloud-provider
		pod-template-hash=1765620686
Annotations:	kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"kube-system","name":"keepalived-cloud-provider-1765620686","uid":"ef9618e5-1973-1...
		scheduler.alpha.kubernetes.io/critical-pod=
		scheduler.alpha.kubernetes.io/tolerations=[{"key":"CriticalAddonsOnly", "operator":"Exists"}]
Status:		Running
IP:		192.168.241.247
Controllers:	ReplicaSet/keepalived-cloud-provider-1765620686
Containers:
  keepalived-cloud-provider:
    Container ID:	docker://e3240e5d78a382155c902ee6b5cca8294b3be393959c5c3cfa1eba4d303bc66c
    Image:		quay.io/munnerz/keepalived-cloud-provider
    Image ID:		docker-pullable://quay.io/munnerz/keepalived-cloud-provider@sha256:170351533b23126b8f4eeeeb4293ec417607b762b7ae07d5c018a9cb792d1032
    Port:		
    State:		Waiting
      Reason:		CrashLoopBackOff
    Last State:		Terminated
      Reason:		Error
      Exit Code:	2
      Started:		Mon, 01 Jan 0001 00:00:00 +0000
      Finished:		Tue, 04 Apr 2017 16:29:08 -0400
    Ready:		False
    Restart Count:	5
    Requests:
      cpu:	200m
    Liveness:	http-get http://127.0.0.1:10252/healthz delay=15s timeout=15s period=10s #success=1 #failure=8
    Environment:
      KEEPALIVED_NAMESPACE:	kube-system
      KEEPALIVED_CONFIG_MAP:	vip-configmap
      KEEPALIVED_SERVICE_CIDR:	<elided/>
    Mounts:
      /etc/ssl/certs from certs (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-jl9jx (ro)
Conditions:
  Type		Status
  Initialized 	True 
  Ready 	False 
  PodScheduled 	True 
Volumes:
  certs:
    Type:	HostPath (bare host directory volume)
    Path:	/etc/ssl/certs
  default-token-jl9jx:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	default-token-jl9jx
    Optional:	false
QoS Class:	Burstable
Node-Selectors:	<none>
Tolerations:	node.alpha.kubernetes.io/notReady=:Exists:NoExecute for 300s
		node.alpha.kubernetes.io/unreachable=:Exists:NoExecute for 300s
Events:
  FirstSeen	LastSeen	Count	From			SubObjectPath					Type		Reason		Message
  ---------	--------	-----	----			-------------					--------	------		-------
<snipped/>
  41s	41s	1	kubelet, <elided/>	spec.containers{keepalived-cloud-provider}	Normal	Killing		Killing container with id docker://e3240e5d78a382155c902ee6b5cca8294b3be393959c5c3cfa1eba4d303bc66c:pod "keepalived-cloud-provider-1765620686-d9nkt_kube-system(d16932c2-1974-11e7-a2c7-0025905ca872)" container "keepalived-cloud-provider" is unhealthy, it will be killed and re-created.
  1m	1s	9	kubelet, <elided/>		spec.containers{keepalived-cloud-provider}	Warning	BackOff		Back-off restarting failed container
  41s	1s	5	kubelet, <elided/>							Warning	FailedSync	Error syncing pod, skipping: failed to "StartContainer" for "keepalived-cloud-provider" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=keepalived-cloud-provider pod=keepalived-cloud-provider-1765620686-d9nkt_kube-system(d16932c2-1974-11e7-a2c7-0025905ca872)"

It appears to be failing its health check. Removing the health check stops the CrashLoopBackOff, but obviously that's not optimal. I can't find the health check on a first pass through the code, so I'm not entirely sure what's wrong.
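
For anyone else hitting this, a rough sketch of the workaround (it assumes the Deployment is named keepalived-cloud-provider in kube-system, which is what the ReplicaSet name above suggests, and that it has a single container; editing the manifest and re-applying works just as well):

# Remove the liveness probe so the pod stops being killed and restarted.
# This is only a stopgap: the container is no longer health-checked at all.
kubectl -n kube-system patch deployment keepalived-cloud-provider --type=json \
  -p='[{"op": "remove", "path": "/spec/template/spec/containers/0/livenessProbe"}]'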

antoineserrano commented Aug 4, 2017

Hi @IronhandedLayman, we're running into nearly the same issue here. Did you manage to find a workaround and/or make it work properly?

munnerz (Owner) commented Aug 4, 2017 via email

@tommyknows

Same problem for me.
I started a shell inside the container, and one thing I noticed is that the process is listening on port 10253 according to netstat:

/ # netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 :::10253                :::*                    LISTEN      1/keepalived-cloud-
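
Since the process listens on 10253 rather than 10252, the obvious next step is to point the probe at that port. For reference, roughly how that change can be applied (a sketch; kubectl patch against the keepalived-cloud-provider Deployment is just one way, editing the manifest directly works too):

# Point the liveness probe at the port the binary actually listens on.
kubectl -n kube-system patch deployment keepalived-cloud-provider --type=json \
  -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/livenessProbe/httpGet/port", "value": 10253}]'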

However, even after changing the livenessProbe's port to 10253, it's still failing.
When testing inside the container:

/ # wget -q http://127.0.0.1:10253/healthz
/ # cat healthz 
ok/ # 

So the livenessProbe should be working... (but somehow it doesn't).
Events from kubectl describe pod:

Events:
  Type     Reason                 Age                From                  Message
  ----     ------                 ----               ----                  -------
  Normal   Scheduled              3m                 default-scheduler     Successfully assigned keepalived-cloud-provider-664464fc97-bvpbj to k8s-fed-wk1
  Normal   SuccessfulMountVolume  3m                 kubelet, k8s-fed-wk1  MountVolume.SetUp succeeded for volume "certs"
  Normal   SuccessfulMountVolume  3m                 kubelet, k8s-fed-wk1  MountVolume.SetUp succeeded for volume "keepalived-token-dl7qw"
  Normal   Pulled                 2m (x2 over 3m)    kubelet, k8s-fed-wk1  Container image "quay.io/munnerz/keepalived-cloud-provider:0.0.1" already present on machine
  Normal   Created                2m (x2 over 3m)    kubelet, k8s-fed-wk1  Created container
  Normal   Started                2m (x2 over 3m)    kubelet, k8s-fed-wk1  Started container
  Warning  Unhealthy              53s (x15 over 3m)  kubelet, k8s-fed-wk1  Liveness probe failed: Get http://127.0.0.1:10253/healthz: dial tcp 127.0.0.1:10253: getsockopt: connection refused
  Normal   Killing                53s (x2 over 2m)   kubelet, k8s-fed-wk1  Killing container with id docker://keepalived-cloud-provider:Container failed liveness probe.. Container will be killed and recreated.

Note: It is the exact same error as I get with port 10252...
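
One more thing worth checking: the probe URL in the describe output has an explicit host of 127.0.0.1, and if I understand probe semantics correctly the kubelet resolves that from the node rather than from inside the pod (unless the pod runs with hostNetwork), so an in-container wget succeeding doesn't necessarily prove the probe will. A rough sketch of how to double-check, using the pod and node names from the events above:

# Show the probe the kubelet is actually configured with.
kubectl -n kube-system get pod keepalived-cloud-provider-664464fc97-bvpbj \
  -o jsonpath='{.spec.containers[0].livenessProbe.httpGet}'

# Repeat the probe from the node's point of view (run this on k8s-fed-wk1, not inside the pod).
curl -v http://127.0.0.1:10253/healthz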
