
nodes do not get deleted #6

Open · JorritSalverda opened this issue Nov 20, 2017 · 12 comments

@JorritSalverda (Collaborator)

In one of our Kubernetes Engine clusters, nodes that should be deleted do not get removed properly. They're already disabled for scheduling and their pods are evicted, but then the following error is logged when the controller tries to delete the VM:

{
	"time":"2017-11-20T09:23:10Z",
	"severity":"error",
	"app":"estafette-gke-preemptible-killer",
	"version":"1.0.29",
	"error":"Delete https://www.googleapis.com/compute/v1/projects/***/zones/europe-west1-c/instances/gke-development-euro-auto-scaling-pre-33198d65-gq2m?alt=json: dial tcp: i/o timeout",
	"host":"gke-development-euro-auto-scaling-pre-33198d65-gq2m",
	"message":"Error while processing node"
}
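For context, the failing call is (roughly) the standard GCE instance delete. A minimal sketch, assuming the google.golang.org/api/compute/v1 client; project, zone, and instance names are placeholders, and the controller's actual wiring may differ:

package main

import (
	"context"
	"log"
	"time"

	compute "google.golang.org/api/compute/v1"
)

// deleteInstance issues the same kind of GCE delete call the controller
// makes when it removes a drained node.
func deleteInstance(project, zone, name string) error {
	// Bound the call so network problems surface as a timeout instead of hanging.
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	svc, err := compute.NewService(ctx)
	if err != nil {
		return err
	}
	// The "dial tcp: i/o timeout" in the log above is returned by this call.
	_, err = svc.Instances.Delete(project, zone, name).Context(ctx).Do()
	return err
}

func main() {
	if err := deleteInstance("my-project", "europe-west1-c", "gke-node-to-delete"); err != nil {
		log.Printf("Error while processing node: %v", err)
	}
}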
@etiennetremel (Contributor)

Can this be a timeout on the GCloud side? I wasn't able to see any outage during that period, but if this node doesn't get processed, it should be picked up on the next loop. If the error still persists, maybe there is more information in the logs right before this happens.

@ksuther commented Oct 21, 2018

I'm seeing something similar happen. My guess is this might be happening because kube-dns is being killed before the GCloud client is used, so it fails to resolve the host name when authenticating.

 jsonPayload: {
  app: "estafette-gke-preemptible-killer"
  error: "Delete https://www.googleapis.com/compute/v1/projects/path/to/instance?alt=json: oauth2: cannot fetch token: Post https://oauth2.googleapis.com/token: dial tcp: lookup oauth2.googleapis.com on 10.114.0.10:53: dial udp 10.114.0.10:53: connect: network is unreachable"
  host: "test-pool-cb8bed09-17s6"
  message: "Error deleting GCloud instance"
  version: "1.0.35"
 }
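If that guess is right, one possible mitigation (just a sketch of my own, not something the controller does today) would be to force an OAuth token fetch while kube-dns is still reachable; the oauth2 token source caches the token, so it can be reused after the node's DNS goes away:

package main

import (
	"context"
	"log"

	"golang.org/x/oauth2/google"
	compute "google.golang.org/api/compute/v1"
)

func main() {
	ctx := context.Background()

	// Resolve oauth2.googleapis.com and fetch a token while kube-dns is
	// still answering; the token source caches it until it expires.
	creds, err := google.FindDefaultCredentials(ctx, compute.ComputeScope)
	if err != nil {
		log.Fatal(err)
	}
	if _, err := creds.TokenSource.Token(); err != nil {
		log.Fatal(err)
	}
	// ... now drain kube-dns, then call the Compute API with creds.TokenSource ...
}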

@JorritSalverda (Collaborator, Author)

Although kube-dns, if present on the node, is actively deleted by https://github.com/estafette/estafette-gke-preemptible-killer/blob/master/main.go#L296, kube-dns runs in HA, so this shouldn't be an issue.
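
Roughly, that drain amounts to evicting the kube-dns pods scheduled on the node. A simplified sketch using client-go (not the controller's exact code; it assumes GKE's k8s-app=kube-dns label):

package drain

import (
	"context"
	"fmt"

	policyv1 "k8s.io/api/policy/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// drainKubeDNS evicts the kube-dns pods running on the given node; the
// remaining HA replicas on other nodes keep serving DNS in the meantime.
func drainKubeDNS(ctx context.Context, client kubernetes.Interface, node string) error {
	pods, err := client.CoreV1().Pods("kube-system").List(ctx, metav1.ListOptions{
		LabelSelector: "k8s-app=kube-dns",
		FieldSelector: fmt.Sprintf("spec.nodeName=%s", node),
	})
	if err != nil {
		return err
	}
	for _, pod := range pods.Items {
		eviction := &policyv1.Eviction{
			ObjectMeta: metav1.ObjectMeta{Name: pod.Name, Namespace: pod.Namespace},
		}
		if err := client.PolicyV1().Evictions(pod.Namespace).Evict(ctx, eviction); err != nil {
			return err
		}
	}
	return nil
}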

However, it turns out that Kubernetes Engine, although built to be resilient, isn't very resilient in the face of preemptions. The master doesn't update services with pods on a preempted node fast enough to stop sending traffic there. We've seen this through frequent kube-dns issues correlating with real preemptions by Google, not the ones issued by our preemptible-killer.

@jstephens7 commented Dec 4, 2018

@JorritSalverda We're intermittently getting DNS errors on our GKE preemptibles (with the preemptible-killer) when services in the cluster try to resolve other services in the same cluster.
EDIT: It should be noted that we're only having these intermittent connection issues on our preemptibles; the other nodes have no issues.
I'm asking out of ignorance:
What is the purpose of removing kube-dns from the node?
Would leaving kube-dns on the node remove the DNS issues?
And could you clarify your last statement: "We've seen this through frequent kube-dns issues correlating with real preemptions by Google, not the ones issued by our preemptible-killer."

@JorritSalverda (Collaborator, Author)

@jstephens7 we've seen the same and have actually moved away from preemptibles for the time being. It's unrelated to this controller, but happens when a node really gets preempted by Google before this controller would do it instead. GKE doesn't handle preemption gracefully; it just kills the node at once. This leaves the Kubernetes master in the dark for a while, until it discovers that the node is no longer available. In the meantime the iptables rules don't get updated and traffic still gets routed to the unavailable node. I would expect this scenario to be handled better, since you want Kubernetes to be resilient in the face of real node malfunction.

For AWS there's actually a notifier that warns you when a spot instance is going down, but GCP doesn't currently have such a thing. See https://learnk8s.io/blog/kubernetes-spot-instances for more info.

@theallseingeye

Seems like this could be a good solution: https://github.com/GoogleCloudPlatform/k8s-node-termination-handler
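
For reference, that handler reacts to the GCE metadata server flipping instance/preempted to TRUE. A minimal sketch of that detection (my own illustration, not the handler's actual code):

package main

import (
	"io"
	"log"
	"net/http"
)

func main() {
	// The GCE metadata server exposes instance/preempted, which flips to
	// TRUE when the VM is being preempted; wait_for_change=true turns the
	// request into a hanging GET that returns at that moment.
	req, err := http.NewRequest("GET",
		"http://metadata.google.internal/computeMetadata/v1/instance/preempted?wait_for_change=true", nil)
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Metadata-Flavor", "Google")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	if string(body) == "TRUE" {
		log.Println("node is being preempted, start graceful pod shutdown")
	}
}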

@tmirks commented Jul 20, 2019

@JorritSalverda have you completely given up on preemptibles in production (because of this issue)? Just exploring the idea, so I'd love to hear your feedback.

And would @theallseingeye's suggestion mitigate this?

@santinoncs commented Oct 16, 2020

When deleting a node, I am experiencing this error:

INF Done draining kube-dns from node host=gke-xxxxx
ERR Error deleting GCloud instance error="Delete \"https://www.googleapis.com/compute/v1/projects/yyyyyy/zones/europe-west1-b/instances/gke-xxxxxx?alt=json\": oauth2: cannot fetch token: Post \"https://oauth2.googleapis.com/token\": x509: certificate signed by unknown authority" host=gke-xxxxx
ERR Error while processing node error="Delete \"https://www.googleapis.com/compute/v1/projects/yyyyyy/zones/europe-west1-b/instances/gke-xxxxxx?alt=json\": oauth2: cannot fetch token: Post \"https://oauth2.googleapis.com/token\": x509: certificate signed by unknown authority" host=gke-xxxx

I would say that my service account JSON is correctly uploaded to the pod, and the account has the proper permissions, so I don't know what is happening.

@JorritSalverda (Collaborator, Author)

Hi @santinoncs, do you use the Helm chart? And what version? We run it with a service account that has the compute.instanceAdmin.v1 role on the project the GKE cluster is in. That seems to work fine.

@JorritSalverda (Collaborator, Author)

Hi @tmirks, we did abandon preemptibles for a while, since the pressure on europe-west1 mounted and preemptions became more commonplace. The fact that GKE wasn't aware of preemptions caused a lot of trouble, with kube-dns requests getting sent to no-longer-existing pods. Now we're testing the k8s-node-termination-handler - see the Helm chart at https://github.com/estafette/k8s-node-termination-handler - together with this application, to ensure GKE is aware of preemptions and that preemptions are less likely to happen all at once. Spreading preemptible nodes across zones should also help reduce the chance of mass preemptions.

@santinoncs

Hi @santinoncs, do you use the Helm chart? And what version? We run it with a service account that has the compute.instanceAdmin.v1 role on the project the GKE cluster is in. That seems to work fine.

It's already working now that I copied the ca-certificates file into the container.
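
For anyone hitting the same x509 error: it typically means the container image ships without CA root certificates, so Go's TLS client can't verify googleapis.com. A hypothetical Dockerfile fix, assuming an Alpine-based image (or a scratch image built in a multi-stage build):

# Alpine-based image: install the CA bundle
RUN apk add --no-cache ca-certificates

# scratch/distroless-style image: copy the bundle from the build stage
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/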

@vikstrous2

Just FYI, GKE now handles node preemption gracefully, giving pods about 25 seconds to shut down.
