Deployment fails when etcd servers are not members of kube_control_plane #11682
Comments
Is that reproducible with a setup like this?
[kube_control_plane]
node-1
[etcd]
node-1
node-2
node-3
[kube_node]
node-1
node-2
node-3
node-4
(This is the
I'll test it. But I think it will work because
It worked on the first try:
Hum, it looks like the conditions are:
- separate etcd / master
- nodes are etcd clients (e.g. Calico using the etcd datastore)
- maybe node != control plane? Not sure about this one.
It'd be helpful if you can test that; otherwise I'll start a PR with that as a new test case when I can.
Something like this?
[all]
k8s-test1 ansible_host=192.168.0.31
k8s-test2 ansible_host=192.168.0.32 etcd_member_name=etcd1
k8s-test3 ansible_host=192.168.0.33 etcd_member_name=etcd2
k8s-test4 ansible_host=192.168.0.34 etcd_member_name=etcd3
[kube_control_plane]
k8s-test1
[etcd]
k8s-test2
k8s-test3
k8s-test4
[kube_node]
k8s-test2
k8s-test3
k8s-test4
[calico_rr]
[k8s_cluster:children]
kube_control_plane
kube_node
calico_rr
I was thinking more of something like this:
```ini
[kube_control_plane]
host1
[etcd]
host2
[kube_node]
host3
[all:vars]
network_plugin=calico
```
(If HA is not required to trigger the bug, this makes the test less expensive in CI time)
(Btw, explicit k8s_cluster is no longer required, it's dynamically defined as the union of control-plane and node.)
OK, I'll try it.
I tried, but I think there is an issue if
I'll try to restore
I think you got it -> it fails:
.. w/ this inventory:
[all]
k8s-test1 ansible_host=192.168.0.31
k8s-test2 ansible_host=192.168.0.32
k8s-test3 ansible_host=192.168.0.33
[kube_control_plane]
k8s-test1
[etcd]
k8s-test2
[kube_node]
k8s-test3
[all:vars]
network_plugin=calico
[calico_rr]
[k8s_cluster:children]
kube_control_plane
kube_node
calico_rr
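For anyone wanting to reproduce this locally, the failing run is just a normal cluster.yml deploy against that inventory. A minimal sketch, assuming the inventory above is saved as inventory/repro/inventory.ini (a hypothetical path):

```sh
# same invocation style as the reporter uses below, pointed at the 3-node repro inventory
ansible-playbook -f 10 -i inventory/repro/inventory.ini --become --become-user=root cluster.yml
```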
Great, thanks for testing that! We'll need to add that to the CI in the PR that fixes this, so it does not regress again.
Hum, this is a bit weird, I apparently can't reproduce this on master (and I don't see what could have fixed this 🤔)
Do you still have the issue if you use that inventory with latest master?
(Or latest release-2.26, for that matter.) I can't reproduce it either at the top of the branch 😞
/triage not-reproducible
(At least I can't for now)
What happened?
The task `Gen_certs | Gather node certs` fails with this message: neither in `k8s-worker1` nor in `k8s-etcd1` do the files `node-k8s-worker1.pem` and `node-k8s-worker1-key.pem` exist.
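To confirm the symptom manually, you can check for those certs on the etcd host before the failing task runs. A minimal sketch; the cert directory `/etc/ssl/etcd/ssl/` is the one used by the workaround at the end of this report, adjust hosts and paths to your setup:

```sh
# on k8s-etcd1 (and similarly on k8s-worker1): the node certs the task tries to gather
ls -l /etc/ssl/etcd/ssl/node-k8s-worker1.pem /etc/ssl/etcd/ssl/node-k8s-worker1-key.pem
# on an affected cluster both paths come back "No such file or directory"
```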
What did you expect to happen?
In `k8s-etcd1`, the files `node-k8s-worker1.pem` and `node-k8s-worker1-key.pem` must exist.

How can we reproduce it (as minimally and precisely as possible)?
With 3 dedicated etcd servers.
Deploy with this command:
OS
Linux 6.1.0-26-amd64 x86_64
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
Version of Ansible
ansible [core 2.16.12]
config file = /home/me/kubespray/ansible.cfg
configured module search path = ['/home/me/kubespray/library']
ansible python module location = /home/me/ansible-kubespray/lib/python3.11/site-packages/ansible
ansible collection location = /home/me/.ansible/collections:/usr/share/ansible/collections
executable location = /home/me/ansible-kubespray/bin/ansible
python version = 3.11.2 (main, Aug 26 2024, 07:20:54) [GCC 12.2.0] (/home/me/ansible-kubespray/bin/python3)
jinja version = 3.1.4
libyaml = True
Version of Python
Python 3.11.2
Version of Kubespray (commit)
e5bdb3b
Network plugin used
cilium
Full inventory with variables
Command used to invoke ansible
ansible-playbook -f 10 -i inventory/homecluster/inventory.ini --become --become-user=root cluster.yml -e 'unsafe_show_logs=True'
Output of ansible run
Anything else we need to know
I fixed this issue like this.
On `k8s-etcd1`, generate the missing node certs manually:
# on k8s-etcd1
HOSTS=k8s-worker1 /usr/local/bin/etcd-scripts/make-ssl-etcd.sh -f /etc/ssl/etcd/openssl.conf -d /etc/ssl/etcd/ssl/
HOSTS=k8s-worker2 /usr/local/bin/etcd-scripts/make-ssl-etcd.sh -f /etc/ssl/etcd/openssl.conf -d /etc/ssl/etcd/ssl/
HOSTS=k8s-worker3 /usr/local/bin/etcd-scripts/make-ssl-etcd.sh -f /etc/ssl/etcd/openssl.conf -d /etc/ssl/etcd/ssl/
Then re-run the playbook with `--tags=etcd`:
ansible-playbook -f 10 -i inventory/homecluster/inventory.ini --become --become-user=root cluster.yml -e 'unsafe_show_logs=True' --tags=etcd
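Since the three manual invocations above only differ in the `HOSTS` value, they can also be run as a small loop (a sketch assuming the same worker names and default cert paths as above):

```sh
# on k8s-etcd1: regenerate the node cert/key pair for each worker missing from the etcd host
for h in k8s-worker1 k8s-worker2 k8s-worker3; do
  HOSTS="$h" /usr/local/bin/etcd-scripts/make-ssl-etcd.sh -f /etc/ssl/etcd/openssl.conf -d /etc/ssl/etcd/ssl/
done
```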