Intranet IP outside of k3s unavailable for the 1st second of pod life #11489

Closed

morooshka opened this issue Dec 20, 2024 · 2 comments

@morooshka
Environmental Info:
K3s Version:
k3s version v1.31.1+k3s1 (452dbbc)
go version go1.22.6

Node(s) CPU architecture, OS, and Version:
Linux consolidated-ch-3 5.15.0-125-generic #135-Ubuntu SMP Fri Sep 27 13:53:58 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration:
3 masters in HA mode, 3 agents

Describe the bug:
The pod launches a script that connects to a DB (on the intranet, outside the k3s cluster) and runs a script there. It tries to connect to the DB within the very first second of the pod's life and fails with an error:

[2024-12-19, 17:29:26 UTC] {pod_manager.py:490} INFO - [base] clickhouse_driver.errors.NetworkError: Code: 210. No route to host (10.15.21.167:9000)

After debugging I discovered that the IP is unreachable only during the first second of the pod's life; after that it becomes accessible. I have written a demo that reproduces the issue on my cluster:

apiVersion: batch/v1
kind: Job
metadata:
  name: if-ip-accessible-job
  namespace: demo
spec:
  template:
    spec:
      containers:
      - name: master
        image: "docker.io/busybox:1.36"
        env:
          - name: IP_ADDRESS
            value: "10.15.21.167"
          - name: ATTEMPTS
            value: "3"
        command:
          - /bin/sh
          - "-c"
          - |
            is_failed=0
            cnt=0

            echo "If ${IP_ADDRESS} accessible ..."

            # Try several times, but ultimately fail if any attempt failed
            while [ ${cnt} -le ${ATTEMPTS} ]
            do
              echo "$( date -u '+%FT%T' ) Attempt #${cnt} ..."
              ping -c 1 ${IP_ADDRESS}

              if [ $? -ne 0 ]; then
                echo "FAILED attempt!"
                is_failed=1
              else
                echo "SUCCESS attempt!"
                break
              fi

              cnt=$(( cnt + 1 ))
            done

            if [ ${is_failed} -ne 0 ]; then
              echo "FAILED!"
              exit 1
            fi

            echo "SUCCESS!"
            exit 0
      restartPolicy: Never
  backoffLimit: 5

It produces a log like this:

If 10.15.21.167 accessible ...
2024-12-20T13:35:09 Attempt #0 ...
PING 10.15.21.167 (10.15.21.167): 56 data bytes

--- 10.15.21.167 ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss
FAILED attempt!
2024-12-20T13:35:19 Attempt #1 ...
PING 10.15.21.167 (10.15.21.167): 56 data bytes
64 bytes from 10.15.21.167: seq=0 ttl=63 time=0.273 ms

--- 10.15.21.167 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.273/0.273/0.273 ms
SUCCESS attempt!
FAILED!

Unfortunately I am unable to get any relevant info from the logs on the worker node with sudo journalctl -u k3s-agent -n 100 -f:

...
Dec 20 13:35:09 consolidated-ch-3 k3s[2841]: I1220 13:35:09.029542    2841 reconciler_common.go:245] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-s2xqq\" (UniqueName: \"kubernetes.io/projected/7761d314-8130-44de-8456-5036d4b60b0a-kube-api-access-s2xqq\") pod \"if-ip-accessible-job-fc9qx\" (UID: \"7761d314-8130-44de-8456-5036d4b60b0a\") " pod="demo/if-ip-accessible-job-fc9qx"
Dec 20 13:35:09 consolidated-ch-3 k3s[2841]: I1220 13:35:09.933052    2841 pod_startup_latency_tracker.go:104] "Observed pod startup duration" pod="demo/if-ip-accessible-job-fc9qx" podStartSLOduration=1.93301954 podStartE2EDuration="1.93301954s" podCreationTimestamp="2024-12-20 13:35:08 +0000 UTC" firstStartedPulling="0001-01-01 00:00:00 +0000 UTC" lastFinishedPulling="0001-01-01 00:00:00 +0000 UTC" observedRunningTime="2024-12-20 13:35:09.928411281 +0000 UTC m=+2744912.682909355" watchObservedRunningTime="2024-12-20 13:35:09.93301954 +0000 UTC m=+2744912.687517594"
Dec 20 13:35:21 consolidated-ch-3 k3s[2841]: I1220 13:35:21.260561    2841 reconciler_common.go:159] "operationExecutor.UnmountVolume started for volume \"kube-api-access-s2xqq\" (UniqueName: \"kubernetes.io/projected/7761d314-8130-44de-8456-5036d4b60b0a-kube-api-access-s2xqq\") pod \"7761d314-8130-44de-8456-5036d4b60b0a\" (UID: \"7761d314-8130-44de-8456-5036d4b60b0a\") "
Dec 20 13:35:21 consolidated-ch-3 k3s[2841]: I1220 13:35:21.263130    2841 operation_generator.go:803] UnmountVolume.TearDown succeeded for volume "kubernetes.io/projected/7761d314-8130-44de-8456-5036d4b60b0a-kube-api-access-s2xqq" (OuterVolumeSpecName: "kube-api-access-s2xqq") pod "7761d314-8130-44de-8456-5036d4b60b0a" (UID: "7761d314-8130-44de-8456-5036d4b60b0a"). InnerVolumeSpecName "kube-api-access-s2xqq". PluginName "kubernetes.io/projected", VolumeGidValue ""
Dec 20 13:35:21 consolidated-ch-3 k3s[2841]: I1220 13:35:21.361700    2841 reconciler_common.go:288] "Volume detached for volume \"kube-api-access-s2xqq\" (UniqueName: \"kubernetes.io/projected/7761d314-8130-44de-8456-5036d4b60b0a-kube-api-access-s2xqq\") on node \"consolidated-ch-3\" DevicePath \"\""
...

Steps To Reproduce:
Execute the example job above.
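
For example (the manifest file name here is an assumption; the namespace and job name come from the manifest above):

kubectl apply -f if-ip-accessible-job.yaml
kubectl -n demo logs -f job/if-ip-accessible-job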

The cluster is installed with the standard k3s install script via an Ansible playbook. On the initial master the k3s_token is generated and passed as a parameter to the other masters and agents. The flannel_iface is set on each node because each node has two interfaces, one for the intranet and one for external IPs; we use the internal interfaces for the whole setup.

    - name: Install high availability initial master.
      when: master and initial == ansible_hostname
      ansible.builtin.command:
        cmd: /usr/local/bin/k3s-install.sh
      environment:
        K3S_NODE_NAME: "{{ ansible_hostname }}"
        INSTALL_K3S_VERSION: "{{ k3s_version }}"
        INSTALL_K3S_EXEC: >
          server 
          --cluster-init 
          --node-ip={{ ansible_host }} 
          --flannel-backend=vxlan 
          --flannel-iface={{ flannel_iface }} 
          --disable metrics-server

    - name: Install additional master.
      when: master and initial != ansible_hostname
      ansible.builtin.command:
        cmd: /usr/local/bin/k3s-install.sh
      environment:
        K3S_NODE_NAME: "{{ ansible_hostname }}"
        K3S_TOKEN: "{{ k3s_token }}"
        INSTALL_K3S_VERSION: "{{ k3s_version }}"
        INSTALL_K3S_EXEC: >
          server 
          --server https://{{ initial }}:6443 
          --node-ip={{ ansible_host }} 
          --flannel-backend=vxlan 
          --flannel-iface={{ flannel_iface }} 
          --disable metrics-server

    - name: Install agent.
      when: not master
      ansible.builtin.command:
        cmd: /usr/local/bin/k3s-install.sh
      environment:
        K3S_NODE_NAME: "{{ ansible_hostname }}"
        K3S_TOKEN: "{{ k3s_token }}"
        K3S_URL: "https://{{ initial }}:6443"
        INSTALL_K3S_VERSION: "{{ k3s_version }}"
        INSTALL_K3S_EXEC: >
          --node-ip={{ ansible_host }} 
          --flannel-iface={{ flannel_iface }}

Expected behavior:
The IP should be reachable the moment the pod starts executing, and any such error should be reflected in the logs.

Actual behavior:
The IP is not reachable during the first second of the pod's execution.

Additional context / logs:
Nothing to add; I will collect and add more on request.

@brandond
Member

brandond commented Dec 20, 2024

Expected behavior:
The IP should be reachable the moment the pod starts executing, and any such error should be reflected in the logs.

This is not expected behavior. This is covered in the upstream docs:
https://kubernetes.io/docs/concepts/services-networking/network-policies/

Pods must be resilient against being started up with different network connectivity than expected. If you need to make sure the pod can reach certain destinations before being started, you can use an init container to wait for those destinations to be reachable before kubelet starts the app containers.

tl;dr don't assume that pod networking has settled just because the pod is running.
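
For illustration, a minimal init-container sketch along those lines (untested; the busybox image, target address, and one-second poll interval are assumptions reused from the demo job above), added to the pod spec:

      initContainers:
      - name: wait-for-ip
        image: "docker.io/busybox:1.36"
        env:
          - name: IP_ADDRESS
            value: "10.15.21.167"
        command:
          - /bin/sh
          - "-c"
          - |
            # Poll until the destination answers; kubelet starts the app
            # containers only after this init container exits successfully.
            until ping -c 1 -W 1 ${IP_ADDRESS}; do
              echo "$( date -u '+%FT%T' ) waiting for ${IP_ADDRESS} ..."
              sleep 1
            done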

github-project-automation bot moved this from New to Done Issue in K3s Development on Dec 20, 2024
@morooshka
Author

@brandond Thank you for your comment. Yes, I have already used an init container.
