CSI Controller will only create volumes in its own region again #822

Open
bascht opened this issue Dec 20, 2024 · 0 comments
Labels: bug (Something isn't working)

bascht commented Dec 20, 2024

TL;DR

hcloud-csi-controller will provision new PVs for a PVC only in its own region, even if this leads to an unschedulable Pod.

Expected behavior

Picking up on my own ancient issue #42, I think #780 introduced a regression: it is currently impossible to schedule any Pod with a PVC that is not in the same csi.hetzner.cloud/location as the CSI controller.

The documentation makes it sound like this would only be the "default" location. Instead, it is the only location in which volumes are created, and therefore the only location where Pods consuming them can be scheduled.

Observed behavior

Checking the UI shows the volume in fsn1, even though the volumeBindingMode is WaitForFirstConsumer, which to my understanding should make the volume wait for the Pod (which is scheduled on a topology.kubernetes.io/region=nbg1 node):

[Screenshot-2024-12-20-084125: Hetzner Cloud console showing the volume in fsn1]
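
For reference, the PVCs use the stock hcloud-volumes StorageClass shipped with the driver; a minimal sketch of what it looks like (reproduced from the default deployment as I remember it, not copied verbatim from the cluster):

```yaml
# Sketch of the stock hcloud-volumes StorageClass (assumed defaults)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: hcloud-volumes
provisioner: csi.hetzner.cloud
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Delete
```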

Minimal working example

Two Deployments, one in fsn1, one in nbg1.

```yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ubuntu-fsn1
  namespace: loadtest
spec:
  selector:
    matchLabels:
      app: csi-test
  replicas: 1
  template:
    metadata:
      labels:
        app: csi-test
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/region
                operator: In
                values:
                - fsn1
      containers:
      - image: ubuntu:xenial
        name: ubuntu-fsn1
        stdin: true
        tty: true
        volumeMounts:
        - mountPath: /mnt
          name: ubuntu-fsn1
      volumes:
      - name: ubuntu-fsn1
        persistentVolumeClaim:
          claimName: ubuntu-fsn1
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ubuntu-fsn1
  labels:
    app: csi-test
spec:
  storageClassName: hcloud-volumes
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ubuntu-nbg1
  namespace: loadtest
spec:
  selector:
    matchLabels:
      app: csi-test
  replicas: 1
  template:
    metadata:
      labels:
        app: csi-test
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/region
                operator: In
                values:
                - nbg1
      containers:
      - image: ubuntu:xenial
        name: ubuntu-nbg1
        stdin: true
        tty: true
        volumeMounts:
        - mountPath: /mnt
          name: ubuntu-nbg1
      volumes:
      - name: ubuntu-nbg1
        persistentVolumeClaim:
          claimName: ubuntu-nbg1
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ubuntu-nbg1
  labels:
    app: csi-test
spec:
  storageClassName: hcloud-volumes
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

With our current CSI controller running in fsn1, this leads to an unschedulable Pod for ubuntu-nbg1.

Log output

```
NAME                           READY   STATUS    RESTARTS   AGE
ubuntu-fsn1-756875b699-kg9gf   1/1     Running   0          90s
ubuntu-nbg1-bddb957f7-hpw5g    0/1     Pending   0          90s
```

Events:

```
Warning  FailedScheduling  101s  default-scheduler  running PreBind plugin "VolumeBinding": binding volumes: pv "pvc-ea316f22-3c12-4dcb-ae03-1600ec14fc46" node affinity doesn't match node "<my-node-name>": no matching NodeSelectorTerms

Warning  FailedScheduling  98s   default-scheduler  0/16 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/master: }, 2 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 4 node(s) had volume node affinity conflict, 9 node(s) didn't match Pod's node affinity/selector. preemption: 0/16 nodes are available: 16 Preemption is not helpful for scheduling.
```
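
For context, the dynamically provisioned PV ends up pinned to fsn1 via node affinity, roughly like this (a sketch from memory; the exact topology key and structure may differ between driver versions):

```yaml
# Sketch of the node affinity on the provisioned PV (assumed, not copied from the cluster)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-ea316f22-3c12-4dcb-ae03-1600ec14fc46
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: csi.hetzner.cloud/location
          operator: In
          values:
          - fsn1
```

Since the nbg1 nodes carry a different value for that topology key, the scheduler reports "no matching NodeSelectorTerms" and the Pod stays Pending.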


### Additional information

Kubernetes v1.27.3
Hcloud CSI 2.11.0

Let me know if I can supply more details.
bascht added the bug label on Dec 20, 2024