CSI Controller will only create volumes in its own region again #822

Open
bascht opened this issue Dec 20, 2024 · 0 comments
Labels: bug (Something isn't working)

bascht commented Dec 20, 2024

TL;DR

hcloud-csi-controller will provision new PVs for a PVC only in its own region, even if this leads to an unschedulable Pod.

Expected behavior

Picking up on my own ancient issue #42, I think #780 introduced a regression: it is currently impossible to schedule any Pod with a PVC that is not in the same csi.hetzner.cloud/location as the CSI controller.

The documentation makes it sound like this would only be the "default" location. Instead, it is the only location in which volumes are created, and therefore the only location where Pods consuming them can be scheduled.

Observed behavior

Checking the UI shows the volume in fsn1, even though the volumeBindingMode is WaitForFirstConsumer, which to my understanding should make the volume wait for the Pod (which is scheduled on a topology.kubernetes.io/region=nbg1 node):

[Screenshot-2024-12-20-084125: Hetzner Cloud console showing the volume in fsn1]
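
For reference, the PVCs use the stock hcloud-volumes StorageClass shipped with the driver; a minimal sketch of what it looks like (reproduced from the default deployment as I remember it, not copied verbatim from the cluster):

```yaml
# Sketch of the stock hcloud-volumes StorageClass (assumed defaults)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: hcloud-volumes
provisioner: csi.hetzner.cloud
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
reclaimPolicy: Delete
```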

Minimal working example

Two Deployments, one in fsn1, one in nbg1.

```yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ubuntu-fsn1
  namespace: loadtest
spec:
  selector:
    matchLabels:
      app: csi-test
  replicas: 1
  template:
    metadata:
      labels:
        app: csi-test
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/region
                operator: In
                values:
                - fsn1
      containers:
      - image: ubuntu:xenial
        name: ubuntu-fsn1
        stdin: true
        tty: true
        volumeMounts:
        - mountPath: /mnt
          name: ubuntu-fsn1
      volumes:
      - name: ubuntu-fsn1
        persistentVolumeClaim:
          claimName: ubuntu-fsn1
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ubuntu-fsn1
  labels:
    app: csi-test
spec:
  storageClassName: hcloud-volumes
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ubuntu-nbg1
  namespace: loadtest
spec:
  selector:
    matchLabels:
      app: csi-test
  replicas: 1
  template:
    metadata:
      labels:
        app: csi-test
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/region
                operator: In
                values:
                - nbg1
      containers:
      - image: ubuntu:xenial
        name: ubuntu-nbg1
        stdin: true
        tty: true
        volumeMounts:
        - mountPath: /mnt
          name: ubuntu-nbg1
      volumes:
      - name: ubuntu-nbg1
        persistentVolumeClaim:
          claimName: ubuntu-nbg1
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ubuntu-nbg1
  labels:
    app: csi-test
spec:
  storageClassName: hcloud-volumes
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

With our current CSI controller running in fsn1, this leads to an unschedulable Pod for ubuntu-nbg1.

Log output

```
NAME                           READY   STATUS    RESTARTS   AGE
ubuntu-fsn1-756875b699-kg9gf   1/1     Running   0          90s
ubuntu-nbg1-bddb957f7-hpw5g    0/1     Pending   0          90s
```

Events:

```
Warning  FailedScheduling  101s  default-scheduler  running PreBind plugin "VolumeBinding": binding volumes: pv "pvc-ea316f22-3c12-4dcb-ae03-1600ec14fc46" node affinity doesn't match node "<my-node-name>": no matching NodeSelectorTerms

Warning  FailedScheduling  98s   default-scheduler  0/16 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/master: }, 2 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 4 node(s) had volume node affinity conflict, 9 node(s) didn't match Pod's node affinity/selector. preemption: 0/16 nodes are available: 16 Preemption is not helpful for scheduling.
```
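
For context, the dynamically provisioned PV ends up pinned to fsn1 via node affinity, roughly like this (a sketch from memory; the exact topology key and structure may differ between driver versions):

```yaml
# Sketch of the node affinity on the provisioned PV (assumed, not copied from the cluster)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-ea316f22-3c12-4dcb-ae03-1600ec14fc46
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: csi.hetzner.cloud/location
          operator: In
          values:
          - fsn1
```

Since the nbg1 nodes carry a different value for that topology key, the scheduler reports "no matching NodeSelectorTerms" and the Pod stays Pending.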


### Additional information

Kubernetes v1.27.3
Hcloud CSI 2.11.0

Let me know if I can supply more details.
bascht added the bug label on Dec 20, 2024