
rpc error: code = Unavailable desc = transport is closing => Transport Endpoint Not Connected - using Seaweedfs with Nomad #147

Open
Lukas8342 opened this issue Dec 8, 2023 · 9 comments

Comments

@Lukas8342

Hello,

I'm encountering an issue, and I'm unsure whether it stems from SeaweedFS, seaweedfs-csi-driver, or HashiCorp Nomad. I'm reaching out here as a starting point, hoping for guidance, as my troubleshooting options are running thin. In my current setup I have one master, one filer, and one volume server, all running on the same machine with these configurations:

bash

weed master -ip=85.215.193.71 -ip.bind=0.0.0.0 -mdir=/seatest/m -port=9333 -port.grpc=19333
weed volume -mserver=85.215.193.71:9333 -dir=/seatest/d -dataCenter=dc1 -ip=85.215.193.71 -max=30 -ip.bind=0.0.0.0 -port=8080 -port.grpc=18080
weed filer -ip=85.215.193.71 -master=85.215.193.71:9333 -dataCenter=dc1 -rack=rack1

When utilizing the CSI with the following Nomad job:

hcl

job "seaweedfs-plugin" {
  datacenters = ["dc1"]
  type        = "system"

  constraint {
    operator = "distinct_hosts"
    value    = true
  }

  group "nodes" {
    task "plugin" {
      driver = "docker"

      config {
        image      = "chrislusf/seaweedfs-csi-driver"
        privileged = true

        args = [
          "--endpoint=unix://csi/csi.sock",
          "--filer=10.7.230.11:8888",
          "--nodeid=${node.unique.name}",
          "--cacheCapacityMB=1000",
          "--cacheDir=/tmp",
        ]
      }

      csi_plugin {
        id        = "seaweedfs"
        type      = "monolith"
        mount_dir = "/csi"
      }
    }
  }
}

It initially appears to work, but when I run jobs with different images, I consistently encounter a "Transport endpoint is not connected" error.

The filer logs display the following when starting a job and mounting it to a volume:

bash

I1208 15:45:26.862162 filer_grpc_server_sub_meta.go:268 => client [email protected]:52516: rpc error: code = Unavailable desc = transport is closing
E1208 15:45:26.862195 filer_grpc_server_sub_meta.go:78 processed to 2023-12-08 15:45:26.861541202 +0000 UTC: rpc error: code = Unavailable desc = transport is closing
I1208 15:45:26.862584 filer_grpc_server_sub_meta.go:312 -  listener [email protected]:52516 clientId -399912238 clientEpoch 2
I1208 15:45:26.862933 filer_grpc_server_sub_meta.go:296 +  listener [email protected]:54900 clientId -1540680978 clientEpoch 2
I1208 15:45:26.862949 filer_grpc_server_sub_meta.go:36  [email protected]:54900 starts to subscribe /buckets/dat from 2023-12-08 15:45:26.862037157 +0000 UTC

Nomad volume mounting is done as follows:

hcl

job "sonatype-nexus" {
  datacenters = ["dc1"]

  group "nexus" {
    count = 1
    network {
      port "http" {
        static = 8081
      }
    }

    volume "vol" {
      type           = "csi"
      read_only      = false
      source         = "nexus-volume"
      access_mode    = "single-node-writer"
      attachment_mode = "file-system"
    }

    task "server" {
      driver = "docker"
      volume_mount {
        volume      = "vol"
        destination = "/nexus-data"
        read_only   = false
      }

      config {
        image = "sonatype/nexus3:latest"
        ports = ["http"]
      }

      resources {
        cpu    = 2000
        memory = 4000
      }
    }
  }
}

I appreciate any insights or guidance you can provide to help resolve this issue.

Thank you.

@worotyns

worotyns commented Jan 13, 2024

Same here on Nomad. The csi-plugin also logs:

panic: unable to load in-cluster configuration, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must be defined

The image chrislusf/seaweedfs-csi-driver:v1.1.8 works fine :) It looks like this change caused the problem:
785e69a

@qskousen-membersy

Can confirm that the latest image is broken for me on Nomad, with the error messages referencing Kubernetes, and that using v1.1.8 as @worotyns suggested works.

@chrislusf
Contributor

@duanhongyi please take a look here.

@duanhongyi
Contributor

duanhongyi commented Feb 27, 2024

@chrislusf
It seems to be incompatible with Nomad: the KUBERNETES_SERVICE_HOST environment variable does not exist there.

Let me take a look in the next few days.
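
For context, the panic quoted above comes from client-go's in-cluster configuration loading, which requires exactly those environment variables. A minimal sketch of the kind of guard that would avoid the panic under Nomad (illustrative only, not the driver's actual code):

go

package main

import (
	"fmt"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// newInClusterClient returns a Kubernetes client only when the process really
// runs inside a cluster. Under Nomad, rest.InClusterConfig() fails because
// KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT are unset, so callers
// can fall back to a default instead of panicking.
func newInClusterClient() (*kubernetes.Clientset, error) {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		return nil, fmt.Errorf("not running inside Kubernetes: %w", err)
	}
	return kubernetes.NewForConfig(cfg)
}

func main() {
	if _, err := newInClusterClient(); err != nil {
		fmt.Println("no in-cluster config, using defaults:", err)
	}
}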

@nahsi

nahsi commented Jun 4, 2024

Still broken in the latest version.

I think this commit completely broke this CSI driver:
785e69a#diff-d7f330f6d6efcabc25613925c10237045948e05bc020c7ecf16c3b331e371e62

@chrislusf
Contributor

Send a PR to revert this change?

@duanhongyi
Contributor

duanhongyi commented Jun 5, 2024

@chrislusf

I think the feature can be degraded gracefully, that is, Nomad CSI volumes simply would not support a capacity limit.

Is this feasible? It is the simplest modification, and I currently do not have a Nomad cluster to experiment with.

The pseudocode is as follows; the key part is the maxVolumeSize fallback:

import (
	"context"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// NewInCluster and maxVolumeSize come from the driver package.
func GetVolumeCapacity(volumeId string) (int64, error) {
	// Outside Kubernetes (e.g. on Nomad) there is no in-cluster config,
	// so fall back to the configured maximum instead of failing.
	client, err := NewInCluster()
	if err != nil {
		return maxVolumeSize, nil
	}

	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	// Inside Kubernetes, read the capacity from the PersistentVolume spec.
	volume, err := client.CoreV1().PersistentVolumes().Get(ctx, volumeId, metav1.GetOptions{})
	if err != nil {
		return 0, err
	}
	capacity, _ := volume.Spec.Capacity.Storage().AsInt64()
	return capacity, nil
}
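
With this fallback the driver would no longer panic outside Kubernetes: on Nomad, NewInCluster() fails, so the function reports maxVolumeSize instead of trying to read a PersistentVolume that does not exist.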

@duanhongyi
Contributor

I have looked at Nomad's API and it is not the standard Kubernetes API, so the simplest fix is to skip looking up the capacity of a Nomad volume and directly return the maximum value.

https://developer.hashicorp.com/nomad/api-docs/volumes

If this is feasible, I will submit a PR tomorrow.
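
If it is, the change could be as small as a guard like the following (a rough sketch only; maxVolumeSize here is an illustrative constant standing in for whatever maximum the driver is configured with):

go

package main

import (
	"fmt"
	"os"
)

// maxVolumeSize stands in for the driver's configured maximum (illustrative).
const maxVolumeSize int64 = 1 << 40 // 1 TiB

// getVolumeCapacity skips the PersistentVolume lookup whenever the process is
// not running inside Kubernetes (e.g. under Nomad) and just reports the
// configured maximum.
func getVolumeCapacity(volumeId string) (int64, error) {
	if os.Getenv("KUBERNETES_SERVICE_HOST") == "" {
		return maxVolumeSize, nil // Nomad or plain Docker: no PV to query
	}
	// Inside Kubernetes the real driver would read the capacity from the
	// PersistentVolume spec here.
	return maxVolumeSize, nil
}

func main() {
	capacity, _ := getVolumeCapacity("nexus-volume")
	fmt.Println("reported capacity:", capacity)
}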

@duanhongyi
Contributor

#168
