Skip to content
This repository has been archived by the owner on Jul 11, 2023. It is now read-only.

What would break if we moved to Deployments? #207

Closed
gcapizzi opened this issue Apr 19, 2021 · 19 comments
Closed

What would break if we moved to Deployments? #207

gcapizzi opened this issue Apr 19, 2021 · 19 comments
Assignees
Labels
Milestone

Comments

@gcapizzi
Copy link
Contributor

gcapizzi commented Apr 19, 2021

There is lot of interest in having Eirini use Deployments for its workloads instead of StatefulSets (see cloudfoundry/eirini#94 and cloudfoundry/cf-for-k8s#600). We suspect a bunch of things would break if we did it, but we don't really know which. Can we figure this out? One way could be to do the change as cheaply as possible, and then run CATs.

We expect anything that relies on instance indexes to break, like:

  • instance-specific operations like killing a single instance
  • the INSTANCE_INDEX environment variable
  • Eirinix extensions

but we don't know much about wether the rest of CF would keep working.

The goal of this is to understand if this could be an optional feature users can enable if they're willing to give up on a subset of CF features, hoping that subset is as small as possible.

@gcapizzi gcapizzi added the spike label Apr 19, 2021
@gcapizzi gcapizzi added this to the `Deployment`s milestone Apr 19, 2021
mnitchev added a commit to cloudfoundry/eirini that referenced this issue Apr 19, 2021
mnitchev added a commit that referenced this issue Apr 19, 2021
@danail-branekov
Copy link
Member

One of the things we did not think about was metrics.
Recently we PRed metric-proxy to emit disk usage metric per instance. The way do get the instance id is to parse the pod name

When switching to deployments, the instance ID in the metric ends up as a string with letters. This is not a problem for metrics themselves, however, CC assumes that this is a number. Failing to use the value as a number raises an exception (or whatever it is called in Ruby) which is rescued in a very generic manner Thus the CF cli user gets the misleading error that the stats server is not available.

mnitchev added a commit to cloudfoundry/eirini that referenced this issue Apr 21, 2021
This is a very naive implementation of this maaping: we get all the
pods, sort their names alphabetically and map the sorted names to
ordered indexes. This approach gives us stable mapping but does not
guarantee that during scaling up instance X would be older than instance
X-1

[cloudfoundry/eirini-release#207]

Co-authored-by: Mario Nitchev <[email protected]>
danail-branekov added a commit that referenced this issue Apr 21, 2021
danail-branekov added a commit to eirini-forks/cloud_controller_ng that referenced this issue Apr 21, 2021
When using deployments in Eirini for workloads, the metrics proxy would
send the pod name suffix (which is not a number) as instance id. In
order to convert it to a number, we naively unique and sort those instance
ids and map them to their order.

[cloudfoundry/eirini-release#207]

Co-authored-by: Mario Nitchev <[email protected]>
@danail-branekov
Copy link
Member

danail-branekov commented Apr 21, 2021

With the spike branches[1] we managed to:

  • Run all Eirini EATS tests without any modifications
  • Run apps CATS

The only issue we faced was to provide a mapping between pod names and instance
index.

In cloud controller instance indexes are expected to be numbers or strings with
numbers only. In order to ensure this we implemented a naive change where

  • in CC we sort all non-numeric instance_ids and map them to order index
  • in Eirini when restarting a specific instance id (i.e. POST /apps/:process_guid/:version_guid/stop/:instance) we resolve the instance
    index to pod name via sorting the pods alphabetically and pick the one that
    corresponds to that index

The approach above reliably works when restarting an instance, i.e. if you
cfrestart-app-instance <app> 2, you kill the instance that is labeled as
instance 2 in cf app <app>. However, when the new instance is created, the
instances order is very likely to change as pods get random guids. Therefore
restarting instances in parallel, or restarting instances while scaling might
have unexpected results. Obviously this approach is not good enough, it just
proves that if we had a way to map pod guids to instance id, then migrating to
deployments seems feasible (if you are happy to break eirinix and whoever is
relying on statefulsets for workloads).

A stable and reliable way to map guids to numbers would be treating guids as
numbers in a base36 number system (26 characters + 10 digits = base36), we
could easily covert those to decimal numbers. That would be a bit strange from
cf cli UI point of view, as when you scale to e.g. two instances, users would
see e.g. instance 100 and 42323. Once they kill instance 100, they would
get another instance with id e.g. 134. Note that the maximum value of a
5-character number (pod guids consist of 5 characters) in a base36 number
system is pow(36, 5) = 60466176

[1] Spike branches:
https://github.com/cloudfoundry-incubator/eirini-release/tree/spike-deployments
https://github.com/cloudfoundry-incubator/eirini/tree/spike-deployments
https://github.com/eirini-forks/cloud_controller_ng/tree/spike-deployments

@gcapizzi
Copy link
Contributor Author

gcapizzi commented Apr 21, 2021

@danail-branekov @mnitchev do those indexes have to be unique? Or can we get away with 0s? I guess that would prevent us from doing instance-specific operations though, while coming up with a unique ID, as big as it is, would maintain those.

@danail-branekov
Copy link
Member

If we use zeroes, then certain things in the CF cli UI would not work:

  • All the metrics will be visualised in instance 0, rest of the instances would have no metrics
  • You would not be able to kill a specific instance (i.e. you will not be able to kill an instance that is misbehaving)
  • We have to give that a try, but cf app might be always showing a single instance regardless the app scaling as CC is using maps to aggregate data per instance (if indexes are all zeros, you would only get a single zero entry in those maps)

@gcapizzi
Copy link
Contributor Author

I see, it looks like an objectively inferior solution. Good catch on the mapping! As weird as it might look it should do the job pretty well!

@danail-branekov
Copy link
Member

As weird as it might look it should do the job pretty well!

We could even inject that number in the CF_INSTANCE_INDEX env variable, that would however break use cases where apps want to run something on a specific instance, e.g. initialise stuff once only on instance with index 0.

@gcapizzi
Copy link
Contributor Author

gcapizzi commented Apr 21, 2021

We should definitely try this. Note: we probably want the range to start from 1 so that no instance ever gets index 0. If someone is running an app that behaves differently on ENV['CF_INSTANCE_INDEX'] == 0 we don't want that to randomly happen with a 0,000001653817169% chance 😅 it's better if it just never happens!

mnitchev added a commit to eirini-forks/cloud_controller_ng that referenced this issue Apr 23, 2021
mnitchev added a commit to cloudfoundry/eirini that referenced this issue Apr 23, 2021
The uid is alphanumeric, which means it can be interpreted as a base-36
number and converted to an integer

[cloudfoundry/eirini-release#207]

Co-authored-by: Mario Nitchev <[email protected]>
mnitchev added a commit that referenced this issue Apr 23, 2021
[#207]

Co-authored-by: Mario Nitchev <[email protected]>
@mnitchev mnitchev removed their assignment Apr 23, 2021
@danail-branekov danail-branekov self-assigned this Apr 23, 2021
danail-branekov added a commit that referenced this issue Apr 23, 2021
to adopt always getting popd guid as last 5 characters

[#207]

Co-authored-by: Georgi Sabev <[email protected]>
@danail-branekov
Copy link
Member

Today we wanted to give the idea of treating pod 5-character guids as base36 numbers and convert them to decimals whereever needed.
This approach generally works but we had to implement several workarounds:

  • There are few places in CC where it assumes that instance IDs are sequential numbers and their max value is not greater than the instances count. We had to remove this assumption as there is no ordering in pod guids
  • It turned out that when the pod name becomes longer than a certain length (internet speculates that a pod name max length is 63 characters), in order to guarantee pod names uniquness within a deployment, K8S would take the first <max_len> - 5 characters of the desired name and append the GUID. This means that getting the pod GUID is all about getting the last 5 characters, instead of getting everything after the last - as the last - might get dropped. The proper place to implement this logic would be in Eirini (where it resolves instance ID to pod) and metrics-proxy where the instance ID is sent as part of the disk usage metric.

All the above has been pushed to related spike branches. Here is how cf app <app_name> looks with the base36 IDs:

❯ cf app catnip
Showing health and status for app catnip in org o / space s as admin...

name:                catnip
requested state:     started
isolation segment:   placeholder
routes:              catnip.apps.vcap.me
last uploaded:       Fri 23 Apr 17:39:41 EEST 2021
stack:
buildpacks:
isolation segment:   placeholder

type:           web
sidecars:
instances:      4/4
memory usage:   1024M
            state     since                  cpu    memory      disk       details
#13244386   running   2021-04-23T15:13:39Z   0.0%   3.6M of 0   28K of 0
#26770803   running   2021-04-23T15:13:39Z   0.0%   3.6M of 0   28K of 0
#44864610   running   2021-04-23T14:41:21Z   0.0%   3.9M of 0   28K of 0
#59576530   running   2021-04-23T15:13:39Z   0.0%   3.5M of 0   28K of 0

type:           catnip
sidecars:
instances:      0/0
memory usage:   1024M
There are no running instances of this process.

type:           worker
sidecars:
instances:      0/0
memory usage:   1024M
There are no running instances of this process.

There are few CATs that are failing because they expect that instance IDs are 0, 1, 2, 3....:

[Fail] [apps] Getting instance information Scaling instances [It] can be queried for state by instance
/home/danailster/workspace/cf-acceptance-tests/apps/instance_reporting.go:79

[Fail] [apps] Application Lifecycle pushing multiple instances [It] is able to start all instances
/home/danailster/workspace/cf-acceptance-tests/apps/lifecycle.go:220

[Fail] [apps] Application Lifecycle pushing multiple instances [It] is able to restart an instance
/home/danailster/workspace/cf-acceptance-tests/apps/lifecycle.go:262

@danail-branekov
Copy link
Member

danail-branekov commented Apr 23, 2021

In conclusion, we believe that there are two ways to to address instance ID being a non-numeric character

  1. Treat pod guids as base36 numbers and convert them to decimals

    • Pros:
      • minimal change in Eirini/CC/metric-proxy only
      • seems to work
    • Cons
      • It is sort of a hack
      • instance ids should be treated as GUIDs rather than ordered numbers. This might be confusing for users, especially those who are used to the current CF cli experience, moreover the instance IDs will not contain characters to hint that this is a GUID
  2. Change the instance id data type from int32 to string and treat instance IDs as guids

    • Pros
      • Looks like to be the more proper change as we would not have to covert base36 numbers
        just for the sake of type safety
      • CF cli would display instance IDs as the real guids which could help operators map instances to K8S pods
    • Cons
      • Greater change, might possibly invlove changes in Diego as well because of changing the data type in the CC protobuf types
      • InstanceIDs are part of the loggregator metrics, theregore if loggregator clients (such as Grafana) are treating them as numbers, moving to GUIDs would require them to adopt the change as well

@gcapizzi
Copy link
Contributor Author

gcapizzi commented Apr 26, 2021

I think option 1 would be a great first step. The necessary work would be:

  1. Add a Deployments mode to Eirini, using the base36 mapping for instance indexes.
  2. Add a Deployments mode to metric-proxy, using the base36 mapping for instance indexes (we think the only needed change would be in this function).
  3. Tweak Cloud Controller to handle sparse indexes.

We'll check with the Cloud Controller and metrics-proxy teams and then create issues for this.

@gcapizzi
Copy link
Contributor Author

gcapizzi commented Apr 30, 2021

We might want to reopen this to:

  • fully assess the consequences of using alphanumeric IDs (we know it would break more stuff, but what exactly?)
  • double-check places were index are assumed to be in the [0, n) range (e.g. see this comment)

@gcapizzi
Copy link
Contributor Author

gcapizzi commented May 3, 2021

Other interesting idea: return (sparse) integer indexes for backwards compatibility but adding a new field for alphanumeric IDs so that components can gradually migrate to it.

@gcapizzi gcapizzi reopened this May 5, 2021
@gcapizzi
Copy link
Contributor Author

gcapizzi commented May 5, 2021

Another place to check is log-cache, according to this comment.

@georgethebeatle
Copy link
Member

In order to asses the consequences of supporting alphanumeric instance ids we decided to take an experimental approach where we make the desired change in the cheapest possible way and run CATS to see what breaks. We have decided to limit the scope of the experiment to the cf-for-k8s world, but still making sure that the cf for VMs still works in the old way (numeric instance ids)
Here is what we found:

  • It is most convenient to hide the whole change behind a feature flag. This is necessary because the CC need to know how to interpret the instance ids in each case (int or string). For diego it will use the old int ids, while for eirini with deployments turned on it is going to use the new alphanumeric ids.
  • Obviously eirini would need to change what objects it creates (deployments vs statefulsets) and the way it handles the instance ids. This behaviour will be determined by the same feature flag as the CC.
  • There is a minimal change in the metric proxy to parse out the instance id from the Deployment pod name
  • The CLI should be changed to handle both ints and strings as the instance id.
  • For more details take a look at the spike-deployments branches [1] in each affected reporisotry

In order to deploy the prototype you need to chek out the complete list of spike branches and run the following command from the root of the eirini repository.

./scripts/cf4k8s-on-kind.sh -emc

Another place to check is log-cache, according to this comment.

As part of the experiment we did not have to change log cache and did not observe any cats failures. Maybe we should ask the comment author for more context.

double-check places were index are assumed to be in the [0, n) range

A part of the experiment we just removed them and did not have significant issues. Sometimes an instance would take longer to disappear from the cf app output but eventually it would be fine. We could ask the CC people for more context about why these checks were introduced.

In order to validate that diego still works with the changes in CC we have deployed a cf fr VMs and are running CATS against it.
The deployment is available under directors in the garden-ci repo. It is a light-me-up environment using a local capi-release that contains the changes in CC. We should destroy that environment when we are done testing.

[1]

@danail-branekov
Copy link
Member

Good news, I managed to successfully run CATS on a diego deployment that contains cloud controller changes discussed above. The setup (currently deployed as spike-deployments in the gaden-ci lite-me-up deployment) I used was:

{
    "api": "api.<redacted>",
    "admin_user": "admin",
    "admin_password": "<redacted>",
    "apps_domain": "<redacted>",
    "cf_push_timeout": 600,
    "default_timeout": 360,
    "skip_ssl_validation": true,
    "timeout_scale": 1,
    "include_apps": true,
    "include_backend_compatibility": false,
    "include_deployments": false,
    "include_detect": true,
    "include_docker": false,
    "include_internet_dependent": true,
    "include_private_docker_registry": false,
    "include_route_services": false,
    "include_routing": true,
    "include_service_discovery": false,
    "include_service_instance_sharing": false,
    "include_services": false,
    "include_tasks": false,
    "include_v3": false,
    "stacks": [
        "cflinuxfs3"
    ]
}

Thus we proved that:

  • Introduce an additional property for alphanumeric instance ids that gets set when Eirini uses deployments
  • We should be able to configure deployments vs statefulsets via feature flags in Eirini and CC
  • The CLI should be aware how where how to the instance id (whether to use the numeric or the alphanumeric value) from the application stats. This has the drawback that older versions of the CLI would not be able to work properly with CF deployments having eirini deployments feature turned on, however they would still operate correctly with diego/eirini statefulsets CF deployments. On other words, before eventually switching to Eirini deployments, operators will have to make sure that all their fellow developers have bumped their CLI.

Having proved that backwards compatibility is ensured, we believe that

  • we have found out all the places we need to change to support deployments
  • we have a rough idea how to do that
  • we are confident that components that do not rely on internal Eirini details (such as statefulsets) and numeric stable instance IDs should be fine

Things that would be broken

  • EiriniX extensions (they create webhooks that alter the LRP statefulsets, obviously replacing statefulsets with deployments will break them)
  • Applications that are assuming that the CF_INSTANCE_ID env variable is numeric
  • Applications that perform certain actions when CF_INSTANCE_ID is a particular value, for example initialise stuff on instance 0 only

mnitchev added a commit to cloudfoundry/eirini that referenced this issue May 14, 2021
@gcapizzi
Copy link
Contributor Author

@danail-branekov can we prepare a list of CF components that would need to change and which changes would be needed?

@danail-branekov
Copy link
Member

@gcapizzi I think we can derive the answer from the spike branches listed above

We need to change Eirini, Cloud Controller, Metric Proxy and the CF CLI. The spike branches outline the changes needed, of course they should be implemented in a proper way when doing that for realz.
The only change not visible in the spike branches is introducing a feature flag in both Eirini and CC but that should be trivial.

Did I answer your question?

@gcapizzi
Copy link
Contributor Author

gcapizzi commented May 17, 2021

@danail-branekov I see we only use a separate field (instance_uid) for metrics in CC, but maybe the best way to go is to have separate fields wherever possible.

What about something like this:

  • Adding a use_deployments feature flag to Eirini which
    • makes it use Deployments instead of StatefulSets
    • makes it set a separate instance UID field instead of the instance index
  • Adding a use_instance_uids feature flag to the Cloud Controller which makes it use, and expect, instance UIDs to/from Eirini and to/from the CLI
    • this would include skipping a bunch of index range checks that we're still not sure how to replace
  • Adding a use_deployments feature flag to the metrics-proxy which makes it aware that we're collecting metrics from Deployments, not StatefulSets
    • this suffers the fact that Eirini's and metric-proxy's ways of extracting the instance index/UID from the Pod name are strictly coupled
  • Changing the CLI to
    • display indexes if present, UIDs otherwise
    • either
      • have a flag that indicates the user wants to use UIDs, for example:
        cf restart-app-instance --with-instance-uid app-name abc123
        
      • fall back to UIDs if the provided index is not a valid integer (although we'd need to verify that UIDs from Kubernetes are never valid integers) Edit: I've checked and UIDs can definitely be valid integers ☹️ see comment below!

@gcapizzi
Copy link
Contributor Author

gcapizzi commented May 17, 2021

although we'd need to verify that UIDs from Kubernetes are never valid integers

Ok, I've done a little bit of investigation and found out that Kubernetes uses this code to calculate the random UIDs:

// String generates a random alphanumeric string, without vowels, which is n
// characters long.  This will panic if n is less than zero.
// How the random string is created:
// - we generate random int63's
// - from each int63, we are extracting multiple random letters by bit-shifting and masking
// - if some index is out of range of alphanums we neglect it (unlikely to happen multiple times in a row)
func String(n int) string {
	b := make([]byte, n)
	rng.Lock()
	defer rng.Unlock()

	randomInt63 := rng.rand.Int63()
	remaining := maxAlphanumsPerInt
	for i := 0; i < n; {
		if remaining == 0 {
			randomInt63, remaining = rng.rand.Int63(), maxAlphanumsPerInt
		}
		if idx := int(randomInt63 & alphanumsIdxMask); idx < len(alphanums) {
			b[i] = alphanums[idx]
			i++
		}
		randomInt63 >>= alphanumsIdxBits
		remaining--
	}
	return string(b)
}

The String function gets invoked using a constant equal to 5, so UIDs can never be longer than 5 characters.

I've built this little test program:

func main() {
    nStrings := 0
    nInts := 0

    for {
        str := String(5)
        nStrings += 1
        _, err := strconv.ParseUint(str, 10, 64)
        if err == nil {
            nInts += 1
        }

        if nStrings%1000000 == 0 {
            fmt.Printf("strings: %d, ints: %d, ratio: %f%%\n", nStrings, nInts, float64(nInts)/float64(nStrings)*100)
        }
    }
}

and the output looks a bit like this:

...
strings: 298000000, ints: 349270, ratio: 0.117205%
strings: 299000000, ints: 350385, ratio: 0.117186%
strings: 300000000, ints: 351515, ratio: 0.117172%
...

So yes, there's a 0.12% chance that a UID is a valid integer 😕

Edit: this could have been calculated on paper!

  • Number of possible strings of length 5 using the 27 available characters in alphanums = 27 ^ 5 = 14348907
  • Number of possible strings of length 5 using the 7 available digits in alphanums = 7 ^ 5 = 16807
  • 16807 / 14348907 * 100 = 0.11713087275567402

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

4 participants