Fix creation of AWs/Jobs with same name in different namespaces #652

ChristianZaccaria · 2023-09-29T12:56:45Z

Issue link

Resolves #433
And resolves #383
And closes #671

In the genericresource.go file there is a SyncQueueJob function which is used to create resources inside etcd. In this function, the identification of AppWrappers/Jobs was primarily based on a label (resourceName) which had a value of the AW name without taking into account the namespace. As a result, when 2 AWs/Jobs with the same name were created in different namespaces, the 2nd AW/Job was identified as a duplicate.

What changes have been made

Updated labels from appwrapper.mcad.ibm.com to workload.codeflare.dev/appwrapper, and from resourceName to workload.codeflare.dev/resourceName.
Created helper function GetRandomString() to truncate the name of objects in etcd at 60 characters instead of 63, to allow to append 3 random alphanumeric (lowercased) characters to avoid unexpected clashes when using similar-but-ending-differently long names.
Label selectors now take into account the AppWrapper namespace and not just the name.
Updated quotaManagement.rbac.apiGroup from ibm.com to quota.codeflare.dev

Verification steps

Have MCAD running.
Create two AWs/Jobs with the same name but in different namespaces.
See their pods created respectively, and running.

Checks

I've made sure the tests are passing.
Testing Strategy
- e2e test
- Manual tests

openshift-ci · 2023-09-29T12:56:51Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign asm582 for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

kpouget · 2023-09-29T13:18:23Z

pkg/controller/queuejobresources/genericresource/genericresource.go

+		if len(newName + namespace) > 63 {
+			newName = newName[:len(newName) - (len(newName) + len(namespace) - 63)]


at quick glance, I'm not 100% sure about the correctness of these lines.
but actually, even the original code:

if len(newName) > 63 { newName = newName[:63]

makes me doubtful. Do we really want to truncate the name at 64 chars?
wouldn't it lead to unexpected clashes when using similar-but-ending-differently long names?

Raising an error could be saner (and thus making the AppWrapper impossible to process), it's a common K8s thing to abort on long names.

I have tried in the past to concatenate the namespace and name of a job and observed that this would indeed run long for many of our jobs. The use of the hyphen is also problematic because it is ambiguous. We should have separate labels for the name and namespace.

Thank you @kpouget, you raised a very good point. For what I could read, it seems that it is done to meet with DNS standards RFC 1035/1123. Since Kubernetes namespaces and names are often used to form a DNS subdomain, K8s enforces it here.

After looking at this code in particular again, I can confirm that we can keep the original code here as the namespace is used in the creation of the object anyways.

Now, to deal with very long names, we could raise an error and stop the operation as suggested. But I was thinking what if we handle the error this way? Here are some ideas:

Hashing the name for uniqueness?

Or, Truncate to 60 characters and append i.e., 3 random numbers?

What do you think?

Thanks @tardieu, having separate labels for the name and namespace does look like the best option instead of using a hyphen.

Now, to deal with very long names, we could raise an error and stop the operation as suggested. But I was thinking what if we handle the error this way?
do you know how this DNS name is used? is it something that the AppWrapper owner will ever want to use?
if it's purely internal, then using a hash, or truncating the name + random chars seems to be a good idea

astefanutti · 2023-09-29T13:18:21Z

pkg/controller/queuejobresources/genericresource/genericresource.go

@@ -163,7 +163,7 @@ func (gr *GenericResources) Cleanup(aw *arbv1.AppWrapper, awr *arbv1.AppWrapperG
 	}

 	// Get the resource to see if it exists
-	labelSelector := fmt.Sprintf("%s=%s, %s=%s", appwrapperJobName, aw.Name, resourceName, unstruct.GetName())
+	labelSelector := fmt.Sprintf("%s=%s, %s=%s", appwrapperJobLabelName, aw.Name, resourceName, unstruct.GetNamespace() + "-" + unstruct.GetName())


Suggested change

labelSelector := fmt.Sprintf("%s=%s, %s=%s", appwrapperJobLabelName, aw.Name, resourceName, unstruct.GetNamespace() + "-" + unstruct.GetName())

labelSelector := fmt.Sprintf("%s=%s, %s=%s-%s", appwrapperJobLabelName, aw.Name, resourceName, unstruct.GetNamespace(), unstruct.GetName())

I don't why I didn't think of that, thanks @astefanutti :)

astefanutti · 2023-09-29T13:40:42Z

pkg/controller/queuejobresources/genericresource/genericresource.go

@@ -163,7 +163,7 @@ func (gr *GenericResources) Cleanup(aw *arbv1.AppWrapper, awr *arbv1.AppWrapperG
 	}

 	// Get the resource to see if it exists
-	labelSelector := fmt.Sprintf("%s=%s, %s=%s", appwrapperJobName, aw.Name, resourceName, unstruct.GetName())
+	labelSelector := fmt.Sprintf("%s=%s, %s=%s", appwrapperJobLabelName, aw.Name, resourceName, unstruct.GetNamespace() + "-" + unstruct.GetName())


Can that label be removed altogether? It'll avoid those cluster-wide list queries, and be fully deterministic. Because even with that fix, you could imagine case when labels clash.

For normal cleanup of eg an appwrapper containing a pytorchjob, we should only delete the pytorch job and let the training operator do the rest. But we are also exploring adding "forceful" cleanup (kill -9) as a last resort where we delete (forcefully) any resource that can be traced back to an appwrapper. For the latter at least, it makes sense to select on labels.

That would assume the underlying operators propagate the label, right? It seems owner references and cascading deletion would be a more robust and generally supported mechanism. Granted it would restrict the generic items to be within the same namespace as that of the AppWrapper (cross namespace ownership is not allowed), but even that may possibly be seen as a good restriction.

This indeed assumes that wrapped resources in the appwrapper are properly labelled and that labels are propagated by operators for resources indirectly created. Again, this is only a last resort, but an important last resort. This definitely does not preclude trying to do the right thing first. Ownership references are not an option in general due to the cross namespace restriction. Transitive ownership and cascading deletion is good practice but there is no guarantee operators follow these practices.

For more context, we do experience node failures scenarios where the kube scheduler hence typically operators fail to terminate pods. The pods remain in terminating state until forcefully deleted (irrespective of grace periods), hence the need to orchestrate forceful deletion in MCAD.

To be clear, even for forceful deletion, I am not advocating for relying solely on labels, but labels are one of the tools we want to use to scope the forceful deletion.

tardieu · 2023-09-29T15:22:22Z

Do we have any plans to change the label name from appwrapper.mcad.ibm.com to something consistent with the group name, maybe appwrapper.worload.codeflare.dev?

astefanutti · 2023-09-29T15:32:13Z

Do we have any plans to change the label name from appwrapper.mcad.ibm.com to something consistent with the group name, maybe appwrapper.worload.codeflare.dev?

Yes, I agree it'll have to be changed, e.g., to worload.codeflare.dev/appwrapper. @tardieu is this the only purpose of that label? If yes, @ChristianZaccaria maybe you could tackle that change in the scope of that PR, WDYT?

ChristianZaccaria · 2023-09-29T15:40:19Z

Do we have any plans to change the label name from appwrapper.mcad.ibm.com to something consistent with the group name, maybe appwrapper.worload.codeflare.dev?

Yes, I agree it'll have to be changed, e.g., to worload.codeflare.dev/appwrapper. @tardieu is this the only purpose of that label? If yes, @ChristianZaccaria maybe you could tackle that change in the scope of that PR, WDYT?

@astefanutti Yes, sounds good to me. It's a good idea to include the change here.

tardieu · 2023-09-29T16:15:52Z

@tardieu is this the only purpose of that label?

The label is used to select resources directly created by MCAD (in genericresource.go) during creation, deletion, and to decide if the job is complete. As discussed, this is probably redundant with just querying for the generic items one by one.

The same label is used to select pods directly or indirectly caused by an appwrapper (in queuejob_controller_ex.go) to look at pod statuses and decide if the job is healthy or not.

The change needs to be consistent across. The introduction of a separate label for the namespace should also be done consistently across. Otherwise, we will end up confusing pods resulting from appwrapper mynamespace/myname with the pods of appwrapper myothernamespace/myname.

This does mean that all tests and all appwrapper yamls in circulation have to be adjusted. I would rather sync this change with the apigroup change than have users go back and forth between 3 CRD revisions (old group + old label, new group + old label, new group + new label).

tardieu · 2023-10-04T16:41:27Z

If I read this correctly, the latest push adds a label with the namespace of the wrapped resource itself rather than the appwrapper namespace. I am not sure I understand the logic.

tardieu · 2023-10-04T16:48:12Z

We should probably avoid unqualified label names like resourceName and replace them with names such as workload.codeflare.dev/resourceName.

ChristianZaccaria · 2023-10-05T09:50:50Z

If I read this correctly, the latest push adds a label with the namespace of the wrapped resource itself rather than the appwrapper namespace. I am not sure I understand the logic.

I think I understand your point. Would it make more sense to include labels for both wrapped resources and appwrappers? or just use the appwrapper name and namespace? I think that the issue with this is that i.e., if two jobs are created with same name and different namespaces, the 2nd job's pods are not created, so it's not originally the appwrapper where the issue lies. Let me know if I misunderstand, thank you Olivier

ChristianZaccaria · 2023-11-14T15:38:30Z

@astefanutti Hi Antonin, I'm wondering if you think updating the quotaManagement.rbac.apiGroup like this is correct? From ibm.com to quota.codeflare.dev. Thanks!

multi-cluster-app-dispatcher/test/kuttl-test-borrowing.yaml

Line 8 in bbb6871

    
           - script: helm upgrade  --install mcad-controller deployment/mcad-controller --namespace kube-system --wait --set loglevel=${LOG_LEVEL} --set resources.requests.cpu=1000m --set resources.requests.memory=1024Mi --set resources.limits.cpu=4000m --set resources.limits.memory=4096Mi --set image.repository=$IMAGE_REPOSITORY_MCAD --set image.tag=$IMAGE_TAG_MCAD --set image.pullPolicy=$MCAD_IMAGE_PULL_POLICY --set configMap.quotaEnabled='"true"' --set quotaManagement.rbac.apiGroup=ibm.com --set quotaManagement.rbac.resource=quotasubtrees  --set configMap.name=mcad-controller-configmap --set configMap.preemptionEnabled='"true"'

ChristianZaccaria · 2023-11-14T15:42:07Z

I attempted to not use completely the resourceName label but caused the e2e tests to fail. I believe it has to do with certain scenarios where the resourceName differs from the AppWrapper name, such as this example:

multi-cluster-app-dispatcher/test/e2e-kuttl-deployment-01/steps/03-assert.yaml

Lines 27 to 30 in 41833f2

    
           labels:                                                                                                                                                                                 
        
              appwrapper.mcad.ibm.com: hold-completion-job-03                                                                                                                                       
        
              job-name: hold-completion-job-03-01                                                                                                                                                    
        
              resourceName: hold-completion-job-03-01

kpouget · 2023-11-24T17:05:07Z

Hello @ChristianZaccaria , a lot of changes seems to be unrelated, or indirectly at least, to the original issue:

shouldn't these change be merged in another PR? 🤔

ChristianZaccaria · 2023-11-27T10:41:39Z

Hello @ChristianZaccaria , a lot of changes seems to be unrelated, or indirectly at least, to the original issue:

shouldn't these change be merged in another PR? 🤔

Hi @kpouget, it was suggested to include these changes in the scope of this PR: #652 (comment)

KPostOffice · 2024-01-04T22:35:50Z

pkg/controller/queuejobresources/genericresource/genericresource.go

-			newName = newName[:63]
+		if len(newName) > 60 {
+			newName = newName[:60]
+			newName += GetRandomString(3)
 		}

 		err = deleteObject(namespaced, namespace, newName, rsrc, dclient)


I think this will almost always fail for length >60 because the string you create is not deterministic. You should use a hash for getting the last 3 digits

KPostOffice · 2024-01-04T22:37:50Z

pkg/controller/queuejobresources/genericresource/genericresource.go

-			newName = newName[:63]
+		if len(newName) > 60 {
+			newName = newName[:60]
+			newName += GetRandomString(3)


openshift-ci bot requested review from dmatch01 and tardieu September 29, 2023 12:56

ChristianZaccaria requested review from kpouget, astefanutti, metalcycling and z103cb and removed request for dmatch01 and tardieu September 29, 2023 12:57

kpouget reviewed Sep 29, 2023

View reviewed changes

astefanutti reviewed Sep 29, 2023

View reviewed changes

ChristianZaccaria force-pushed the aw-different-namespaces branch 2 times, most recently from 8d5acc1 to 719dea1 Compare October 4, 2023 15:39

tardieu self-requested a review October 4, 2023 16:35

This was referenced Oct 16, 2023

Update RayCluster labels to workload.codeflare.dev/appwrappers project-codeflare/codeflare-sdk#384

Merged

Force creation of generic items into AppWrapper deployment namespace #665

Merged

openshift-merge-robot added the needs-rebase label Oct 22, 2023

ChristianZaccaria added the do-not-merge/work-in-progress label Oct 23, 2023

ChristianZaccaria force-pushed the aw-different-namespaces branch 2 times, most recently from 441402a to cedd7ad Compare October 23, 2023 13:07

openshift-merge-robot removed the needs-rebase label Oct 23, 2023

ChristianZaccaria mentioned this pull request Oct 23, 2023

Evaluate the usage of the resourceName label in generic resources #671

Open

ChristianZaccaria force-pushed the aw-different-namespaces branch from 8a92543 to f3f7858 Compare October 24, 2023 10:33

ChristianZaccaria force-pushed the aw-different-namespaces branch 7 times, most recently from 64d82ba to a24f2f9 Compare November 10, 2023 17:22

ChristianZaccaria force-pushed the aw-different-namespaces branch from a24f2f9 to c7fa546 Compare November 14, 2023 11:24

ChristianZaccaria removed the do-not-merge/work-in-progress label Nov 14, 2023

ChristianZaccaria force-pushed the aw-different-namespaces branch 2 times, most recently from 0693e5d to 85d8617 Compare November 14, 2023 12:24

ChristianZaccaria requested review from kpouget and astefanutti November 24, 2023 16:38

ChristianZaccaria force-pushed the aw-different-namespaces branch 2 times, most recently from 993f665 to 10b2c49 Compare November 28, 2023 12:19

Fix creation of aw/job with same name in different namespaces

008bdd9

ChristianZaccaria force-pushed the aw-different-namespaces branch from 10b2c49 to 008bdd9 Compare November 28, 2023 12:46

KPostOffice reviewed Jan 4, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix creation of AWs/Jobs with same name in different namespaces #652

Fix creation of AWs/Jobs with same name in different namespaces #652

ChristianZaccaria commented Sep 29, 2023 •

edited

Loading

openshift-ci bot commented Sep 29, 2023

kpouget Sep 29, 2023

tardieu Sep 29, 2023

ChristianZaccaria Sep 29, 2023

ChristianZaccaria Sep 29, 2023

kpouget Oct 4, 2023

astefanutti Sep 29, 2023

ChristianZaccaria Sep 29, 2023

astefanutti Sep 29, 2023

tardieu Sep 29, 2023

astefanutti Sep 29, 2023

tardieu Sep 29, 2023

tardieu Sep 29, 2023

tardieu Sep 29, 2023

tardieu commented Sep 29, 2023

astefanutti commented Sep 29, 2023

ChristianZaccaria commented Sep 29, 2023 •

edited

Loading

tardieu commented Sep 29, 2023 •

edited

Loading

tardieu commented Oct 4, 2023

tardieu commented Oct 4, 2023

ChristianZaccaria commented Oct 5, 2023

ChristianZaccaria commented Nov 14, 2023

ChristianZaccaria commented Nov 14, 2023

kpouget commented Nov 24, 2023 •

edited

Loading

ChristianZaccaria commented Nov 27, 2023 •

edited

Loading

KPostOffice Jan 4, 2024

KPostOffice Jan 4, 2024

		if len(newName + namespace) > 63 {
		newName = newName[:len(newName) - (len(newName) + len(namespace) - 63)]

	labelSelector := fmt.Sprintf("%s=%s, %s=%s", appwrapperJobLabelName, aw.Name, resourceName, unstruct.GetNamespace() + "-" + unstruct.GetName())
	labelSelector := fmt.Sprintf("%s=%s, %s=%s-%s", appwrapperJobLabelName, aw.Name, resourceName, unstruct.GetNamespace(), unstruct.GetName())

Fix creation of AWs/Jobs with same name in different namespaces #652

Are you sure you want to change the base?

Fix creation of AWs/Jobs with same name in different namespaces #652

Conversation

ChristianZaccaria commented Sep 29, 2023 • edited Loading

Issue link

What changes have been made

Verification steps

Checks

openshift-ci bot commented Sep 29, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tardieu commented Sep 29, 2023

astefanutti commented Sep 29, 2023

ChristianZaccaria commented Sep 29, 2023 • edited Loading

tardieu commented Sep 29, 2023 • edited Loading

tardieu commented Oct 4, 2023

tardieu commented Oct 4, 2023

ChristianZaccaria commented Oct 5, 2023

ChristianZaccaria commented Nov 14, 2023

ChristianZaccaria commented Nov 14, 2023

kpouget commented Nov 24, 2023 • edited Loading

ChristianZaccaria commented Nov 27, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ChristianZaccaria commented Sep 29, 2023 •

edited

Loading

ChristianZaccaria commented Sep 29, 2023 •

edited

Loading

tardieu commented Sep 29, 2023 •

edited

Loading

kpouget commented Nov 24, 2023 •

edited

Loading

ChristianZaccaria commented Nov 27, 2023 •

edited

Loading