-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: bump training-operator v1.7 -> v1.8 #162
chore: bump training-operator v1.7 -> v1.8 #162
Conversation
This commit refactors the way the training-operator is deployed, as instead of using a sidecar container that runs the workload, we are now applying the Deployment and all the Kubernetes resources required by the training-operator controller to be able to mange training resources. We are introducing this change in preparation for the upcoming 1.8 version, as it introduces the hard dependency on a Kubernetes Secret mounted in a volume for the training-operator workload to start. For more details please refer to #159.
This commit introduces the following changes: * The charm now renders and applies a ValidatingWebhookConfiguration resource for training-operator CRDs. * The charm will now patch the Service to also serve on port 9443 for the webhook service. * The oci-image is updated to v1.8 of the training-operator * The training-operator Deployment now has a volume mount for mounting the secret that is used by the cert-controller to generate and rotate certificates for the ValidatingWebhookConfiguration Fixes #159
* pin integration test dependencies, refactor constants in tests (#155) Pins dependencies in the integration tests to their corresponding channels for this release. Ref: canonical/bundle-kubeflow#866 Co-authored-by: Andrew Scribner <[email protected]>
* refactor: deploy the training-operator with kubernetes resources This commit refactors the way the training-operator is deployed, as instead of using a sidecar container that runs the workload, we are now applying the Deployment and all the Kubernetes resources required by the training-operator controller to be able to mange training resources. We are introducing this change in preparation for the upcoming 1.8 version, as it introduces the hard dependency on a Kubernetes Secret mounted in a volume for the training-operator workload to start. For more details please refer to #159.
…rainig-operator-1.8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @DnPlas, leaving some comments plus IIUC the following changes are missing:
- image +
secretGenerator
field https://github.com/kubeflow/manifests/pull/2699/files#diff-23b73c27a3cd384ed58c680eac804ff86be38f48688cc94742c51c530e4164e8 - changes in aggregation roles https://github.com/kubeflow/manifests/pull/2699/files#diff-8b80f589b6814910a625e07ed23ff511ef76e81cf009dad169eba0f26ed04497
- changes in training-operator role https://github.com/kubeflow/manifests/pull/2699/files#diff-a88903de8c7c015c3e145b77718f9cee059aedfc24e3b0f37d246f979a4bb15b
I have updated the ClusterRole and aggregated rules, as well as the image tag, but the |
5bb0371
to
32ecee6
Compare
kubeflow/training-operator#2143 could be affecting (intermittently) the CI. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some small comments
Please review #171 before this, it should help working around the CI issue. |
This PR is now affected by #174 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thx @DnPlas, almost there
We need to update the yamls in /examples
, I see some images for the examples were updated in the training-opeartor 1.8 branch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This commit introduces the following changes: * The charm now renders and applies a ValidatingWebhookConfiguration resource for training-operator CRDs. * The charm will render the Service to also serve on port 9443 for the webhook service. * The oci-image is updated to v1.8 of the training-operator * The training-operator Deployment now has a volume mount for mounting the secret that is used by the cert-controller to generate and rotate certificates for the ValidatingWebhookConfiguration * The training-operator Deployment will now take an argument so the webhook service can use the training-operator workload's Service instead of the default * Updates the examples directory with examples from kubeflow/training-operator v1.8-branch Fixes #159
…` for workload, also bumps training-operator 1.7->1.8 (#167) * pin integration test dependencies, refactor constants in tests (#164) * refactor: deploy the training-operator with kubernetes resources (#161) * chore: bump training-operator v1.7 -> v1.8 (#162) * refactor: apply a workload Service instead of using juju created one (#173) * tests: skip test_upgrade due to #170 (#171) * build, tests: bump charmed-kubeflow-chisme 0.4.0 -> 0.4.1 (#172) Fixes #159
This commit introduces the following changes:
training-operator CRDs.
the secret that is used by the cert-controller to generate and rotate
certificates for the ValidatingWebhookConfiguration
Fixes #159