-
Notifications
You must be signed in to change notification settings - Fork 121
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor MCM to leverage controller-runtime #724
Comments
cc @unmarshall |
@himanshu-kun You have mentioned internal references in the public. Please check. |
@himanshu-kun You have mentioned internal references in the public. Please check. |
@himanshu-kun You have mentioned internal references in the public. Please check. |
@himanshu-kun You have mentioned internal references in the public. Please check. |
3 similar comments
@himanshu-kun You have mentioned internal references in the public. Please check. |
@himanshu-kun You have mentioned internal references in the public. Please check. |
@himanshu-kun You have mentioned internal references in the public. Please check. |
How to categorize this issue?
/area quality
/kind enhancement
/priority 2
What would you like to be added:
MCM should be re-written using controller-runtime framework. Currently we use client-go directly . As a part of this we would also remove the in-tree deprecated code #622
Why is this needed:
Some advantages are:
FAQ for controller runtime -> https://github.com/kubernetes-sigs/controller-runtime/blob/v0.12.1/FAQ.md
BrainStorm / Ideas
Small refactoring / changes
unschedulable=true
TODO
EventRecorders
.reconcileClusterNodeKey
is useless and not being called.controlCoreInformerFactory.Core().V1().Secrets()
is far too generic.ValidateMachineClass
. As a corollary, consider optimizingvalidateNodeTemplate
as this is repeated in the flow of reconciling machine objects.addMachineFinalizers
addsMCMFinalizer
instead ofMCFinalizer
.status.FromError(err)
in machine controller does not seem necessary sinceDriver
methods already returns errors as status objects. (No need to convert to string and regex parse and build status again)NotManagedByMCM
. This has an issue for multiple MC controller processes.Description
field mixes arbitrary text with constants such asmachineutils.GetVMStatus
, etc. Consider alternative approach such as use of custom conditions / introducing operation label as separate field, etctriggerCreationFlow
in the MC filepkg/util/provider/machinecontroller/machine.go
has logic to check for Azure specific stale node. Provider specific handling should not percolate to the common library code. Re-think on this while refactoring.Machine.Status.LastOperation
being used in several places as the next operation whenMachine.Status.LastOperation.State
isMachineStateProcessing
. AlsoMachineState
is bad name - it actually meansMachineOperationState
. Instead of computing and recording the next operation, reconciler should compute this based on current conditions. Consider introducing new conditions and leveraging the gardener flow library for conditional execution.Terminating
, if we get stuck in thecontroller.getVmStatus
under the cases:case codes.Unknown, codes.DeadlineExceeded, codes.Aborted, codes.Unavailable:
, then we keep repeatinggetVMStatus
with a short retry. We should prefer exponential backoff here.controller.getVMStatus
consider failing permanently forUnimplemented
or error in decoding statuses returned by Driver instead of just queuing again. Check all theDriver.GetMachineStatus
implementations for providers before doing this.Driver
interface for providers and enhanceGetMachineStatusResponse
,ListMachinesResponse
to get the status of each individual resource of a machine and not just the current barebonesproviderId
andnodeName
. This will permit us to differentiate between when there are issues with the VM and issues in VM resource creation.MachineClass
that will do the validation before persisting the object to etcd. Majority of the validations can be done inside the webhook. Gardener extensions has validating webhooks today but they currently do not validate the ProviderConfig. Current view is that we should not link our webhooks to extension webhooks as they will directly tie us to g/g. To support use cases where customers can choose to use MCM independent of g/g shoot validation (done in extension webhooks) should not be used. Instead MCM should create its own validating webhooks which are invoked in addition to the ones defined in gardener extensions.The text was updated successfully, but these errors were encountered: