Releases · admiraltyio/admiralty
v0.8.2
Note: we're skipping v0.8.1 because the 0.8.1 image tag was erroneously used for a pre-release version.
Bugfixes
- Fix #20. Scheduling was failing altogether if the namespace didn't exist in one of the target clusters. That cluster is now simply filtered out.
- Fix #21. The feedback controller wasn't compatible with namespaced targets. It was trying to watch remote pod chaperons at the cluster level, which wasn't allowed by remote RBAC.
- Fix #25. The Helm chart values structure was broken, making it difficult to set resource requests/limits.
- Fix #26. Service account token volume mounts weren't stripped from init containers when pods were delegated.
- Fix a race condition that allowed candidate pod chaperons and their pods to be orphaned if scheduling failed. The finalizer is now added to proxy pods at admission rather than asynchronously by the feedback controller.
Breaking Changes
- Some Helm chart values were broken (see above). As we fixed them, we reorganized all values, so some values that previously worked are now set differently.
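For illustration only, here is a rough sketch of setting resource requests/limits through chart values; the `agent` key and overall layout are hypothetical, so consult the chart's values.yaml for the actual structure in your version.

```yaml
# Hypothetical values.yaml sketch; key names are illustrative, not authoritative.
agent:
  resources:
    requests:
      cpu: 100m
      memory: 64Mi
    limits:
      cpu: 500m
      memory: 128Mi
```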
v0.8.0
This release removes the central scheduler and replaces it with a decentralized algorithm that creates candidate pods in all targets (only one of which becomes the proxy pod's delegate). See the proposal for details.
New Features
- Advanced scheduling: all standard Kubernetes scheduling constraints are now respected (not just node selectors, but also affinities, etc.), because the candidate scheduler uses the Kubernetes scheduling framework; see the sketch after this list.
- There's now one virtual node per target, making it possible to drain clusters, e.g., for blue/green cluster upgrades.
- Delegate pods are now controlled by intermediate pod chaperons. This was a technical requirement of the new scheduling algorithm, with the added benefit that if a delegate pod dies (e.g., is evicted) while its cluster is offline, a new pod will be created to replace it.
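To illustrate the advanced scheduling point above, a standard node affinity on a multi-cluster pod is now honored when candidates are scheduled in each target cluster. This is a minimal sketch: the `multicluster.admiralty.io/elect` opt-in annotation and the zone labels are assumptions for the example; check the docs for your version.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: affinity-example                    # hypothetical
  annotations:
    multicluster.admiralty.io/elect: ""     # assumed opt-in annotation; verify for your version
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: topology.kubernetes.io/zone   # example standard constraint
                operator: In
                values: ["us-central1-a", "us-central1-b"]
  containers:
    - name: app
      image: nginx
```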
Breaking Changes
- Invitations have been removed. The user is responsible for creating service accounts for sources in target clusters. Only the `multicluster-scheduler-source` cluster role is provided, which can be bound to service accounts with cluster-scoped or namespaced role bindings (see the sketch after this list).
- You can no longer enforce placement with the `multicluster.admiralty.io/clustername` annotation. Use a more idiomatic node selector instead.
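As a sketch of the new responsibility, a source's service account in a target cluster could be bound to the provided cluster role roughly as follows; the service account name and namespace are hypothetical, and a ClusterRoleBinding would be used instead for cluster-scoped access.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cluster1                  # hypothetical source identity in the target cluster
  namespace: default              # hypothetical
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding                 # namespaced; use a ClusterRoleBinding for cluster-scoped access
metadata:
  name: cluster1-source
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: multicluster-scheduler-source
subjects:
  - kind: ServiceAccount
    name: cluster1
    namespace: default
```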
Internals
- We've started internalizing controller runtime logic (for the new feedback and pod chaperon controllers), progressively decoupling multicluster-scheduler from controller-runtime and multicluster-controller, to be more agile.
v0.7.0
This release simplifies the design of multicluster-scheduler to enable new features: it removes observations and decisions (and all the finalizers that came with them), replaces federations (two-way sharing) with invitations (one-way sharing, namespaced), and adds support for node selectors.
Breaking Changes
- Because the scheduler now watches and creates resources in the member clusters directly, rather than via observations and decisions, those clusters' Kubernetes APIs must be routable from the scheduler. If your clusters are private, please file an issue. We could fix that with tunnels.
- The multi-federation feature was removed and replaced by invitations.
- "Cluster namespaces" (namespaces in the scheduler's cluster that held observations and decisions) are now irrelevant.
New Features
- Support for standard Kubernetes node selectors. If a multi-cluster pod has a node selector, multicluster-scheduler will ensure that the target cluster has nodes that match the selector (see the sketch after this list).
- Invitations. Each agent specifies which clusters are allowed to create/update/delete delegate pods and services in its cluster and optionally in which namespaces. The scheduler respects invitations in its decisions, and isn't even authorized to bypass them.
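A minimal sketch of the node selector feature: if a multi-cluster pod carries a node selector, only clusters with matching nodes are considered. The opt-in annotation and the `accelerator` label are assumptions for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload                        # hypothetical
  annotations:
    multicluster.admiralty.io/elect: ""     # assumed opt-in annotation; verify for your version
spec:
  nodeSelector:
    accelerator: nvidia-tesla-t4            # example label; only clusters with matching nodes are targeted
  containers:
    - name: app
      image: nginx
```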
Bugfixes
- Fix #17. The agent pod could not be restarted because it had a finalizer AND was responsible for removing its own finalizer. The agent doesn't have a finalizer anymore (because we got rid of observations).
- Fix #16. Proxy pods were stuck in `Terminating` phase. That's because they had a non-zero graceful deletion period. On a normal node, it would be the kubelet's responsibility to delete them with a period of 0 once all containers are stopped. On a virtual-kubelet node, it is the implementation's responsibility to do that. Our solution is to mutate the graceful deletion period of proxy pods to 0 at admission, because the cross-cluster garbage collection pattern with finalizers is sufficient to ensure that they are deleted only after their delegates are deleted (and the delegates' containers are stopped).
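Concretely, the effect of that admission mutation is roughly equivalent to the following on proxy pods (an illustrative sketch, assuming the spec field `terminationGracePeriodSeconds` is what carries the graceful deletion period; this is not the webhook code itself).

```yaml
# Illustrative: relevant part of a proxy pod after the admission mutation
apiVersion: v1
kind: Pod
metadata:
  name: proxy-pod-example            # hypothetical
spec:
  terminationGracePeriodSeconds: 0   # assumed field; finalizers still delay deletion
                                     # until the delegate and its containers are gone
  containers:
    - name: app
      image: nginx
```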
v0.6.0
New Features
- Multicluster-scheduler is now a virtual-kubelet provider. Proxy pods are scheduled to a virtual node rather than actual nodes:
  - Their containers no longer need to be replaced by dummy signal traps, because they aren't run locally.
  - A proxy pod's status is simply the copy of its delegate's status. Therefore, it appears pending as long as its delegate is pending (fixes #7).
  - Finally, proxy pods no longer count toward the pod limit of actual nodes.
- Merge `cmd/agent` and `cmd/webhook`, and run virtual-kubelet as part of the same process. This reduces the number of Kubernetes deployments and Helm subcharts. If we need to scale them independently in the future, we can easily split them again.
Bugfixes
- Add missing post-delete Helm hook for scheduler resources, to delete finalizers on pod and service observations and decisions.
Internals
- Upgrade controller-runtime to v0.4.0, because v0.1.12 wasn't compatible with virtual-kubelet (their dependencies weren't). As a result, cert-manager is now a prerequisite, because certificate generation for webhooks has been removed as of controller-runtime v0.2.0.
v0.5.0
New Features
- Sensible defaults to make configuration easier:
  - `agent.remotes[0].secretName=remote` in Helm chart
  - clusterNamespace defaults to cluster name in Helm chart and `pkg/config`
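Expressed as a values fragment, the defaults above look roughly like this; the exact placement of clusterNamespace in the values is an assumption, and the cluster name is a placeholder.

```yaml
# Sketch of the v0.5.0 defaults described above (illustrative layout)
agent:
  remotes:
    - secretName: remote             # default secret name for the remote kubeconfig
      # clusterNamespace: cluster1   # defaults to the cluster name if omitted (placement assumed)
```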
Bugfixes
- Fix #13: Post-delete hook job ensures finalizers used for cross-cluster garbage collection are removed when multicluster-scheduler is uninstalled. Previously, those finalizers had to be removed by the user.
Internals
- Align chart (and subcharts) versioning with main project for simplicity, because they are released together.
Documentation
v0.4.0
New Features
- Helm charts (v3) for an easier and more flexible installation
- Multi-federation that works!
- Better RBAC with cluster namespaces: as an option, you can set up multicluster-scheduler so that each member cluster has a dedicated namespace in the scheduler cluster for observations and decisions. This makes it possible for partially trusted clusters to participate in the same federation (they can send pods to one another, via the scheduler, but they cannot observe one another). See the sketch after this list.
- More observations (to support non-basic schedulers)
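To make the cluster namespace option above concrete, the scheduler's cluster would hold one dedicated namespace per member cluster, each containing only that cluster's observations and decisions; the names below are illustrative (per a later release they default to the cluster names).

```yaml
# Illustrative: one namespace per member cluster in the scheduler's cluster
apiVersion: v1
kind: Namespace
metadata:
  name: cluster1        # hypothetical member cluster name
---
apiVersion: v1
kind: Namespace
metadata:
  name: cluster2        # hypothetical member cluster name
```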
Bugfixes
- Don't spam the log with update errors like "the object has been modified; please apply your changes to the latest version and try again". The controller would back off and retry, so the log was confusing. We now just ignore the error and let the cache enqueue a reconcile request when it receives the latest version (no back-off, but for that, the controller must watch the updated resource).
Internals
- Use `gc` pattern from multicluster-controller for cross-cluster/cross-namespace garbage collection with finalizers for `send`, `receive`, and `bind` controllers. (As a result, the local `EnqueueRequestForMulticlusterController` handler in `receive` was deleted. It now exists in multicluster-controller.)
- Split `bind` controller from `schedule` controller, to more easily plug in custom schedulers.
- `send` controller uses `unstructured` to support more observations.
- Switch to Go modules.
- Faster end-to-end tests with KIND (Kubernetes in Docker) rather than GKE.
- Stop using skaffold (build images once in `build.sh`) and kustomize (because we now use Helm).
- Split `delegatestate` controller (in scheduler manager) from `feedback` controller (in agent manager) to make the cluster namespace feature possible (where cluster1 cannot see observations from cluster2).
- The source cluster name of an observation is either its namespace (in cluster namespace mode) or the cross-cluster GC label `multicluster.admiralty.io/parent-clusterName`. The target cluster name of a decision is either its namespace (in cluster namespace mode) or the `multicluster.admiralty.io/clustername` annotation (added by `bind` and `globalsvc`). `status.liveState.metadata.ClusterName` is no longer used, except when it's backfilled in memory at the interface with the basic scheduler, which still uses the field internally.
v0.3.0
Multicluster-scheduler 0.3 adds integration with Cilium Cluster Mesh and global services.
Breaking Changes:
- The pod admission controller operates only in namespaces labeled with `multicluster-scheduler=enabled`.
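For example, a namespace opted in to the pod admission controller would be labeled like this (the namespace name is hypothetical):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-namespace                    # hypothetical
  labels:
    multicluster-scheduler: enabled     # required for the pod admission controller to operate here
```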
Improvements:
- d277a16 If a service's endpoints contain proxy pods, prefix its label selector keys with `multicluster.admiralty.io/` (to match the delegate pods instead) and annotate it with `io.cilium/global-service=true` (see the sketch after this list).
- d277a16 If a service is a Cilium global service, copy it to other clusters, via the scheduler, observations, and decisions.
- f97cda6 Add singular and short aliases for observation and decision categories, e.g., `kubectl get obs` gets all kinds of observations.
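A sketch of the resulting service for the first d277a16 change above; the service name, original selector, and port are hypothetical. A selector key such as `app` is rewritten to `multicluster.admiralty.io/app` so it matches the delegate pods, and the Cilium global-service annotation is added.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service                        # hypothetical
  annotations:
    io.cilium/global-service: "true"      # marks the service as a Cilium global service
spec:
  selector:
    multicluster.admiralty.io/app: nginx  # original key "app" prefixed to match delegate pods
  ports:
    - port: 80
```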
Fixes: