
Developer guide

Assumed background: Kubernetes controllers, operator-sdk or similar, custom resource definitions

Developer Setup

This is a Go project. To build and test this operator you also need the operator-sdk.

Use of Go modules, per the Go modules wiki, requires invoking Go outside of $GOPATH/src or setting export GO111MODULE=on.

See the Operator Framework Community Operators docs for instructions on submitting new versions to OperatorHub.io via pull request.

Concept map

The AkkaCluster operator is similar in concept to a Deployment controller, in that it watches a top-level resource and then drives changes down into sub-resources. So just as a Deployment drives changes into a ReplicaSet, an AkkaCluster drives changes into a Deployment, ServiceAccount, Role, and RoleBinding.

The spec of an AkkaCluster is just a Deployment spec, with a set of defaults that are used if certain fields are blank. In the main reconcile loop, the AkkaCluster resource is turned into an ideal specification for a set of sub-resources: the goal state. In-cluster resources are then compared to the goal state and created or updated as needed.
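
As a sketch, the reconcile loop boils down to the goal-state pattern below. This is illustrative only: buildGoalResources and keyOf are hypothetical stand-ins for logic that lives in deploy_builder.go and the controller, while SubsetEqual is the real helper from subset.go (all described under Code map).

```go
package akkacluster

import (
	"context"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"

	appv1alpha1 "github.com/lightbend/akka-cluster-operator/pkg/apis/app/v1alpha1"
)

// Reconcile drives in-cluster resources toward the goal state derived
// from the AkkaCluster spec. Sketch only, not the repo's actual code.
func (r *ReconcileAkkaCluster) Reconcile(request reconcile.Request) (reconcile.Result, error) {
	ctx := context.TODO()

	cluster := &appv1alpha1.AkkaCluster{}
	if err := r.client.Get(ctx, request.NamespacedName, cluster); err != nil {
		if apierrors.IsNotFound(err) {
			return reconcile.Result{}, nil // deleted; owned resources get garbage collected
		}
		return reconcile.Result{}, err
	}

	// Goal state: Deployment, ServiceAccount, Role, RoleBinding, each
	// owner-referenced back to the AkkaCluster.
	for _, goal := range buildGoalResources(cluster) {
		live := goal.DeepCopyObject()
		err := r.client.Get(ctx, keyOf(goal), live)
		switch {
		case apierrors.IsNotFound(err):
			err = r.client.Create(ctx, goal) // missing: create it
		case err == nil && !SubsetEqual(goal, live):
			err = r.client.Update(ctx, goal) // drifted: push the goal back in
		}
		if err != nil {
			return reconcile.Result{}, err
		}
	}
	return reconcile.Result{}, nil
}
```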

The status of an AkkaCluster is fed to the controller by a helper loop, run as an actor, which keeps a list of clusters, their leader endpoints, and their Akka Management endpoint results. When the status actor sees a change, it pushes a reconcile event to the main controller, where the change is picked up.

The above custom logic should hook fairly easily into any operator framework; nothing here particularly requires operator-sdk except where it is used to validate OLM artifacts. The heavy lifting outside of this logic is all done in Kubernetes client code. Since that area is a fast-moving target, the main thought was to use the native client libraries, which happen to be in Go, so this project is in Go too. It could all be written in Scala if one wanted to be adventurous.

Code map

The project skeleton was generated by operator-sdk.

serde in ./pkg/apis/app/v1alpha1/

akkacluster_types.go is the primary source for serialization / deserialization. It has Go structures with json tags for all the AkkaCluster spec and status bits. If you change things here, you must run "generate k8s" to regenerate all the various DeepCopy functions.

operator-sdk generate k8s
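
For orientation, here is a trimmed sketch of the shape of those types. The real field set is in akkacluster_types.go; per the concept map the spec is a Deployment spec, and the status fields shown here are illustrative stand-ins.

```go
package v1alpha1

import (
	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object

// AkkaCluster is the top-level custom resource. The json tags drive
// serialization; `operator-sdk generate k8s` regenerates the DeepCopy
// methods whenever these structures change.
type AkkaCluster struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   appsv1.DeploymentSpec `json:"spec,omitempty"`
	Status *AkkaClusterStatus    `json:"status,omitempty"`
}

// AkkaClusterStatus is an illustrative stand-in for the status fields
// populated from the Akka Management endpoints.
type AkkaClusterStatus struct {
	Leader  string   `json:"leader,omitempty"`
	Members []string `json:"members,omitempty"`
}
```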

controller in ./pkg/controller/akkacluster/

akkacluster_controller.go is the primary source for the controller; it is where Watch() is called to set up reconcile triggers and where Reconcile() is defined.
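
The wiring looks roughly like the standard operator-sdk skeleton of that era: one watch on the top-level resource, plus watches on owned sub-resources mapped back to their owner. A sketch in the controller-runtime style of that period, not a copy of the file:

```go
import (
	appsv1 "k8s.io/api/apps/v1"
	"sigs.k8s.io/controller-runtime/pkg/controller"
	"sigs.k8s.io/controller-runtime/pkg/handler"
	"sigs.k8s.io/controller-runtime/pkg/manager"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"
	"sigs.k8s.io/controller-runtime/pkg/source"

	appv1alpha1 "github.com/lightbend/akka-cluster-operator/pkg/apis/app/v1alpha1"
)

// add wires reconcile triggers: events on the AkkaCluster itself, plus
// events on owned sub-resources, enqueued against the owning AkkaCluster.
func add(mgr manager.Manager, r reconcile.Reconciler) error {
	c, err := controller.New("akkacluster-controller", mgr, controller.Options{Reconciler: r})
	if err != nil {
		return err
	}
	// Top-level resource: enqueue the changed AkkaCluster.
	if err := c.Watch(&source.Kind{Type: &appv1alpha1.AkkaCluster{}},
		&handler.EnqueueRequestForObject{}); err != nil {
		return err
	}
	// Owned sub-resource (repeat for ServiceAccount, Role, RoleBinding).
	return c.Watch(&source.Kind{Type: &appsv1.Deployment{}},
		&handler.EnqueueRequestForOwner{
			IsController: true,
			OwnerType:    &appv1alpha1.AkkaCluster{},
		})
}
```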

deploy_builder.go takes an AkkaCluster, fills in defaults, and returns a set of ideal resources.

subset.go is a generic SubsetEqual implementation, using reflection to support arbitrary Go structures. SubsetEqual(A, B) returns true if A is a subset of B. This is handy for comparing pristine ideal resources with mucked-about in-cluster resources, ignoring all the ephemera of live resources like timestamps, UIDs, and other run-time housekeeping.
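
The semantics, illustrated with plain maps (a hypothetical usage fragment, assuming fmt and SubsetEqual are in scope; not a test from the repo):

```go
goal := map[string]interface{}{"replicas": 3}
live := map[string]interface{}{
	"replicas":          3,
	"uid":               "d9b0c2a1",             // run-time housekeeping, ignored
	"creationTimestamp": "2019-01-01T00:00:00Z", // likewise ignored
}
fmt.Println(SubsetEqual(goal, live)) // true: everything in goal matches live
fmt.Println(SubsetEqual(live, goal)) // false: live has fields goal lacks
```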

status.go is an actor responsible for polling cluster leaders for Akka Management status. It provides status to Reconcile(), and triggers reconcile when it sees status changes. It bootstraps in a way very similar to Akka Management, in that it uses the AkkaCluster pod selector to list running pods and then starts talking to them to locate the leader.
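
In outline, the loop looks something like the sketch below. The names are illustrative (assuming time and reflect imports); podLister, urlReader, and the reconcile trigger are the seams that status_test.go mocks.

```go
// run polls each known cluster on a ticker, compares the freshly fetched
// management status against the cached one, and nudges the controller
// when something changed. Illustrative sketch of status.go, not its code.
func (s *statusActor) run(interval time.Duration) {
	tick := time.NewTicker(interval)
	defer tick.Stop()
	for range tick.C {
		for _, cluster := range s.clusters() {
			// Bootstrap like Akka Management: use the pod selector to
			// find running pods, then ask them who the leader is.
			pods := s.podLister.list(cluster.Spec.Selector)
			next := s.readManagementStatus(pods) // via the urlReader seam
			key := cluster.Namespace + "/" + cluster.Name
			if !reflect.DeepEqual(next, s.cache[key]) {
				s.cache[key] = next
				s.triggerReconcile(cluster) // push a reconcile event to the controller
			}
		}
	}
}
```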

deployment artifacts

./deploy/*.yaml holds the operator Deployment, ServiceAccount, Role, and RoleBinding. These were all generated by operator-sdk; nothing special here, just generic operator things.

./deploy/crds/app.lightbend.com_akkaclusters_crd.yaml has the custom resource definition. This is where new top-level fields and basic validation go, if you want kubectl to tell a valid AkkaCluster from an invalid one.

./deploy/olm-catalog/ has a nested OLM package, which should be updated on releases. The primary sources here are the "*clusterserviceversion.yaml" files, one for each version published. Certified versions of the CSVs follow the format -certified.*.clusterserviceversion.yaml. Certified container images, including the Operator, are published and distributed separately.

Unit testing

The controller has several unit tests that can be run with the usual go tools.

go test -race ./...

akkacluster_controller_test.go uses controller-runtime client mocks to emulate a Kubernetes environment behind the client interface.

deploy_builder_test.go uses a yamlizer and gold files: it takes input YAML, generates ideal resources, and compares them to output YAML files. Gold-file tests are a kind of regression test and may need to be updated if the schema or the deploy builder defaults change.

To update gold files, run tests with the -update flag:

go test -update ./...
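
The flag follows the common Go gold-file pattern. A generic sketch of the idea (not a copy of the repo's test code):

```go
package akkacluster

import (
	"bytes"
	"flag"
	"io/ioutil"
	"testing"
)

// -update rewrites gold files instead of comparing against them.
var update = flag.Bool("update", false, "rewrite gold files with current output")

// checkGolden compares got against the stored gold file, rewriting the
// file first when -update is passed.
func checkGolden(t *testing.T, goldPath string, got []byte) {
	t.Helper()
	if *update {
		if err := ioutil.WriteFile(goldPath, got, 0644); err != nil {
			t.Fatal(err)
		}
	}
	want, err := ioutil.ReadFile(goldPath)
	if err != nil {
		t.Fatal(err)
	}
	if !bytes.Equal(got, want) {
		t.Errorf("output differs from %s; rerun with -update if the change is intended", goldPath)
	}
}
```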

subset_test.go has a mix of whitebox, blackbox, edge, and gold-file tests. The whitebox tests just say whether the subset was found or not. The blackbox tests say how many nodes in the object tree compared equal, to ensure that the tree is walked completely. Edge tests include various empty comparisons, off-by-one comparisons, and checks that short-circuiting works on recursive objects. There is one gold-file test against a complex Deployment object found in OpenShift, with tons of extraneous fields to ignore.

status_test.go is a step toward property testing, with various generators for AkkaClusters. It also defines a mock urlReader and podLister so the test can feed the status actor arbitrary cluster results. It then runs a status actor in high-speed mode and verifies that status changes are correctly signaled back to the controller.

minikube loop

  • install operator-sdk
  • start minikube
  • install the CRD
  • route the pod network to your MacBook so the operator can query Akka Management endpoints:
sudo route -n add 172.17.0.0/16 $(minikube ip)

then loop on:

  • operator-sdk up local

and a demo app in minikube. This lets you run the operator locally, watch its logs, and watch mutations to resources within minikube.

Local build

While typically you would run locally using operator-sdk up local, you can also build the Docker image using operator-sdk.

operator-sdk build akkacluster-operator:latest

Note that you'll then need to modify the Deployment to point to your local image.

GitHub Actions

CI/CD is done via GitHub Actions, as seen in ./.github/. PRs must pass unit tests, and merges to master trigger goreportcard updates and Docker image builds that get pushed to the Lightbend registry.

On every pull request (pull_request.yml)

  • execute go test -race ./...

On every push to master (push.yml)

  • check out master
  • build artifact akkacluster-operator:latest
  • publish to the Lightbend registry
  • credentials are available in settings/secrets of the repo

The ./.github/actions/operator-sdk/ Docker image works off a pinned version of operator-sdk, so it will have to be updated regularly to keep up with changes.

Manual Installation

To install versions of the Operator other than those published on OperatorHub, one can:

Install the AkkaCluster Custom Resource Definition (CRD)

For Kubernetes to understand AkkaCluster resources, each cluster must have the CRD installed. This YAML specifies the schema of an AkkaCluster resource and registers it in the API layer so that kubectl, controllers, and other API clients can interact with AkkaCluster resources just like any other.

This CRD must be installed once for each cluster:

kubectl apply -f ./deploy/crds/app.lightbend.com_akkaclusters_crd.yaml

One way to test whether this worked is to run kubectl get akkaclusters. If the CRD is not yet installed, this returns an error explaining that the resource type is unknown.

$ kubectl get akkaclusters
error: the server doesn't have a resource type "akkaclusters"

If the CRD is installed, you should see a normal response, like "No resources found."

$ kubectl get akkaclusters
No resources found.

Install the controller

Once the CRD is installed, the run-time controller for that resource must be installed and running for Kubernetes to act on AkkaCluster resources. This controller watches for resource changes related to AkkaClusters, and reconciles what is running with what is wanted just like a Deployment controller.

This must be installed in each namespace where AkkaClusters are expected.

kubectl apply -f ./deploy

If this works, you should see an akka-cluster-operator deployment in the namespace.

OLM manifests

The OperatorHub dev site, https://dev.operatorhub.io/, includes beta CSV generation and validation tooling that helps developers manage the manifests.

The OLM manifest is a ClusterServiceVersion (CSV) YAML file. Broadly speaking, this is a collection of (1) marketing material describing the catalog web page content and (2) an installation specification corresponding to a version of the operator. It may be desirable to build a release template for this manifest, as it will need to be copied and slightly altered for each release. In practice, it has been easier to regenerate the CSV using the OperatorHub Beta Package tool.

The spec.install section is essentially a copy of the operator Deployment, so changes to the deploy resources will need to be reflected here.

The spec.customresourcedefinitions section is a mix of marketing material and validation collateral. The list of resources here corresponds to the alm-examples entries, and the description and displayName are published in the web page's Example box. The resources, specDescriptors, and statusDescriptors are used by operator-sdk scorecard, but it isn't clear how else they are used.

Most of the rest is marketing material and can be previewed with the OperatorHub preview tool (see OLM testing below). The main content is spec.description, which is in Markdown format and can include links, images, etc. metadata.description is the headline blurb shown in several places and should be kept quite short.

Release versions will touch several fields, including

  • metadata.annotations.containerImage
  • metadata.name
  • spec.install...image
  • spec.version

OLM testing

For a manual preview of the marketing material as it might appear on OperatorHub, you can use https://operatorhub.io/preview.

Then to validate the OLM manifest without running a Kubernetes cluster, one can use operator-courier like so:

operator-courier --verbose verify ./deploy/olm-catalog/akka-cluster-operator

Given a working Kubernetes cluster (minikube included), one can run the scorecard tests. With operator-sdk v0.14.1, scorecard is driven by a config file, .osdk-scorecard.yaml. Enable Go modules before running:

export GO111MODULE=on
operator-sdk scorecard