This tutorial is for developers with a Java background who want to write their first Kubernetes operator quickly. Why operators? There are several advantages:
I will keep theory to a minimum and show a foolproof recipe for how to “bake a cake”. I chose Java because it is close to my working experience and, to be honest, I find it easier than Go (though some may disagree).
Let's jump straight in.
Nobody likes reading lengthy documentation, but let’s get this quickly off our chest, shall we?
What is a pod?
A pod is a group of containers that share storage and a network interface (each pod gets its own unique IP address).
What is a replica set?
A replica set creates and deletes pods so that the specified number of pods built from a given template is running at all times.
What is a deployment?
A deployment owns a replica set and, indirectly, its pods. When you create a deployment the pods are created; when you delete it they are gone.
What is a service?
A service is a SINGLE network endpoint for a group of pods, and it spreads the load among them. You can expose it so that it is visible from outside the cluster. It also automates the creation of endpoint slices.
The problem with Kubernetes is that from its inception it was designed around stateless workloads. Replica sets don’t track the identities of pods: when a particular pod is gone, a new one is simply created. Some use cases, such as databases and cache clusters, need state. Stateful sets only partially mitigate the problem.
This is why people started writing operators: to take the burden of maintenance off their shoulders. I won't go into the depths of the pattern and the various SDKs; you can start from here.
Everything that works in Kubernetes, every tiny gear of the machinery, is based on the simple concept of a control loop. For a particular resource type, the control loop compares what is with what should be (as defined in the manifest). If there is a mismatch, it performs some actions to fix it. This is called reconciliation.
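To make the idea concrete, here is a minimal sketch of a single reconciliation pass in plain Java. It is not Kubernetes code: the class name ControlLoopSketch, the reconcile method and the string "versions" are all made up for illustration. Desired and actual state are modelled as simple maps, and the result is the list of actions a controller would have to take.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ControlLoopSketch {

    /** Compares desired and actual state and returns the actions needed to converge. */
    public static List<String> reconcile(Map<String, String> desired, Map<String, String> actual) {
        List<String> actions = new ArrayList<>();
        // TreeMap gives a stable, sorted iteration order for the example output.
        for (var entry : new TreeMap<>(desired).entrySet()) {
            String current = actual.get(entry.getKey());
            if (current == null) {
                actions.add("create " + entry.getKey());
            } else if (!current.equals(entry.getValue())) {
                actions.add("update " + entry.getKey());
            }
        }
        // Anything that exists but is no longer desired gets removed.
        for (String name : new TreeMap<>(actual).keySet()) {
            if (!desired.containsKey(name)) {
                actions.add("delete " + name);
            }
        }
        return actions;
    }

    public static void main(String[] args) {
        Map<String, String> desired = Map.of("configmap", "v2", "deployment", "v1");
        Map<String, String> actual = Map.of("configmap", "v1", "orphan", "v1");
        // prints [update configmap, create deployment, delete orphan]
        System.out.println(reconcile(desired, actual));
    }
}
```

A real controller runs a pass like this over and over, and an empty action list means the cluster already matches the manifest.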
Operators are the same concept applied to custom resources. Custom resources are the means of extending the Kubernetes API with resource types that you define yourself. Once you register a CRD (custom resource definition) in Kubernetes, all the usual actions such as get, list, update and delete become possible on that resource. And what will do the actual work? That's right: our operator.
As is typical when trying a technology for the first time, you pick the most basic problem you can. Because the concept is fairly complex, the hello world in this case will be a little long. In most of the sources I have seen, the simplest use case is serving static pages.
So the project looks like this: we will define a custom resource that represents two pages we want to serve. After that resource is applied, the operator will automatically set up a serving application written in Spring Boot, create a config map with the page content, mount the config map as a volume in the app's pod and set up a service for that pod. The fun part is that if we modify the resource, the operator rebinds everything on the fly and the page changes become visible within moments. The second fun part is that if we delete the resource, everything is deleted with it, leaving our cluster clean.
Serving Java app
This will be a really simple static page server in Spring Boot. You only need spring-boot-starter-web, so go to Spring Initializr and pick:
The app is just this:
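import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
@RestController
public class WebpageServingApplication {

    // Serves the file named by the path variable from the /static directory.
    @GetMapping(value = "/{page}", produces = "text/html")
    public String page(@PathVariable String page) throws IOException {
        return Files.readString(Path.of("/static/" + page));
    }

    public static void main(String[] args) {
        SpringApplication.run(WebpageServingApplication.class, args);
    }
}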
Whatever we pass as the path variable is fetched from the /static directory (in our case page1 and page2). The /static directory will be mounted from a config map, but more on that later.
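If it is not obvious why a config map can stand in for a directory of pages: when a ConfigMap is mounted as a volume, each key shows up as a file named after the key, with the value as its content. The sketch below imitates that on the local file system so you can try the lookup without a cluster. The class MountedPagesSketch and both of its methods are hypothetical helpers, and a temporary directory stands in for /static.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

public class MountedPagesSketch {

    /** Imitates a ConfigMap volume: every key becomes a file whose content is the value. */
    public static void writeLikeConfigMapVolume(Map<String, String> data, Path mountDir) throws IOException {
        Files.createDirectories(mountDir);
        for (var entry : data.entrySet()) {
            Files.writeString(mountDir.resolve(entry.getKey()), entry.getValue());
        }
    }

    /** The same kind of lookup the controller performs, with the directory made configurable. */
    public static String readPage(Path mountDir, String page) throws IOException {
        return Files.readString(mountDir.resolve(page));
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("static");
        writeLikeConfigMapVolume(Map.of("page1", "Hola amigos!", "page2", "Hello my friend"), dir);
        System.out.println(readPage(dir, "page1")); // prints "Hola amigos!"
    }
}
```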
So now we have to build a native image and push it to the remote repository.
Tip number 1
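<plugin>
    <groupId>org.graalvm.buildtools</groupId>
    <artifactId>native-maven-plugin</artifactId>
    <configuration>
        <buildArgs>
            <buildArg>-Ob</buildArg>
        </buildArgs>
    </configuration>
</plugin>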
Configuring GraalVM like this gives you the fastest build with the lowest memory consumption (around 2GB). For me it was a must, as I only have 16GB of memory and lots of stuff installed.
Tip number 2
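<plugin>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-maven-plugin</artifactId>
    <configuration>
        <image>
            <publish>true</publish>
            <builder>paketobuildpacks/builder-jammy-full:latest</builder>
            <name>ghcr.io/dgawlik/webpage-serving:1.0.5</name>
            <env>
                <BP_JVM_VERSION>21</BP_JVM_VERSION>
            </env>
        </image>
        <docker>
            <publishRegistry>
                <url>https://ghcr.io/dgawlik</url>
                <username>dgawlik</username>
                <password>${env.GITHUB_TOKEN}</password>
            </publishRegistry>
        </docker>
    </configuration>
</plugin>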
So now you do:
mvn spring-boot:build-image
And that’s it.
Operator with Fabric8
Now the fun starts. First you will need this in your pom:
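<dependency>
    <groupId>io.fabric8</groupId>
    <artifactId>kubernetes-client</artifactId>
    <version>6.13.4</version>
</dependency>
<dependency>
    <groupId>io.fabric8</groupId>
    <artifactId>crd-generator-apt</artifactId>
    <version>6.13.4</version>
    <scope>provided</scope>
</dependency>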
crd-generator-apt is an annotation processor that scans the project, detects CRD POJOs and generates the manifest.
Since I mentioned them, here are the resource classes:
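import io.fabric8.kubernetes.api.model.Namespaced;
import io.fabric8.kubernetes.client.CustomResource;
import io.fabric8.kubernetes.model.annotation.Group;
import io.fabric8.kubernetes.model.annotation.ShortNames;
import io.fabric8.kubernetes.model.annotation.Version;

// Each type below lives in its own source file.
@Group("com.github.webserving")
@Version("v1alpha1")
@ShortNames("websrv")
public class WebServingResource
        extends CustomResource<WebServingSpec, WebServingStatus>
        implements Namespaced { }

public record WebServingSpec(String page1, String page2) { }

public record WebServingStatus(String status) { }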
What most resource manifests in Kubernetes have in common is a spec and a status. Here the spec consists of two pages, pasted in as heredoc-style multi-line text. The proper way to handle things would be to update the status to reflect whatever the operator is doing. For example, while it waits for the deployment to finish it would set status = “Processing”, and once everything is done it would patch the status to “Ready”, and so on. We will skip that because this is just a simple demo.
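If you want to try the status handling yourself, the decision part can be as small as the sketch below. It only derives the status value from replica counts and leaves the actual patching to you. The class StatusSketch and the method statusFor are hypothetical, the record is copied here so the example compiles on its own, and the assumption is that "ready" means at least the desired number of ready replicas.

```java
public class StatusSketch {

    /** Mirrors the WebServingStatus record from the tutorial. */
    public record WebServingStatus(String status) { }

    /** Derives a status from replica counts, using the "Processing" and "Ready" values from the text. */
    public static WebServingStatus statusFor(int desiredReplicas, Integer readyReplicas) {
        // The ready count may be absent (null) while no pod is ready yet, so treat that as zero.
        int ready = readyReplicas == null ? 0 : readyReplicas;
        return new WebServingStatus(ready >= desiredReplicas ? "Ready" : "Processing");
    }

    public static void main(String[] args) {
        System.out.println(statusFor(1, null).status()); // prints "Processing"
        System.out.println(statusFor(1, 1).status());    // prints "Ready"
    }
}
```

In the operator you would feed it the replica counts read from the Deployment and write the result back to the custom resource's status.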
The good news is that the operator's logic lives entirely in the main class and is really short. Here it is, step by step:
// 'executor' and 'changes' (the lock object) are declared elsewhere in the main class.
KubernetesClient client = new KubernetesClientBuilder()
        .withTaskExecutor(executor).build();

var crdClient = client.resources(WebServingResource.class)
        .inNamespace("default");

// Every event, whatever its source, just wakes up the reconcile loop.
var handler = new GenericResourceEventHandler<>(update -> {
    synchronized (changes) {
        changes.notifyAll();
    }
});

crdClient.inform(handler).start();

client.apps().deployments().inNamespace("default")
        .withName("web-serving-app-deployment").inform(handler).start();

client.services().inNamespace("default")
        .withName("web-serving-app-svc").inform(handler).start();

client.configMaps().inNamespace("default")
        .withName("web-serving-app-config").inform(handler).start();
The heart of the program is of course the Fabric8 Kubernetes client built in the first line. It is convenient to customize it with your own executor. I used the famous virtual threads, so while a task waits on blocking I/O, Java suspends it and carries on with other work, such as the main loop.
Now here is the new part. The most basic version would be to run the loop forever with a Thread.sleep(1000) or so inside it. But there is a cleverer way: Kubernetes informers. An informer keeps a connection open to the Kubernetes API server (a websocket in this case) and notifies the client each time a subscribed resource changes. There is more to it that you can read about on the internet, for example how to use the various caches that fetch updates in batches. Here, though, the code just subscribes directly to each resource. The handler interface is a little bloated, so I wrote a helper class, GenericResourceEventHandler.
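import java.util.function.Consumer;

import io.fabric8.kubernetes.client.informers.ResourceEventHandler;

// Funnels add, update and delete events into a single callback.
public class GenericResourceEventHandler<T> implements ResourceEventHandler<T> {

    private final Consumer<T> handler;

    public GenericResourceEventHandler(Consumer<T> handler) {
        this.handler = handler;
    }

    @Override
    public void onAdd(T obj) {
        this.handler.accept(obj);
    }

    @Override
    public void onUpdate(T oldObj, T newObj) {
        this.handler.accept(newObj);
    }

    @Override
    public void onDelete(T obj, boolean deletedFinalStateUnknown) {
        // A deletion is reported to the callback as null.
        this.handler.accept(null);
    }
}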
Since all we need in every case is to wake up the loop, we pass it one generic lambda. The idea is that the loop waits on a lock object at the end of each iteration, and the informer callback notifies it each time a change is detected.
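One thing to be aware of with plain wait and notifyAll: a notification that arrives while the loop is busy reconciling (and therefore not waiting) is lost. A common remedy is to pair the lock with a flag. The sketch below shows that variation in isolation. It is not what the tutorial code does, and the class WakeUpSketch with its signal and awaitChange methods is made up for illustration.

```java
public class WakeUpSketch {

    private final Object changes = new Object();
    private boolean pending = false; // guarded by 'changes'

    /** Called from an informer callback: records the event and wakes the loop. */
    public void signal() {
        synchronized (changes) {
            pending = true;
            changes.notifyAll();
        }
    }

    /** Called at the end of each loop iteration: blocks until at least one event has arrived. */
    public void awaitChange() throws InterruptedException {
        synchronized (changes) {
            while (!pending) { // the loop also protects against spurious wake-ups
                changes.wait();
            }
            pending = false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        var wakeUp = new WakeUpSketch();
        wakeUp.signal();      // the event arrives before the loop starts waiting
        wakeUp.awaitChange(); // returns immediately instead of blocking forever
        System.out.println("woke up"); // prints "woke up"
    }
}
```

With the flag in place, several events that arrive during one iteration collapse into a single extra pass, which is all a reconcile loop needs.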
Next:
for (; ; ) {
    var crdList = crdClient.list().getItems();
    var crd = Optional.ofNullable(crdList.isEmpty() ? null : crdList.get(0));

    var skipUpdate = false;
    var reload = false;

    if (!crd.isPresent()) {
        System.out.println("No WebServingResource found, reconciling disabled");
        currentCrd = null;
        skipUpdate = true;
    } else if (!crd.get().getSpec().equals(
            Optional.ofNullable(currentCrd)
                    .map(WebServingResource::getSpec).orElse(null))) {
        currentCrd = crd.orElse(null);
        System.out.println("Crd changed, Reconciling ConfigMap");
        reload = true;
    }
    // the loop body continues below
If there is no custom resource, there is nothing to do. If it has changed, we have to reload everything.
    var currentConfigMap = client.configMaps().inNamespace("default")
            .withName("web-serving-app-config").get();

    // Compare only the data: the metadata always differs once the API server has stored the object.
    if (!skipUpdate && (reload || !desiredConfigMap(currentCrd).getData().equals(
            Optional.ofNullable(currentConfigMap).map(ConfigMap::getData).orElse(null)))) {
        System.out.println("New configmap, reconciling WebServingResource");
        client.configMaps().inNamespace("default").withName("web-serving-app-config")
                .createOrReplace(desiredConfigMap(currentCrd));
        reload = true;
    }
This handles the case where the ConfigMap is changed between iterations. Since it is mounted in the pod, we then have to reload the deployment.
    var currentServingDeploymentNullable = client.apps().deployments().inNamespace("default")
            .withName("web-serving-app-deployment").get();
    var currentServingDeployment = Optional.ofNullable(currentServingDeploymentNullable);

    if (!skipUpdate && (reload || !desiredWebServingDeployment(currentCrd).getSpec().equals(
            currentServingDeployment.map(Deployment::getSpec).orElse(null)))) {
        System.out.println("Reconciling Deployment");
        client.apps().deployments().inNamespace("default").withName("web-serving-app-deployment")
                .createOrReplace(desiredWebServingDeployment(currentCrd));
    }

    var currentServingServiceNullable = client.services().inNamespace("default")
            .withName("web-serving-app-svc").get();
    var currentServingService = Optional.ofNullable(currentServingServiceNullable);

    if (!skipUpdate && (reload || !desiredWebServingService(currentCrd).getSpec().equals(
            currentServingService.map(Service::getSpec).orElse(null)))) {
        System.out.println("Reconciling Service");
        client.services().inNamespace("default").withName("web-serving-app-svc")
                .createOrReplace(desiredWebServingService(currentCrd));
    }
If the service or the deployment differs from the desired defaults, we replace it with the defaults.
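A caveat worth knowing when you compare whole specs: the API server fills in defaults for fields you did not set, so the stored spec rarely equals the one you built, and the comparison keeps reporting a difference. One possible approach is to check only the fields the operator actually sets. The sketch below shows that idea on plain maps. It is an illustration only, not the tutorial's code, and the class DriftSketch, the method driftedFields and the field names are all hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

public class DriftSketch {

    /** Returns the names of desired fields whose actual value differs; extra actual fields are ignored. */
    public static List<String> driftedFields(Map<String, String> desired, Map<String, String> actual) {
        return new TreeMap<>(desired).entrySet().stream()
                .filter(e -> !Objects.equals(e.getValue(), actual.get(e.getKey())))
                .map(Map.Entry::getKey)
                .toList();
    }

    public static void main(String[] args) {
        Map<String, String> desired = Map.of(
                "image", "webpage-serving:1.0.5",
                "replicas", "1");
        Map<String, String> actual = Map.of(
                "image", "webpage-serving:1.0.4",
                "replicas", "1",
                "strategy", "RollingUpdate"); // stands in for a server-side default
        System.out.println(driftedFields(desired, actual)); // prints [image]
    }
}
```

An empty result means nothing the operator cares about has drifted, so the replace call can be skipped.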
    // Sleep until an informer callback signals a change.
    synchronized (changes) {
        changes.wait();
    }
} // end of the reconcile loop
Then comes the aforementioned lock.
Now the only thing left is to define the desired config map, service and deployment.
private static Deployment desiredWebServingDeployment(WebServingResource crd) {
    return new DeploymentBuilder()
            .withNewMetadata()
                .withName("web-serving-app-deployment")
                .withNamespace("default")
                .addToLabels("app", "web-serving-app")
                .withOwnerReferences(createOwnerReference(crd))
            .endMetadata()
            .withNewSpec()
                .withReplicas(1)
                .withNewSelector()
                    .addToMatchLabels("app", "web-serving-app")
                .endSelector()
                .withNewTemplate()
                    .withNewMetadata()
                        .addToLabels("app", "web-serving-app")
                    .endMetadata()
                    .withNewSpec()
                        .addNewContainer()
                            .withName("web-serving-app-container")
                            .withImage("ghcr.io/dgawlik/webpage-serving:1.0.5")
                            // The pages from the config map appear as files under /static.
                            .withVolumeMounts(new VolumeMountBuilder()
                                    .withName("web-serving-app-config")
                                    .withMountPath("/static")
                                    .build())
                            .addNewPort()
                                .withContainerPort(8080)
                            .endPort()
                        .endContainer()
                        .withVolumes(new VolumeBuilder()
                                .withName("web-serving-app-config")
                                .withConfigMap(new ConfigMapVolumeSourceBuilder()
                                        .withName("web-serving-app-config")
                                        .build())
                                .build())
                        .withImagePullSecrets(new LocalObjectReferenceBuilder()
                                .withName("regcred").build())
                    .endSpec()
                .endTemplate()
            .endSpec()
            .build();
}

private static Service desiredWebServingService(WebServingResource crd) {
    return new ServiceBuilder()
            .withNewMetadata()
                .withName("web-serving-app-svc")
                .withOwnerReferences(createOwnerReference(crd))
                .withNamespace(crd.getMetadata().getNamespace())
            .endMetadata()
            .withNewSpec()
                .addNewPort()
                    .withPort(8080)
                    .withTargetPort(new IntOrString(8080))
                .endPort()
                .addToSelector("app", "web-serving-app")
            .endSpec()
            .build();
}

private static ConfigMap desiredConfigMap(WebServingResource crd) {
    return new ConfigMapBuilder()
            .withMetadata(
                    new ObjectMetaBuilder()
                            .withName("web-serving-app-config")
                            .withNamespace(crd.getMetadata().getNamespace())
                            .withOwnerReferences(createOwnerReference(crd))
                            .build())
            .withData(Map.of("page1", crd.getSpec().page1(),
                    "page2", crd.getSpec().page2()))
            .build();
}

// Marks the custom resource as the owner of every object the operator creates.
private static OwnerReference createOwnerReference(WebServingResource crd) {
    return new OwnerReferenceBuilder()
            .withApiVersion(crd.getApiVersion())
            .withKind(crd.getKind())
            .withName(crd.getMetadata().getName())
            .withUid(crd.getMetadata().getUid())
            .withController(true)
            .build();
}
The magic of the OwnerReference is that it marks which resource is the parent. Whenever you delete the parent, k8s automatically deletes all the dependent resources.
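To see why a single delete cleans up everything, here is a toy model of the cascade in plain Java. It only imitates the idea; the real work is done by the Kubernetes garbage collector. The class OwnerCascadeSketch, the Resource record and the uids are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class OwnerCascadeSketch {

    /** A toy resource: ownerUid is null for top-level objects. */
    public record Resource(String uid, String name, String ownerUid) { }

    /** Removes the resource with the given uid and, transitively, everything that names it as owner. */
    public static List<Resource> deleteWithDependents(List<Resource> cluster, String uid) {
        List<Resource> remaining = new ArrayList<>(cluster);
        List<String> doomed = new ArrayList<>(List.of(uid));
        while (!doomed.isEmpty()) {
            String current = doomed.remove(0);
            for (Resource r : List.copyOf(remaining)) {
                if (r.uid().equals(current)) {
                    remaining.remove(r);          // the object itself
                } else if (current.equals(r.ownerUid())) {
                    doomed.add(r.uid());          // a dependent, deleted in a later round
                }
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        List<Resource> cluster = List.of(
                new Resource("u1", "example-ws", null),
                new Resource("u2", "web-serving-app-config", "u1"),
                new Resource("u3", "web-serving-app-deployment", "u1"),
                new Resource("u4", "replica-set", "u3"),
                new Resource("u5", "unrelated", null));
        List<String> names = deleteWithDependents(cluster, "u1").stream().map(Resource::name).toList();
        System.out.println(names); // prints [unrelated]
    }
}
```

Note how the replica set disappears too, even though it is owned by the deployment rather than by the custom resource directly.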
But you can’t run it yet. First you need Docker registry credentials in Kubernetes:
kubectl delete secret regcred
kubectl create secret docker-registry regcred \
  --docker-server=ghcr.io \
  --docker-username=dgawlik \
  --docker-password=$GITHUB_TOKEN
Run this script once. Then we also need to set up the ingress:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo-ingress
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-serving-app-svc
                port:
                  number: 8080
The workflow
So first you build the operator project. Then you take target/classes/META-INF/fabric8/webservingresources.com.github.webserving-v1.yml and apply it. From then on, Kubernetes is ready to accept your custom resource. Here it is:
apiVersion: com.github.webserving/v1alpha1
kind: WebServingResource
metadata:
  name: example-ws
  namespace: default
spec:
  page1: |
    Hola amigos!
    Buenos dias!
  page2: |
    Hello my friend
    Good evening
Apply the custom resource with kubectl apply -f src/main/resources/crd-instance.yaml, and then run Main of the operator.
Then watch the pod until it is up. Next, get the IP of the cluster:
minikube ip
In your browser, navigate to /page1 and /page2 at that address.
Then try changing the custom resource and applying it again. After a second you should see the changes.
The end.
An observant reader will notice that the code has some concurrency issues. A lot can happen between the start and the end of the loop. But there are many cases to consider and I tried to keep it simple. You can tackle them as a follow-up exercise.
Likewise for the deployment. Instead of running the operator in the IDE, you can build its image the same way as for the serving app and write a deployment for it. That is basically the demystification of the operator: it is just a pod like any other.
I hope you found it useful.
Thanks for reading.
I almost forgot - here is the repo:
https://github.com/dgawlik/operator-hello-world