Conversation

@andreaskaris andreaskaris commented Sep 3, 2025

Server-Side Apply

My investigation into Server-Side Apply:
https://andreaskaris.github.io/blog/coding/server-side-apply/

  • Server-Side Apply simplifies controller logic by using a single approach for both creating and updating resources. Instead of checking if an object exists and then choosing between Create or Update operations, controllers can use the Patch method with client.Apply for all cases.

  • This approach works particularly well for reconstructive controllers that want to declare the desired state of resources. The controller builds the complete object definition and lets the Kubernetes API server handle the differences. Field management ensures that conflicts are detected and ownership is tracked properly.

  • https://kubernetes.io/blog/2022/10/20/advanced-server-side-apply/#reconstructive-controllers: Reconstructive controllers: "This kind of controller wasn't really possible prior to SSA. The idea here is to (whenever something changes etc) reconstruct from scratch the fields of the object as the controller wishes them to be, and then apply the change to the server, letting it figure out the result. I now recommend that new controllers start out this way–it's less fiddly to say what you want an object to look like than it is to say how you want it to change. (...) To get around this downside, why not GET the object and only send your apply if the object needs it? Surprisingly, it doesn't help much – a no-op apply is not very much more work for the API server than an extra GET; and an apply that changes things is cheaper than that same apply with a preceding GET. Worse, since it is a distributed system, something could change between your GET and apply, invalidating your computation. Instead, you can use this optimization on an object retrieved from a cache–then it legitimately will reduce load on the system (at the cost of a delay when a change is needed and the cache is a bit behind)"

Get requests are served from the controller-runtime cache, so we still benefit from the caching mechanism by running r.Get first and comparing existing to desired. If existing is not found in the cache, or if the cached version of existing differs from desired, we build an SSA patch and send it to the server.

We keep comparing against and sending our full intent, meaning the full object, via SSA. Partial updates to resources with Server-Side Apply make little sense in my opinion, as we want to declare the full state of each object and we want to catch any deviations.
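As an illustration, here is a minimal sketch of this pattern. The receiver type, the matchesDesired helper (comparing only the fields we own), and the field-owner string are assumptions for the sketch, not this PR's exact code:

```go
import (
	"context"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// assureResource declares the full desired state of a single owned object.
// The r.Get below is served from the informer cache, so skipping the patch
// when nothing deviates reduces load on the API server.
func (r *ConfigReconciler) assureResource(ctx context.Context, desired client.Object) error {
	existing := desired.DeepCopyObject().(client.Object)
	err := r.Get(ctx, client.ObjectKeyFromObject(desired), existing)
	if err != nil && !apierrors.IsNotFound(err) {
		return err
	}
	// matchesDesired is a hypothetical helper that compares only the
	// fields this controller owns.
	if err == nil && matchesDesired(existing, desired) {
		return nil // the cache already reflects our full intent
	}
	// desired must carry TypeMeta (apiVersion/kind) for SSA to work; the
	// field owner string is illustrative.
	return r.Patch(ctx, desired, client.Apply,
		client.ForceOwnership, client.FieldOwner("bpfman-operator"))
}
```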

Status of Server-Side Apply native support in controller-runtime

Native support for Server-Side Apply in controller-runtime is still a work in progress, although it appears to be nearly complete.
Up to and including controller-runtime v0.21.0, r.Apply() was not available; instead, the recommendation was to use r.Patch:
https://pkg.go.dev/sigs.k8s.io/[email protected]/pkg/client#Client
Users of controller-runtime would therefore use constructs like:

	if err := r.Patch(ctx, resource, client.Apply, client.ForceOwnership, client.FieldOwner(bpfmanConfig.Name)); err != nil {
		return ctrl.Result{}, err
	}

A problem with this approach is that the fake client used in unit tests does not support Server-Side Apply, so an interceptor for Patch requests is needed.
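For illustration, a minimal sketch of such an interceptor using controller-runtime's fake and interceptor packages; the helper name and the create-or-update emulation details are mine, not necessarily this PR's exact code:

```go
import (
	"context"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/client/fake"
	"sigs.k8s.io/controller-runtime/pkg/client/interceptor"
)

// newFakeClientWithSSA builds a fake client whose Patch call emulates
// Server-Side Apply as a create-or-update, since the fake client does not
// implement apply patches itself.
func newFakeClientWithSSA(objs ...client.Object) client.WithWatch {
	return fake.NewClientBuilder().
		WithObjects(objs...).
		WithInterceptorFuncs(interceptor.Funcs{
			Patch: func(ctx context.Context, c client.WithWatch, obj client.Object, patch client.Patch, opts ...client.PatchOption) error {
				// Pass every non-apply patch through unchanged.
				if patch.Type() != types.ApplyPatchType {
					return c.Patch(ctx, obj, patch, opts...)
				}
				// Emulate apply semantics: create when absent,
				// otherwise overwrite with the desired state.
				existing := obj.DeepCopyObject().(client.Object)
				if err := c.Get(ctx, client.ObjectKeyFromObject(obj), existing); err != nil {
					if apierrors.IsNotFound(err) {
						return c.Create(ctx, obj)
					}
					return err
				}
				obj.SetResourceVersion(existing.GetResourceVersion())
				return c.Update(ctx, obj)
			},
		}).
		Build()
}
```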

r.Apply() was introduced with controller-runtime v0.22.0, which has not yet been included in the operator-sdk at the time of this writing.
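Once v0.22.0 is available, the call site should simplify to something along these lines. I am sketching from the v0.22 release notes here; the exact r.Apply signature and options (it takes an apply configuration rather than a typed object) should be verified against the v0.22 docs:

```go
// corev1ac is "k8s.io/client-go/applyconfigurations/core/v1"; the r.Apply
// call below is an assumed shape of the v0.22+ API, not verified code.
// Apply configurations only carry the fields we set, which matches SSA's
// field-ownership model.
cm := corev1ac.ConfigMap("bpfman-config", "bpfman").
	WithData(map[string]string{"example.key": "example-value"})
if err := r.Apply(ctx, cm, client.FieldOwner("bpfman-operator")); err != nil {
	return ctrl.Result{}, err
}
```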

Improved reconciliation logic

I decided to keep the reconciliation logic as is, meaning that whenever any watched object is created, changed, or deleted, we run through the entire reconciliation logic. As we are using the controller-runtime's cache (which in turn uses Kubernetes informers), load on the API server will be fairly minimal and limited to the actual r.Patch requests only (gets are served from the cache, and we only patch when we need something new or when something deviates).

Each r.Patch against any of the owned resources will always trigger another reconciliation run, which checks all owned resources.
Especially on initialization, when none of the resources exist yet (the worst-case scenario), this leads to several reconciliation runs that might seem unnecessary.
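For context, the re-triggering falls out of the owner-based watches. Schematically (type names and the API-group import path are assumptions, not copied from this PR):

```go
import (
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	storagev1 "k8s.io/api/storage/v1"
	ctrl "sigs.k8s.io/controller-runtime"

	bpfmaniov1alpha1 "github.com/bpfman/bpfman-operator/apis/v1alpha1" // assumed import path
)

// Because the controller Owns() the resources it patches, every successful
// r.Patch produces a watch event on an owned object, which enqueues another
// reconcile of the whole Config.
func (r *ConfigReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&bpfmaniov1alpha1.Config{}). // assumed primary resource type
		Owns(&corev1.ConfigMap{}).
		Owns(&storagev1.CSIDriver{}).
		Owns(&appsv1.DaemonSet{}).
		Complete(r)
}
```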

  • run 1 creates all objects
{"level":"info","ts":"2025-09-14T13:03:24Z","logger":"Config","msg":"Running the reconciler"}
{"level":"info","ts":"2025-09-14T13:03:24Z","logger":"Config","msg":"Adding finalizer to Config","name":"bpfman-config"}
{"level":"info","ts":"2025-09-14T13:03:24Z","logger":"Config","msg":"Running the reconciler"}
{"level":"info","ts":"2025-09-14T13:03:24Z","logger":"Config","msg":"Getting object","type":"&TypeMeta{Kind:ConfigMap,APIVersion:v1,}","namespace":"bpfman","name":"bpfman-config"}
{"level":"info","ts":"2025-09-14T13:03:24Z","logger":"Config","msg":"Patching object","type":"&TypeMeta{Kind:ConfigMap,APIVersion:v1,}","namespace":"bpfman","name":"bpfman-config"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Loading object","object":"csi.bpfman.io","path":"./config/bpfman-deployment/csidriverinfo.yaml"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Getting object","type":"&TypeMeta{Kind:CSIDriver,APIVersion:storage.k8s.io/v1,}","namespace":"","name":"csi.bpfman.io"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Patching object","type":"&TypeMeta{Kind:CSIDriver,APIVersion:storage.k8s.io/v1,}","namespace":"","name":"csi.bpfman.io"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Loading object","object":"bpfman-daemon","path":"./config/bpfman-deployment/daemonset.yaml"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Getting object","type":"&TypeMeta{Kind:DaemonSet,APIVersion:apps/v1,}","namespace":"bpfman","name":"bpfman-daemon"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Patching object","type":"&TypeMeta{Kind:DaemonSet,APIVersion:apps/v1,}","namespace":"bpfman","name":"bpfman-daemon"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Loading object","object":"bpfman-metrics-proxy","path":"./config/bpfman-deployment/metrics-proxy-daemonset.yaml"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Getting object","type":"&TypeMeta{Kind:DaemonSet,APIVersion:apps/v1,}","namespace":"bpfman","name":"bpfman-metrics-proxy"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Patching object","type":"&TypeMeta{Kind:DaemonSet,APIVersion:apps/v1,}","namespace":"bpfman","name":"bpfman-metrics-proxy"}
  • each created object triggers another full reconcile (a no-op against the API server)
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Running the reconciler"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Getting object","type":"&TypeMeta{Kind:ConfigMap,APIVersion:v1,}","namespace":"bpfman","name":"bpfman-config"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Loading object","object":"csi.bpfman.io","path":"./config/bpfman-deployment/csidriverinfo.yaml"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Getting object","type":"&TypeMeta{Kind:CSIDriver,APIVersion:storage.k8s.io/v1,}","namespace":"","name":"csi.bpfman.io"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Loading object","object":"bpfman-daemon","path":"./config/bpfman-deployment/daemonset.yaml"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Getting object","type":"&TypeMeta{Kind:DaemonSet,APIVersion:apps/v1,}","namespace":"bpfman","name":"bpfman-daemon"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Loading object","object":"bpfman-metrics-proxy","path":"./config/bpfman-deployment/metrics-proxy-daemonset.yaml"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Getting object","type":"&TypeMeta{Kind:DaemonSet,APIVersion:apps/v1,}","namespace":"bpfman","name":"bpfman-metrics-proxy"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Running the reconciler"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Getting object","type":"&TypeMeta{Kind:ConfigMap,APIVersion:v1,}","namespace":"bpfman","name":"bpfman-config"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Loading object","object":"csi.bpfman.io","path":"./config/bpfman-deployment/csidriverinfo.yaml"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Getting object","type":"&TypeMeta{Kind:CSIDriver,APIVersion:storage.k8s.io/v1,}","namespace":"","name":"csi.bpfman.io"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Loading object","object":"bpfman-daemon","path":"./config/bpfman-deployment/daemonset.yaml"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Getting object","type":"&TypeMeta{Kind:DaemonSet,APIVersion:apps/v1,}","namespace":"bpfman","name":"bpfman-daemon"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Loading object","object":"bpfman-metrics-proxy","path":"./config/bpfman-deployment/metrics-proxy-daemonset.yaml"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Getting object","type":"&TypeMeta{Kind:DaemonSet,APIVersion:apps/v1,}","namespace":"bpfman","name":"bpfman-metrics-proxy"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Running the reconciler"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Getting object","type":"&TypeMeta{Kind:ConfigMap,APIVersion:v1,}","namespace":"bpfman","name":"bpfman-config"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Loading object","object":"csi.bpfman.io","path":"./config/bpfman-deployment/csidriverinfo.yaml"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Getting object","type":"&TypeMeta{Kind:CSIDriver,APIVersion:storage.k8s.io/v1,}","namespace":"","name":"csi.bpfman.io"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Loading object","object":"bpfman-daemon","path":"./config/bpfman-deployment/daemonset.yaml"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Getting object","type":"&TypeMeta{Kind:DaemonSet,APIVersion:apps/v1,}","namespace":"bpfman","name":"bpfman-daemon"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Loading object","object":"bpfman-metrics-proxy","path":"./config/bpfman-deployment/metrics-proxy-daemonset.yaml"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Getting object","type":"&TypeMeta{Kind:DaemonSet,APIVersion:apps/v1,}","namespace":"bpfman","name":"bpfman-metrics-proxy"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Running the reconciler"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Getting object","type":"&TypeMeta{Kind:ConfigMap,APIVersion:v1,}","namespace":"bpfman","name":"bpfman-config"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Loading object","object":"csi.bpfman.io","path":"./config/bpfman-deployment/csidriverinfo.yaml"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Getting object","type":"&TypeMeta{Kind:CSIDriver,APIVersion:storage.k8s.io/v1,}","namespace":"","name":"csi.bpfman.io"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Loading object","object":"bpfman-daemon","path":"./config/bpfman-deployment/daemonset.yaml"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Getting object","type":"&TypeMeta{Kind:DaemonSet,APIVersion:apps/v1,}","namespace":"bpfman","name":"bpfman-daemon"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Loading object","object":"bpfman-metrics-proxy","path":"./config/bpfman-deployment/metrics-proxy-daemonset.yaml"}
{"level":"info","ts":"2025-09-14T13:03:25Z","logger":"Config","msg":"Getting object","type":"&TypeMeta{Kind:DaemonSet,APIVersion:apps/v1,}","namespace":"bpfman","name":"bpfman-metrics-proxy"}

I believe the above is the correct approach. There is not a whole lot of documentation available on this, but the Java Operator SDK is adamant that reconciliation should always cover all resources:
https://javaoperatorsdk.io/docs/getting-started/patterns-best-practices/

Implementing a Reconciler
Always Reconcile All Resources

Reconciliation can be triggered by events from multiple sources. It might be tempting to check the events and only reconcile the related resource or subset of resources that the controller manages. However, this is considered an anti-pattern for operators.

Why this is problematic:

    Kubernetes’ distributed nature makes it difficult to ensure all events are received
    If your operator misses some events and doesn’t reconcile the complete state, it might operate with incorrect assumptions about the cluster state
    Always reconcile all resources, regardless of the triggering event
Event Sources and Caching

During reconciliation, best practice is to reconcile all dependent resources managed by the controller. This means comparing the desired state with the actual cluster state.

The Challenge: Reading the actual state directly from the Kubernetes API Server every time would create significant load.

The Solution: Create a watch for dependent resources and cache their latest state using the Informer pattern. In JOSDK, informers are wrapped into EventSource to integrate with the framework’s eventing system via the InformerEventSource class.
Idempotency

Since all resources should be reconciled when your Reconciler is triggered, and reconciliations can be triggered multiple times for any given resource (especially with retry policies), it’s crucial that Reconciler implementations be idempotent.

Along the same lines, the controller-runtime documentation clearly states that controllers should not have different logic for create/update/delete events: https://github.com/kubernetes-sigs/controller-runtime/blob/main/FAQ.md#q-how-do-i-have-different-logic-in-my-reconciler-for-different-types-of-events-eg-create-update-delete
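In controller-runtime terms, the Reconcile signature already enforces this: the request only carries a name, not an event type, so the body has to derive everything from observed state. Schematically (desiredResources is a hypothetical helper; assureResource is sketched above):

```go
func (r *ConfigReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	config := &bpfmaniov1alpha1.Config{}
	if err := r.Get(ctx, req.NamespacedName, config); err != nil {
		// No separate delete branch: deletion is observed as absence.
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}
	// Reconcile every owned resource against the desired state, no matter
	// which event (create/update/delete) enqueued this request.
	for _, desired := range r.desiredResources(config) { // hypothetical helper
		if err := r.assureResource(ctx, desired); err != nil {
			return ctrl.Result{}, err
		}
	}
	return ctrl.Result{}, nil
}
```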

About owned secondary resources:

@andreaskaris andreaskaris changed the title Config follow-up: Implement Server-Side Apply WIP: Config follow-up: Implement Server-Side Apply Sep 3, 2025
@andreaskaris andreaskaris marked this pull request as draft September 3, 2025 16:42
@andreaskaris andreaskaris force-pushed the reconcile-plus-ssa branch 3 times, most recently from c97aab8 to fa02f77 on September 4, 2025 10:21
frobware pushed a commit to frobware/bpfman-operator that referenced this pull request Sep 5, 2025
…s/component-update-ocp-bpfman-agent

chore(deps): update ocp-bpfman-agent to cf30ca8
@andreaskaris andreaskaris force-pushed the reconcile-plus-ssa branch 3 times, most recently from ba3e236 to 4b5edb8 on September 14, 2025 13:35
Replace the Get/Create and Get/Update reconciliation pattern with
Server-Side Apply (SSA) using client.Apply patches. This simplifies
resource management by handling both creation and updates in a single
operation while providing better conflict resolution and field
ownership tracking.

Changes:
- Add TypeMeta fields to all Kubernetes objects for proper SSA support
- Change assureResource logic to use r.Patch with client.Apply
- Add test interceptor to handle SSA patches in fake client

Signed-off-by: Andreas Karis <[email protected]>
@andreaskaris andreaskaris marked this pull request as ready for review September 26, 2025 13:13
@andreaskaris andreaskaris changed the title WIP: Config follow-up: Implement Server-Side Apply Config follow-up: Implement Server-Side Apply Sep 26, 2025