Conversation

@garloff (Member) commented May 23, 2025

What this PR does / why we need it:
Fixes #226

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #226

**Special notes for your reviewer**:
This PR consists of two commits.
The first fixes the hierarchy so that the fields diskBus and scsiModel live below hardware.
The second adds a few more fields from the intersection of the SCS image metadata recommendations and the fields supported by the ORC schema.
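A minimal sketch of the corrected hierarchy (field names per the upstream ORC Image schema; the apiVersion and the concrete values are illustrative, not taken from this PR):

```yaml
# Before the fix, diskBus/scsiModel sat directly under properties,
# which the ORC schema does not accept.
apiVersion: openstack.k-orc.cloud/v1alpha1   # assumed; check your ORC version
kind: Image
spec:
  resource:
    properties:
      hardware:
        diskBus: scsi
        scsiModel: virtio-scsi
```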

garloff added 3 commits May 23, 2025 09:15
Only as far as the upstream ORC schema supports it.

Signed-off-by: Kurt Garloff <[email protected]>
In YAML, we typically only quote strings if they contain whitespace or
could be confused with other data types. (More quoting does not hurt
though.)

Sidenote: I was mistaken in changing the capitalization of true:
YAML 1.2.2 defines lowercase true/false as the canonical values.

Signed-off-by: Kurt Garloff <[email protected]>
@garloff garloff force-pushed the fix/diskBus-hardware branch from db18f99 to 256742c Compare May 23, 2025 07:44
@garloff (Member, Author) commented May 23, 2025

How do I best test this?
Do I need to publish a CS-sha release to our registry?

@garloff (Member, Author) commented May 23, 2025

In order to test, I need to create a properly authorized robot account on our registry and do something like

OCI_REGISTRY=registry.scs.community OCI_REPOSITORY=/kaas/cluster-stacks \
OCI_USERNAME='robot$KG-KaaS' OCI_PASSWORD="redacted" \
csctl create -m hash --publish --remote oci providers/openstack/scs

correct?
Is there an easier way?

@jschoone (Contributor) commented May 23, 2025

How do I best test this? Do I need to publish a CS-sha release to our registry?

To only test if the resource is part of the cluster-class and will be deployed as expected you can just helm install the cluster-class directory on a cluster where CAPO and ORC are installed.

To check if CSO can deploy that resource accordingly you can just publish a test release as you suggested.
This can be done with -m hash or -m custom. Since hash is not well reproducible, you can mimic the same version format and embed e.g. the current git commit hash using custom:

csctl create --publish --remote oci -m custom \
--cluster-stack-version v0-git.$(git rev-parse --short HEAD) \
--cluster-addon-version v0-git.$(git rev-parse --short HEAD) \
--node-image-version v0-git.$(git rev-parse --short HEAD) \
<PATH TO CLUSTER STACK>

Of course, run this after git commit so that the hash is correct.
Even though we don't need a node image here, the command expects it; that requirement could be taken out of csctl, but it's not that annoying right now.
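The resulting version string can be sketched like this (the v0 prefix and short-hash format follow the example command above; treat both as assumptions about your versioning scheme):

```shell
#!/bin/sh
# Build a csctl custom version from the current commit's short hash,
# e.g. v0-git.2e38b5d. Run only after committing, so the hash is current.
CS_VERSION="v0-git.$(git rev-parse --short HEAD)"
echo "$CS_VERSION"
```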

Another method is to use the local mode of CSO. Then CSO doesn't even try to download anything from the remote and simply expects the Cluster Stack in /tmp/downloads/cluster-stacks.
For that, the CSO deployment must be adjusted, e.g. by changing its values.yaml:

controllerManager:
  manager:
    args:
      - -leader-elect=true
      - -log-level=info
      - -local

Then you can copy the release directly into the Pod (the directory must be created first):

kubectl exec -ti -n cso-system deployments/cso-controller-manager -- mkdir -p /tmp/downloads/cluster-stacks/
kubectl cp <PATH TO CS RELEASE> -n cso-system <CSO-MANAGER POD>:/tmp/downloads/cluster-stacks/

The /tmp/downloads/cluster-stacks directory can also live in a Volume for persistence.
It's recommended to use this method with a dedicated CSO instance, since it will fail for already installed Cluster Stacks once it stops checking the remote source.

In both cases you still need to tell CSO that there is a Cluster Stack, using the ClusterStack resource. For hash/custom releases, ClusterStack.spec.channel should then be custom instead of stable.
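For reference, a minimal ClusterStack sketch (the API group matches the controllerGroup seen in the logs below; names, versions, and field layout are illustrative and should be checked against your cluster-stack-operator release):

```yaml
apiVersion: clusterstack.x-k8s.io/v1alpha1
kind: ClusterStack
metadata:
  name: openstack-scs
  namespace: ciabns
spec:
  provider: openstack
  name: scs
  kubernetesVersion: "1.31"
  channel: custom            # hash/custom releases need custom, not stable
  versions:
    - v0-git.2e38b5d         # illustrative custom version
```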

@jschoone (Contributor):

In order to test, I need to create a properly authorized robot account on our registry and do something like

OCI_REGISTRY=registry.scs.community OCI_REPOSITORY=/kaas/cluster-stacks \
OCI_USERNAME='robot$KG-KaaS' OCI_PASSWORD="redacted" \
csctl create -m hash --publish --remote oci providers/openstack/scs

correct? Is there an easier way?

Yes, almost. For some reason it's OCI_REPOSITORY=$OCI_REGISTRY/kaas/cluster-stacks, i.e. including the registry host.
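Spelled out (a sketch; the /kaas/cluster-stacks path is taken from the commands above), the variables then compose like this:

```shell
#!/bin/sh
# Contrary to intuition, OCI_REPOSITORY must include the registry host,
# not just the path below it.
OCI_REGISTRY=registry.scs.community
OCI_REPOSITORY="$OCI_REGISTRY/kaas/cluster-stacks"
echo "$OCI_REPOSITORY"   # registry.scs.community/kaas/cluster-stacks
```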

@garloff (Member, Author) commented May 23, 2025

Hmm, not yet solved.

{"level":"ERROR","time":"2025-05-23T08:48:41.622Z","file":"controller/controller.go:324","message":"Reconciler error","controller":"clusterstackrelease","controllerGroup":"clusterstack.x-k8s.io","controllerKind":"ClusterStackRelease","ClusterStackRelease":{"name":"openstack-scs-1-31-v0-sha-osrnsw4","namespace":"ciabns"},"namespace":"ciabns","name":"openstack-scs-1-31-v0-sha-osrnsw4","reconcileID":"3a232c84-9f4b-4771-a4c3-b5e894983408","error":"failed to template and apply: failed to template clusterClass helm chart: failed to template clusterClass helm chart: failed to run helm template for \"\": exit status 1","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/src/cluster-stack-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:324\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/src/cluster-stack-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:261\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/src/cluster-stack-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:222"}

@jschoone (Contributor):

Hmm, not yet solved.

{"level":"ERROR","time":"2025-05-23T08:48:41.622Z","file":"controller/controller.go:324","message":"Reconciler error","controller":"clusterstackrelease","controllerGroup":"clusterstack.x-k8s.io","controllerKind":"ClusterStackRelease","ClusterStackRelease":{"name":"openstack-scs-1-31-v0-sha-osrnsw4","namespace":"ciabns"},"namespace":"ciabns","name":"openstack-scs-1-31-v0-sha-osrnsw4","reconcileID":"3a232c84-9f4b-4771-a4c3-b5e894983408","error":"failed to template and apply: failed to template clusterClass helm chart: failed to template clusterClass helm chart: failed to run helm template for \"\": exit status 1","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/src/cluster-stack-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:324\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/src/cluster-stack-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:261\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/src/cluster-stack-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:222"}

Hm, running helm template on the cluster-class directory directly works. Permissions for CSO are set, I guess?

@garloff (Member, Author) commented May 23, 2025

I had not set all parameters consistently, it seems.
Now making progress with v0-git.2e38b5d

NAMESPACE   NAME                        ID                                     AVAILABLE   MESSAGE
ciabns      ubuntu-capi-image-v1.31.9   2e0f8219-a2e9-4711-af0e-e320e78e16e7   False       Starting image upload

@jschoone (Contributor):

I had not set all parameters consistently, it seems. Now making progress with v0-git.2e38b5d

NAMESPACE   NAME                        ID                                     AVAILABLE   MESSAGE
ciabns      ubuntu-capi-image-v1.31.9   2e0f8219-a2e9-4711-af0e-e320e78e16e7   False       Starting image upload

Also, even though it probably has nothing to do with the problem, there seems to be some inconsistency here.
The image resource in the diskBus-hardware branch points to Kubernetes 1.32.
Did you change it in 2e38b5d? I can't see the commit.

@garloff (Member, Author) commented May 23, 2025

In the git cluster-stack repository, we do not seem to have the right file structure to manage several versions simultaneously.
For testing, I indeed wanted to create a -v1.31.9, so I had to adjust everything to match that. I did that on a local git branch, which I just pushed to test/1.31.9.

Image registration was successful:

openstack image show ubuntu-capi-image-v1.31.9
[...]
| min_disk         | 20                                                                                                                                                  |
| min_ram          | 2048                                                                                                                                                |
| name             | ubuntu-capi-image-v1.31.9                                                                                                                           |
| owner            | 2976229d685045a39af5cfc04b88a94d                                                                                                                    |
| properties       | architecture='x86_64', direct_url='rbd://c2120a4a-669c-4769-a32c-b7e9d7b848f4/images/2e0f8219-a2e9-4711-af0e-e320e78e16e7/snap',                    |
|                  | hw_disk_bus='scsi', hw_qemu_guest_agent='true', hw_rng_model='virtio', hw_scsi_model='virtio-scsi', hw_vif_model='virtio', locations='[{'url':      |
|                  | 'rbd://c2120a4a-669c-4769-a32c-b7e9d7b848f4/images/2e0f8219-a2e9-4711-af0e-e320e78e16e7/snap', 'metadata': {'store': 'rbd'}}]', os_distro='ubuntu', |
|                  | os_hash_algo='sha512',                                                                                                                              |
|                  | os_hash_value='8090cfedfa6109cffc6b8abccee3985d5cbf0762530b0bcc1fdb8a387667529db7a4d1da8f082ae448d9b48bbb409f709a7349152ce5f1ca2f5991f51965e9fa',   |
|                  | os_hidden='False', os_version='22.04', stores='rbd' 

All the parameters I have added have successfully ended up in OpenStack's image properties.

@jschoone (Contributor) commented May 23, 2025

In git cluster-stack, we do not seem to have the right file structure to manage several versions simultaneously

We were about to adopt the branch naming convention of Kubernetes, see https://github.com/kubernetes/release/blob/master/docs/releasing.md#release-schedule

Branches for 1.28-1.30 even exist, but keeping them up to date needed some automation. This was about to be implemented by #184. I think that PR is 90% done; it just needs to implement the new structure, and then it can be used to manage the releasing of Cluster Stacks.

@jschoone (Contributor):

Tested the image resource; it works.

@garloff garloff merged commit feb240a into main May 27, 2025
4 checks passed
@garloff garloff deleted the fix/diskBus-hardware branch May 27, 2025 18:39
Development

Successfully merging this pull request may close these issues.

ORC image: resource.properties.diskBus lacks hardware