Conversation

glehmann
Member

@glehmann glehmann commented Sep 10, 2025

so we can change the interface used to communicate with the
container.
Using an image tag lets the expected image be downloaded automatically on
the developer's computer in case of an update.

In the future, if we distribute tagged versions of xcp-ng-dev (on PyPI for
example), we might want to use that version instead.

Related discussions from #37:

@psafont

psafont commented Sep 10, 2025

It's still not clear why a "container interface version" is needed at all. I've not seen this pattern being used by any other container supplier.

It might also make it more difficult to integrate mock's use of containers for bootstrapping builds: https://rpm-software-management.github.io/mock/Feature-container-for-bootstrap.html

@stormi
Member

stormi commented Sep 10, 2025

I don't have an opinion on this particular change, which I haven't reviewed. My main comment is just that we're targeting at most 50 users of this tool in the coming years, almost 100% of whom we train internally, so just make sure not to complicate current use or future evolution for the sake of publishing to the highest standards for Python packages and containers (and it's really easy to build or update the container locally). We're not even sure we'll keep using such containers for local builds in the long run, given that mock's build process is cleaner: it starts from a really minimal build root and makes it possible to reproduce what koji does.

Note: if my comment doesn't make sense, just ignore it, I just dumped my thoughts at this point.

@glehmann glehmann force-pushed the gln/image-tag-version-zqly branch from 8cd9978 to 4f37aff Compare September 10, 2025 15:33
Now that the images are easily available on ghcr.io.
It makes the completion easier to use for the main tool.

Signed-off-by: Gaëtan Lehmann <[email protected]>
so we can change the interface used to communicate with the
container.
Using an image tag allows to automatically download the expected image on
the developer's computer in case of update.

In the future, if we distribute tagged versions of xcp-ng-dev (on pypi for
example), we might want to use that version instead.

Signed-off-by: Gaëtan Lehmann <[email protected]>
@glehmann glehmann force-pushed the gln/image-tag-version-zqly branch from 4f37aff to dda80ec Compare September 11, 2025 08:29
@ydirson

ydirson commented Sep 11, 2025

It's still not clear why a "container interface version" is needed at all. I've not seen this pattern being used by any other container supplier.

The whole goal is that when the interface between the cli and the container (essentially init scripts) changes, we don't leave the user hitting random behaviors.
I originally suggested a different mechanism, but both ideas address that same problem; using container tags just leverages an existing mechanism, which avoids adding more complexity to our code.

Maybe we should step back and think about what we want to happen when we update this tool and/or the container.
I think the central point is that we want people to be as up to date as possible, both for the tool and the container.

That would point to 2 needed changes:

  1. we cannot force-update the script, that would violate expectations. But we could warn the user when the script is outdated (which would be much easier if we installed from pypi.org)
  2. can we force-update the image? Since we cannot do that for the script, it could be awkward to do that for the image, but:
  • the image is an implementation detail, I don't think we want to tell the users to call {podman,docker} pull ghcr.io/xcp-ng/xcp-ng-build-env:8.3 themselves
  • I don't see an easy way to check if the image tag was updated in the registry (ghcr.io/almalinux for example replies to podman search with a 403)
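
A minimal sketch of the "warn when the script is outdated" idea from point 1, assuming purely numeric dotted versions. The helper names are hypothetical, and how the latest published version would be obtained (PyPI or otherwise) is deliberately left out, since the tool is not distributed there yet.

```python
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn "0.1.0" into (0, 1, 0); assumes numeric components only."""
    return tuple(int(part) for part in version.split("."))


def outdated_warning(installed: str, latest: str) -> Optional[str]:
    """Return a warning if `installed` is older than `latest`, else None."""
    if parse_version(installed) < parse_version(latest):
        return f"xcp-ng-dev {installed} is outdated, {latest} is available"
    return None
```

The function only warns and never updates anything, which matches the point that force-updating the script would violate expectations.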

So the best course I see would be to (probably by default, with a flag to disable it) issue the image pull from the client script. But then, when the script is not up to date, pulling a new image may just break a working setup. What could be the options?

  1. pull only if the script is up to date
  2. pull the latest compatible image, using tags to check compatibility
  3. pull only if the script is up to date; if it is not, check the compatibility of the image by recording the protocol version inside the image, and turn the outdated-script warning into an error if they are not compatible (this seems to address rare uses only, but will only hit users who use the tool very occasionally, so I think it is important we take care of them)
  4. any other?

So this PR ultimately addresses option 2. But maybe option 1 is simpler / more acceptable?
One disadvantage of option 2 (as implemented using container tags) would be that older tags would keep image prune from freeing disk space used by those.
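
To make option 2 concrete, here is a rough sketch of a tool deriving the image reference from a protocol version. Only the image name and the `8.3` release come from this thread; the `-v<N>` tag layout, the `PROTOCOL_VERSION` constant and both helper names are assumptions for illustration.

```python
from typing import List

IMAGE_NAME = "ghcr.io/xcp-ng/xcp-ng-build-env"
PROTOCOL_VERSION = 2  # hypothetical: bumped whenever the cli/container interface changes


def image_ref(xcpng_release: str, protocol: int = PROTOCOL_VERSION) -> str:
    """Build the image reference the tool expects (tag layout is an assumption)."""
    return f"{IMAGE_NAME}:{xcpng_release}-v{protocol}"


def pull_command(runner: str, xcpng_release: str) -> List[str]:
    """Command line that fetches the compatible image with podman or docker."""
    return [runner, "pull", image_ref(xcpng_release)]
```

With this layout an outdated tool keeps asking for its own tag, so a newer, incompatible image published under another tag cannot break it.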

It might also make it more difficult to integrate mock's use of containers for bootstrapping builds: https://rpm-software-management.github.io/mock/Feature-container-for-bootstrap.html

Why would we want to use our build-env container to bootstrap mock, rather than a generic Fedora or Alma?

@psafont

psafont commented Sep 11, 2025

pull the latest compatible image, using tags to check compatibility

As far as I understand these are the cases we want to control:

  • User executes an up-to-date tool with an up-to-date container: build proceeds as expected
  • User executes an up-to-date tool with an outdated, incompatible container: the tooling tells it to download a newer container, or download an older tool.
  • User executes an outdated tool with an outdated, but compatible container: build proceeds as expected
  • User executes an outdated tool with an up-to-date, and incompatible container: the tooling tells it to download a newer tool, or use an older container.
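
The four cases can be sketched as a single decision function. This assumes the interface version is an integer and that "compatible" means the two versions are equal; both are assumptions, as is the function name.

```python
from typing import Optional


def compatibility_advice(tool_protocol: int, container_protocol: int) -> Optional[str]:
    """Return None when the build can proceed, else guidance for the user."""
    if tool_protocol == container_protocol:
        # Covers both "everything up to date" and "outdated but compatible".
        return None
    if tool_protocol > container_protocol:
        return "container is too old for this tool: download a newer container, or an older tool"
    return "tool is too old for this container: download a newer tool, or use an older container"
```

Note that the function never needs to know whether either side is the latest release, only whether the two sides agree.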

I think tags are the wrong place to enforce this behaviour, because we don't control the behaviour of docker or podman, nor their errors. Instead, I would like to propose an alternative to encode this versioning:

Embed a version in the contents of the container, probably in the entrypoint; change the entrypoint to accept a container-version argument; make the tool call the container entrypoint with this new argument; and finally add a check to the entrypoint script that outputs the correct error to guide the user to the right action.
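
A hedged sketch of what the entrypoint side of this could look like. The real entrypoint may well be a shell script; Python is used here only for illustration, and the `--container-version` spelling, the `INTERFACE_VERSION` value and the message wording are all assumptions.

```python
import argparse
import sys
from typing import List

INTERFACE_VERSION = 3  # hypothetical value baked into the image at build time


def main(argv: List[str]) -> int:
    """Check the interface version requested by the tool; 0 means compatible."""
    parser = argparse.ArgumentParser(prog="entrypoint")
    parser.add_argument("--container-version", type=int, required=True,
                        help="interface version the calling tool expects")
    args = parser.parse_args(argv)

    expected = args.container_version
    if expected == INTERFACE_VERSION:
        return 0

    # We own this message, unlike the errors podman or docker would print.
    hint = "update the tool" if expected < INTERFACE_VERSION else "pull a newer container image"
    print(f"interface mismatch: tool expects v{expected}, "
          f"container provides v{INTERFACE_VERSION}; {hint}", file=sys.stderr)
    return 1
```

The tool would pass its own version on every invocation, for example `main(["--container-version", "3"])`, and abort the build on a non-zero result.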

Bonus points if the error message or documentation enumerate the versions of the tool and the dates of the container using the current interface version. (of course the information for the current version will be incomplete, but it can have the previous ones in updated containers / tools)

Why would we want to use our build-env container to bootstrap mock, rather than a generic Fedora or Alma?

Because all the base deps are already in the container and don't need to be redownloaded

@ydirson

ydirson commented Sep 11, 2025

pull the latest compatible image, using tags to check compatibility

As far as I understand these are the cases we want to control:

* User executes an up-to-date tool with an up-to-date container: build proceeds as expected

* User executes an up-to-date tool with an outdated, incompatible container: the tooling tells it to download a newer container, or download an older tool.

* User executes an outdated tool with an outdated, but compatible container:  build proceeds as expected

* User executes an outdated tool with an up-to-date, and incompatible container: the tooling tells it to download a newer tool, or use an older container.

A problem is, I'm not sure we can tell whether the container is up-to-date (so all we can check is compatibility):

$ podman search ghcr.io/xcp-ng/xcp-ng-build-env
Error: 1 error occurred:
	* couldn't search registry "ghcr.io": Requesting bearer token: invalid status code from registry 403 (Forbidden)

Also, do we really want to ever tell the user to "use a newer (or older) container"?

I think tags are the wrong place to enforce this behaviour, because we don't control the behaviour of docker or podman, nor their errors. Instead, I would like to propose an alternative to encode this versioning:

Embed a version in the contents of the container, probably in the entrypoint; change the entrypoint to accept a container-version argument; make the tool call the container entrypoint with this new argument; and finally add a check to the entrypoint script that outputs the correct error to guide the user to the right action.

That's the general idea I meant with option 3, the difference being I'm also suggesting the tool updates the container when it is itself up to date.

Bonus points if the error message or documentation enumerate the versions of the tool and the dates of the container using the current interface version. (of course the information for the current version will be incomplete, but it can have the previous ones in updated containers / tools)

Why would we want to use our build-env container to bootstrap mock, rather than a generic Fedora or Alma?

Because all the base deps are already in the container and don't need to be redownloaded

@psafont

psafont commented Sep 11, 2025

A problem is, I'm not sure we can tell whether the container is up-to-date (so all we can check is compatibility):

The use cases I brought forward are not concerned with that, only with the incompatibilities. I want to make sure we solve one problem at a time to make progress

@glehmann
Member Author

* User executes an up-to-date tool with an up-to-date container: build proceeds as expected

* User executes an up-to-date tool with an outdated, incompatible container: the tooling tells it to download a newer container, or download an older tool.

* User executes an outdated tool with an outdated, but compatible container:  build proceeds as expected

* User executes an outdated tool with an up-to-date, and incompatible container: the tooling tells it to download a newer tool, or use an older container.

I don't think that's what we want. At least this is not what I want.
I want:

  • User executes any version of the tool without the matching image on the host: the tool downloads the image and starts the build
  • User executes any version of the tool with the matching image on the host: the tool starts the build

We could use the tool version instead of the protocol version in the tag. We could put the XCP-ng version in the tag as well or in the image name. All of that works for me.

  • ghcr.io/xcp-ng/xcp-ng-build-env:0.1.0-8.3
  • ghcr.io/xcp-ng/xcp-ng-build-env/xcp-ng-8.3:0.1.0
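
A rough sketch of that behaviour, assuming the runner is `podman` or `docker`. `ensure_image` and its injectable `run` parameter are hypothetical names; `image inspect` and `pull` are existing subcommands of both CLIs, and `image inspect` exits non-zero when the image is not present locally.

```python
import subprocess
from typing import Callable, Sequence

Run = Callable[[Sequence[str]], int]


def _run(cmd: Sequence[str]) -> int:
    """Run a command quietly and return its exit status."""
    return subprocess.run(cmd, check=False,
                          stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL).returncode


def ensure_image(runner: str, image: str, run: Run = _run) -> bool:
    """Pull `image` only if it is missing locally; return True if a pull happened."""
    if run([runner, "image", "inspect", image]) == 0:
        return False  # matching image already on the host: start the build directly
    if run([runner, "pull", image]) != 0:
        raise RuntimeError(f"could not pull {image}")
    return True
```

The `run` parameter exists only so the logic can be exercised without a container runtime installed.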

@glehmann
Member Author

And we should probably talk about that in real life next week (with a beer) 🙂

@ydirson

ydirson commented Sep 11, 2025

* User executes an up-to-date tool with an up-to-date container: build proceeds as expected

* User executes an up-to-date tool with an outdated, incompatible container: the tooling tells it to download a newer container, or download an older tool.

* User executes an outdated tool with an outdated, but compatible container:  build proceeds as expected

* User executes an outdated tool with an up-to-date, and incompatible container: the tooling tells it to download a newer tool, or use an older container.

I don't think that's what we want. At least this is not what I want. I want:

* User executes any version of the tool without the matching image on the host: the tool downloads the image and starts the build

* User executes any version of the tool with the matching image on the host: the tool starts the build

That looks like a description of "how" rather than "what", no?

We could use the tool version instead of the protocol version in the tag. We could put the XCP-ng version in the tag as well or in the image name. All of that works for me.

* `ghcr.io/xcp-ng/xcp-ng-build-env:0.1.0-8.3`

* `ghcr.io/xcp-ng/xcp-ng-build-env/xcp-ng-8.3:0.1.0`

* …

Well, not every tool version necessarily introduces breaking protocol changes. We would have to ignore at least the "patchlevel" component of the version. And this would raise the question of what image to use during development of a protocol-breaking feature. In short, IMO the protocol-version works much better.
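
For illustration, ignoring the patchlevel could look like the sketch below, assuming MAJOR.MINOR.PATCH tool versions. The `interface_tag` helper is hypothetical.

```python
def interface_tag(tool_version: str) -> str:
    """Keep only MAJOR.MINOR so that patch releases share one image tag."""
    major, minor, *_ = tool_version.split(".")
    return f"{major}.{minor}"
```

Both `0.1.0` and `0.1.7` would then map to the same `0.1` tag, which still leaves open which tag to use while a protocol-breaking feature is in development.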

@psafont

psafont commented Sep 11, 2025

That looks like a description of "how" rather than "what", no?

Yes, the image may be already available locally, or it may be downloaded as part of the tool being executed. Regardless of what podman or docker do, we end up in 4 cases regarding incompatibility. This is why I want to decide on what happens on incompatible interfaces. We can deal with downloads of up-to-date containers and others at another time

Well, not every tool version necessarily introduces breaking protocol changes. We would have to ignore at least the "patchlevel" component of the version. And this would raise the question of what image to use during development of a protocol-breaking feature. In short, IMO the protocol-version works much better.

I agree, I don't want to couple the container tags with the version of the tool

@glehmann
Member Author

That looks like a description of "how" rather than "what", no?

Yes, the image may be already available locally, or it may be downloaded as part of the tool being executed. Regardless of what podman or docker do, we end up in 4 cases regarding incompatibility. This is why I want to decide on what happens on incompatible interfaces. We can deal with downloads of up-to-date containers and others at another time

There are four cases only if you are considering using a non-matching version. I propose not to do that.

Also I'm not sure what you mean by "how" vs "what". We both described a different behavior. How is one version more "how" and the other more "what"?

Well, not every tool version necessarily introduces breaking protocol changes. We would have to ignore at least the "patchlevel" component of the version.

Why would we ignore the patch level?
Docker images deal efficiently with storage and partial changes, thanks to layers.
And updating the tags doesn't mean updating the image if nothing has changed.

And this would raise the question of what image to use during development of a protocol-breaking feature.

How would the protocol version make the situation better?

I agree, I don't want to couple the container tags with the version of the tool

The container is just a part of the tool, so it makes plenty of sense to couple them IMO?
I'm ok to not do that, but I would prefer to understand why.

@ydirson

ydirson commented Sep 11, 2025

That looks like a description of "how" rather than "what", no?

Yes, the image may be already available locally, or it may be downloaded as part of the tool being executed. Regardless of what podman or docker do, we end up in 4 cases regarding incompatibility. This is why I want to decide on what happens on incompatible interfaces. We can deal with downloads of up-to-date containers and others at another time

There are four cases only if you are considering using a non-matching version. I propose not to do that.

Also I'm not sure what you mean by "how" vs "what". We both described a different behavior. How is one version more "how" and the other more "what"?

After re-reading, right. I'm struggling to pinpoint how the reasoning diverges 😅

In fact if I just look in Pau's 4 items at the 2 "up-to-date" condition on one hand, and the 2 "incompatible" ones, I can see the outcome as essentially identical in each group... and that matches quite well your own 2 items. Aren't you in fact already agreeing on this point?

Well, not every tool version necessarily introduces breaking protocol changes. We would have to ignore at least the "patchlevel" component of the version.

Why would we ignore the patch level?

By definition we don't want a "micro version bump" to change an interface, so there should be no point in including it.

Docker images deal efficiently with storage and partial changes, thanks to layers. And updating the tags doesn't mean updating the image if nothing has changed.

Docker/podman tags are not "free". When a tag is updated, there is no ref left to the former layers, and docker image prune is sufficient to reclaim disk space. If we multiply tags, it adds a burden on the users, who have to track which old tags they no longer need and remove them... when those images should IMO just stay an implementation detail of the build env.

And this would raise the question of what image to use during development of a protocol-breaking feature.

How would the protocol version make the situation better?

We would not have to tag/publish/whatever a new version to use a new protocol version, any commit can change it. That's a lot simpler.

@psafont

psafont commented Sep 11, 2025

The container is just a part of the tool, so it makes plenty of sense to couple them IMO?

Just because there are two modules in a single project, it doesn't mean they need to be coupled. Not coupling allows us to share containers for more cases (like mock, as I said before)

@glehmann
Member Author

In fact if I just look in Pau's 4 items at the 2 "up-to-date" condition on one hand, and the 2 "incompatible" ones, I can see the outcome as essentially identical in each group... and that matches quite well your own 2 items. Aren't you in fact already agreeing on this point?

I'm not sure. Maybe it would actually be easier to find out with a beer ;)

Well, not every tool version necessarily introduces breaking protocol changes. We would have to ignore at least the "patchlevel" component of the version.

Why would we ignore the patch level?

By definition we don't want a "micro version bump" to change an interface, so there should be no point in including it.

Well, there are other things than the interface that could be updated. For example installing the updates in the image.

Docker images deal efficiently with storage and partial changes, thanks to layers. And updating the tags doesn't mean updating the image if nothing has changed.

Docker/podman tags are not "free". When a tag is updated, there is no ref left to the former layers, and docker image prune is sufficient to reclaim disk space. If we multiply tags, it adds a burden on the users, who have to track which old tags they no longer need and remove them... when those images should IMO just stay an implementation detail of the build env.

That's a good point

And this would raise the question of what image to use during development of a protocol-breaking feature.

How would the protocol version make the situation better?

We would not have to tag/publish/whatever a new version to use a new protocol version, any commit can change it. That's a lot simpler.

Weren't you already thinking of distributing the package on PyPI? Tagging an image shouldn't be more difficult, and is perhaps even easier.

@glehmann
Member Author

The container is just a part of the tool, so it makes plenty of sense to couple them IMO?

Just because there are two modules in a single project, it doesn't mean they need to be coupled.

But it doesn't mean that it wouldn't be easier to couple them.

Not coupling allows us to share containers for more cases (like mock, as I said before)

We want to integrate the mock usage in xcp-ng-dev. xcp-ng-dev, knowing its own version, can use it to point to the expected image.

I'll stop there because it looks more like incompatible opinions than an actual comparison of solutions.
Let's try without protocol or tool version, and enhance things later if/when needed.

@psafont

psafont commented Sep 12, 2025

But it doesn't mean that it wouldn't be easier to couple them.

And once we want to decouple them, how easy or hard would it be to do? :)

knowing its own version, can use it to point to the expected image.

I don't think it's that easy; at least, I would like to upstream the container configuration to mock directly in the future. That way any mock user can build packages for xcp-ng. xcp-ng-dev would be a tool that helps set up mock, but would not be necessary (which is how I'm working now)
