Conversation

LarsSven
Contributor

@LarsSven LarsSven commented Aug 7, 2025

It builds on top of #48990, which has not been merged yet, so that PR must be merged before this one becomes reviewable. This PR currently contains the code for both PRs.

Applying manifests onto the Kubernetes dev service does not work if the application connects to an existing dev service. I originally thought this was an issue in the manifests feature, but it turns out to be an issue in the Kubernetes dev service setup. Basically, the producer was never called in the code path where a dev service already exists, so KubernetesDevServiceInfoBuildItem would be null in dependent build steps.
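
For illustration, a minimal sketch of the shape of this fix, not the actual extension code; the lookup helper, the RunningCluster placeholder and the KubernetesDevServiceInfoBuildItem constructor arguments are assumptions:

```java
import io.quarkus.deployment.annotations.BuildProducer;
import io.quarkus.deployment.annotations.BuildStep;
// (import for KubernetesDevServiceInfoBuildItem omitted; its exact package and constructor are not asserted here)

// Illustrative sketch only, not the real Kubernetes dev service processor.
class KubernetesDevServiceProcessorSketch {

    // Placeholder for however the processor describes an already-running cluster.
    record RunningCluster(String kubeConfig, String containerId) { }

    @BuildStep
    void setupKubernetesDevService(BuildProducer<KubernetesDevServiceInfoBuildItem> infoProducer) {
        RunningCluster existing = findRunningDevService(); // hypothetical lookup of an existing dev service
        if (existing != null) {
            // Before the fix this branch returned without calling the producer,
            // so the build item stayed null in dependent build steps.
            infoProducer.produce(new KubernetesDevServiceInfoBuildItem(existing.kubeConfig(), existing.containerId()));
            return;
        }
        // ... otherwise start a new container and produce the same build item for it ...
    }

    // Placeholder discovery method; the real processor has its own logic for this.
    private RunningCluster findRunningDevService() {
        return null;
    }
}
```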

This is not code I originally contributed, so I did my best to use the features I could easily find. Definitely let me know if some of this code can be cleaned up.

I also made a small change in the manifests application code I contributed earlier, so that it no longer logs "Applied manifest" if a failure occurred; it now only prints this message when no failure has occurred.
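
For illustration, a minimal sketch of that logging guard; this is not the actual manifest-application code, and the method and variable names are assumptions:

```java
import java.util.List;
import org.jboss.logging.Logger;

// Illustrative sketch of the logging change: only claim success when no manifest failed.
class ManifestLoggingSketch {

    private static final Logger log = Logger.getLogger(ManifestLoggingSketch.class);

    void applyManifests(List<String> manifests) {
        boolean failed = false;
        for (String manifest : manifests) {
            try {
                applyManifest(manifest); // hypothetical helper that applies a single manifest
            } catch (Exception e) {
                failed = true;
                log.error("Failed to apply manifest " + manifest, e);
            }
        }
        if (!failed) {
            // Only log success when every manifest applied cleanly.
            log.info("Applied manifests");
        }
    }

    private void applyManifest(String manifest) {
        // placeholder for the actual apply logic
    }
}
```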

@LarsSven
Contributor Author

LarsSven commented Aug 7, 2025

The last commit, fc79114, is the change that is actually part of this PR. The other commits are part of the previous PR.

@LarsSven
Contributor Author

LarsSven commented Aug 7, 2025

@metacosm @geoand

@LarsSven LarsSven changed the title Ls/produce on existing devservice Still produce KubernetesDevServiceInfoBuildItem when an dev service already exists Aug 7, 2025
@LarsSven LarsSven changed the title Still produce KubernetesDevServiceInfoBuildItem when an dev service already exists Still produce KubernetesDevServiceInfoBuildItem when a dev service already exists Aug 7, 2025

```java
// Excerpt of the diff this review comment refers to:
        KubeConfigUtils.parseKubeConfig(KubeConfigUtils.replaceServerInKubeconfig(containerAddress.getUrl(),
                getFileContentFromContainer(KIND_KUBECONFIG))));
        return KubeConfigUtils
                .parseKubeConfig(KubeConfigUtils.replaceServerInKubeconfig("https://" + containerAddress.getUrl(),
```

Contributor Author

Applying manifests fails without "https://", which is also present in the container creation process. Basically, this makes the kubeconfig exactly equal to the one created when a new Kubernetes dev service container is started from scratch. I'm not sure why there was a difference between the two to begin with.


quarkus-bot bot commented Aug 7, 2025

Status for workflow Quarkus Documentation CI

This is the status report for running Quarkus Documentation CI on commit fc79114.

✅ The latest workflow run for the pull request has completed successfully.

It should be safe to merge provided you have a look at the other checks in the summary.

Warning

There are other workflow runs running, you probably need to wait for their status before merging.


github-actions bot commented Aug 7, 2025

🎊 PR Preview 6bb2e94 has been successfully built and deployed to https://quarkus-pr-main-49415-preview.surge.sh/version/main/guides/

  • Images of blog posts older than 3 months are not available.
  • Newsletters older than 3 months are not available.

@metacosm
Contributor

metacosm commented Aug 8, 2025

I think we should merge #48990 first. Also, there are upcoming changes in how dev services are handled, and this PR touches on things that we should probably change in the short term; see in particular quarkiverse/quarkus-neo4j#382 (comment). More specifically, I'm not sure that producing the item when the processor didn't start the dev service is a good idea.

@LarsSven
Contributor Author

LarsSven commented Aug 8, 2025

I absolutely agree with merging #48990 first. Is there anything still blocking that? Like a second review or changes on my side?

@LarsSven
Contributor Author

LarsSven commented Aug 8, 2025

As for producing the item if the processor didn't start the dev service: if this is intentional, I can maybe move the retrieval of a running dev service into the manifest application logic. That way the build step no longer relies on the setup producing a KubernetesDevServiceInfoBuildItem. However, it seemed logical to me that KubernetesDevServiceInfoBuildItem should be available whenever the dev service is available. What is the reasoning for the Kubernetes dev service info not being available when the application connects to an already running dev service?

But yeah, if you don't want to change the way the setup works, I can probably fix it in the manifest application only. I just thought that not producing the build item in the already-running code path was a bug in the Kubernetes extension, and not an intentional decision.

@LarsSven
Contributor Author

LarsSven commented Aug 8, 2025

I've synced the fork, so the previously merged PR has now been incorporated.

@metacosm how would you like to proceed? Should I make the setup produce the build item for existing dev services, or fix it in a way that is contained entirely within the manifest application code?

@metacosm
Contributor

metacosm commented Aug 8, 2025

I admit that I'm not sure what the best way to proceed is. Refactoring to use the new architecture would probably solve the issue cleanly, but the upgrade path doesn't look super straightforward and might take some time to implement. On the other hand, like I said, I'm not sure that producing the build item is correct if the processor is reusing an existing instance of the dev service. What do you think, @holly-cummins?

@holly-cummins
Contributor

I think moving to the new model would satisfy the requirement here (it's got a cleaner 'service reuse' flow), but it's a heavy lift, and I know @LarsSven has already done a lot of work, so I'm reluctant to suggest we throw that all away. :(

@LarsSven
Contributor Author

LarsSven commented Aug 8, 2025

I can also rewrite the manifest application so that it does not depend on the build item anymore (by retrieving the kubeconfig itself). That would be a minimal solution to fix the bug without affecting anything else.

@LarsSven
Contributor Author

LarsSven commented Aug 8, 2025

The amount of work for this PR is not that much; I don't mind rewriting it as something different, as long as the bug this PR tries to solve gets fixed.

@LarsSven
Contributor Author

LarsSven commented Aug 9, 2025

Also @metacosm, could you explain to me the intention/description of the KubernetesDevServiceInfoBuildItem? My assumption was that this item just gives you all the necessary information about the Kubernetes dev service being used by the application in dev/test, but if it makes sense for it to be null when a dev service is already running, and for it to only be set when the application creates the dev service, then there is something more going on that I don't understand.

@metacosm
Contributor

Also @metacosm, could you explain to me the intention/description of the KubernetesDevServiceInfoBuildItem? My assumption was that this item just gives you all the necessary information about the Kubernetes dev service being used by the application in dev/test, but if it makes sense for it to be null when a dev service is already running, and for it to only be set when the application creates the dev service, then there is something more going on that I don't understand.

I didn't work on that part, so I'm almost as clueless as you on this. It just seems, based on the changes to the architecture, that there were some issues with the semantics of that build item and that the new architecture is supposed to clear that up.

@LarsSven
Contributor Author

So shall I make an alternative implementation that doesn't make any changes to the build item and instead decouples the manifest application from the build item altogether?

@holly-cummins
Contributor

So shall I make an alternative implementation that doesn't make any changes to the build item and instead decouples the manifest application from the build item altogether?

I would wait before trying that, because I think we should be able to use the build item, because the "build info is null when service is already running" problem will be fixed when I convert the service to the new model. I'm hoping to start tomorrow.

But one thing that would be really helpful to me is some tests, so that I know I'm meeting the new behaviour requirements (or at least, not going in totally the wrong direction and making things harder).

@LarsSven
Contributor Author

LarsSven commented Aug 11, 2025

I would wait before trying that, because I think we should be able to use the build item, because the "build info is null when service is already running" problem will be fixed when I convert the service to the new model. I'm hoping to start tomorrow.

Okay, thank you. I did start trying this before I saw the message, and I can confidently say now that getting the container information from the manifest application code gets really ugly. It is definitely possible, but when this information is not provided by a previous build step, this build step suddenly has to deal with things like figuring out whether it's in a compose environment, whether there is a shared network, etc. So the code for it really doesn't look pretty. I would definitely prefer a solution where the previous build step just provides the necessary details even when it connects to an existing service.

I can provide some manual steps for testing, but I wouldn't know how to set up an integration test that spawns multiple Quarkus applications within the Quarkus integration test framework. The project we work on that uses this feature is open source, so if you'd like, I can give you instructions for reproducing the issue this PR tries to fix against our own project; that way you can test against it while rewriting this, to see whether the issue is fixed.

@LarsSven
Contributor Author

LarsSven commented Aug 11, 2025

I've branched off our feature branch that started breaking so that any changes we make in the next few weeks to our own code won't affect your testing. You can find it here: https://gitlab.com/rug-digitallab/products/themis/themis-core/-/tree/ls/quarkus-manifests-reproducer (branch ls/quarkus-manifests-reproducer). We won't touch this branch.

This branch is set to use 999-SNAPSHOT of Quarkus, so you need to build Quarkus locally and install it to your local Maven repository to make this work (which I assume is what you would want for this test).

You can reproduce the issue as follows:

  1. ./gradlew runtime-reporter:quarkusDev. During startup, this should log messages about applying a few manifests to the Kubernetes dev service.
  2. ./gradlew runtime-orchestrator:service:quarkusDev.

That second step will hit this log message during startup:

log.warn("Cannot apply manifests because the Kubernetes dev service is not running");

It hits this message because kubernetesDevServiceInfoBuildItem == null when you are connecting to an existing Kubernetes dev service, which shouldn't be happening. Manifests should be applied even when connecting to an existing dev service.
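
For illustration, a minimal sketch of the consuming side described above; apart from the quoted log message, the names and the build step signature are assumptions:

```java
import org.jboss.logging.Logger;
import io.quarkus.deployment.annotations.BuildStep;
// (import for KubernetesDevServiceInfoBuildItem omitted; its exact package is not asserted here)

// Illustrative sketch of the build step that applies the manifests, not the actual extension code.
class ManifestApplicationSketch {

    private static final Logger log = Logger.getLogger(ManifestApplicationSketch.class);

    @BuildStep
    void applyManifests(KubernetesDevServiceInfoBuildItem kubernetesDevServiceInfoBuildItem) {
        if (kubernetesDevServiceInfoBuildItem == null) {
            // This branch is hit in step 2 above: when the application reuses an
            // already-running dev service, the build item was never produced.
            log.warn("Cannot apply manifests because the Kubernetes dev service is not running");
            return;
        }
        // ... otherwise build a client from the dev service kubeconfig and apply the configured manifests ...
    }
}
```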

If you do manage to fix it, you may still encounter manifest application errors because the runtime-orchestrator tries to apply manifests that were already applied by the runtime-reporter. That is perfectly expected and not a problem, as long as it actually tries to apply the manifests when starting up the second application.

@LarsSven
Contributor Author

@holly-cummins are these tests enough for you or would you like any other reproducer/information?

@LarsSven
Contributor Author

@holly-cummins @metacosm this bug is currently blocking the development of our application quite significantly, so I would really love to have this bug fixed in the current dev service architecture if we're not yet ready to migrate to the new dev service architecture.

Is it possible to merge the current PR or a modified version of this so we can get this bug fixed in the short term? Thanks!

@LarsSven
Contributor Author

I'd also be fine with making more changes to this PR if you'd like it to be implemented differently than how it is currently implemented. What do you think?

@metacosm
Contributor

@LarsSven sorry, I will loop back on this tomorrow. I've been busy with other stuff.

@LarsSven
Contributor Author

Absolutely no worries, hope I'm not too much of a bother.

@LarsSven
Contributor Author

LarsSven commented Sep 2, 2025

@metacosm a gentle reminder about this.

@metacosm
Contributor

metacosm commented Sep 5, 2025

I haven't forgotten this issue. Hopefully, I'll have something for you to test by the beginning of next week, @LarsSven.

@LarsSven
Contributor Author

LarsSven commented Sep 5, 2025

Thank you, I really appreciate it.

Contributor

@metacosm metacosm left a comment

Looks like it's going to take longer to move the dev service to the new architecture, and this change looks OK to me.

@LarsSven LarsSven force-pushed the ls/produce-on-existing-devservice branch from 57c2d6a to a7e5cab on September 10, 2025 07:13
@LarsSven
Contributor Author

Perfect! I cleaned up the commit log

@LarsSven LarsSven force-pushed the ls/produce-on-existing-devservice branch from f2ecadd to a7e5cab on September 10, 2025 07:15

quarkus-bot bot commented Sep 10, 2025

Status for workflow Quarkus CI

This is the status report for running Quarkus CI on commit a7e5cab.

✅ The latest workflow run for the pull request has completed successfully.

It should be safe to merge provided you have a look at the other checks in the summary.

You can consult the Develocity build scans.

@holly-cummins holly-cummins merged commit 20ff061 into quarkusio:main Sep 10, 2025
27 checks passed
@quarkus-bot quarkus-bot bot added this to the 3.29 - main milestone Sep 10, 2025
@holly-cummins
Contributor

holly-cummins commented Sep 10, 2025

Merging. @LarsSven, is the impact of this fix for you big enough to need backporting to the 3.28 stream (we just branched it)? At least we know it will be fixed in 3.29, either by this, or by #49944 if that lands before the release cutoff (which I'm hoping it will). I'd also like to see some tests for this scenario to make sure we don't accidentally regress it as we continue to work on the dev services.

@LarsSven
Contributor Author

LarsSven commented Sep 10, 2025

I'm torn on how important it is. On one hand, it is very important for us that this works; on the other hand, we have very thorough review cycles, and given the complexity of our code that uses this, it's quite likely to take another month before it merges into production anyway. So I think the answer to that question is: we would like it to be in 3.28 if possible, but if it really is a big headache on your end, we can wait a release.

As for tests, I was thinking of this, but to properly test this in integration-tests, you need to spin up multiple Quarkus dev instances, and then test the manifest application on the non-primary instance, which I haven't figured out how to do in the integration-tests module.

@LarsSven LarsSven deleted the ls/produce-on-existing-devservice branch September 10, 2025 10:26
@holly-cummins
Contributor

As for tests, I was thinking of this, but to properly test this in integration-tests, you need to spin up multiple Quarkus dev instances, and then test the manifest application on the non-primary instance, which I haven't figured out how to do in the integration-tests module.

Yeah, it's not easy. I think the pattern in https://github.com/quarkusio/quarkus/pull/48787/files#diff-6f2bb1a3ed479c9e91782edc6b6a2a9b7152dba3bfc3053f9de2ec0f6c936d27 should be generalisable. There's a DevModeTest that drives the dev service, but also a beforeAll() which manually starts a container. Another option would be to use some of the fabric8 maven plugins to start a container in the pom, but that's less controllable and less clear than doing it in the Java code.
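
For illustration, a rough sketch of that test shape: a container is started manually so the dev service can reuse it, while QuarkusDevModeTest drives the application. The container image, the class name and the assertion are assumptions; the configuration needed for the dev service to actually discover the manually started cluster is omitted; and depending on when QuarkusDevModeTest boots the application, the container may need to start even earlier (for example in a static initializer):

```java
import org.jboss.shrinkwrap.api.ShrinkWrap;
import org.jboss.shrinkwrap.api.spec.JavaArchive;
import org.junit.jupiter.api.AfterAll;
import org.junit.jupiter.api.BeforeAll;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.RegisterExtension;
import org.testcontainers.k3s.K3sContainer;
import org.testcontainers.utility.DockerImageName;

import io.quarkus.test.QuarkusDevModeTest;

// Illustrative sketch of the suggested test shape, not a working test.
public class ManifestsOnExistingDevServiceTest {

    // Manually started cluster that pre-exists the dev service.
    static K3sContainer cluster;

    @RegisterExtension
    static final QuarkusDevModeTest devMode = new QuarkusDevModeTest()
            .setArchiveProducer(() -> ShrinkWrap.create(JavaArchive.class));

    @BeforeAll
    static void startCluster() {
        // Image tag is an assumption for this sketch.
        cluster = new K3sContainer(DockerImageName.parse("rancher/k3s:v1.27.4-k3s1"));
        cluster.start();
    }

    @AfterAll
    static void stopCluster() {
        cluster.stop();
    }

    @Test
    void manifestsAreAppliedAgainstTheExistingCluster() {
        // Assert on application logs or on resources present in the cluster; omitted here.
    }
}
```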

@gsmet gsmet modified the milestones: 3.29 - main, 3.28.0 Sep 16, 2025

Development

Successfully merging this pull request may close these issues.

quarkus.kubernetes-client.devservices.manifests does not work well with multiple dev services

4 participants