Still produce KubernetesDevServiceInfoBuildItem when a dev service already exists #49415
Conversation
The last commit, fc79114, is the change that is actually part of this PR. The other commits are part of the previous PR.
```java
KubeConfigUtils.parseKubeConfig(KubeConfigUtils.replaceServerInKubeconfig(containerAddress.getUrl(),
        getFileContentFromContainer(KIND_KUBECONFIG))));
return KubeConfigUtils
        .parseKubeConfig(KubeConfigUtils.replaceServerInKubeconfig("https://" + containerAddress.getUrl(),
```
Applying manifests fails without "https://", which is also present in the container creation process. Basically, this makes the kubeconfig exactly equal to the one that is made when creating a new Kubernetes dev service container from scratch. Not sure why there was a difference between the two to begin with.
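For illustration only, a minimal sketch of what the server rewrite amounts to. The helper name and the regex are assumptions made so the example is self-contained; the real code goes through KubeConfigUtils.

```java
// Illustrative stand-in for the kubeconfig server rewrite discussed above,
// not the actual KubeConfigUtils implementation.
public class KubeconfigServerRewriteSketch {

    // Rewrites the `server:` entry so it points at the mapped container address,
    // prefixed with the https:// scheme -- the detail this change adds.
    static String replaceServer(String kubeconfig, String containerAddress) {
        return kubeconfig.replaceAll("(?m)^(\\s*server:\\s*).*$", "$1https://" + containerAddress);
    }

    public static void main(String[] args) {
        String kubeconfig = "clusters:\n- cluster:\n    server: https://127.0.0.1:6443\n";
        // Without the https:// prefix the client ends up with a scheme-less server URL,
        // and applying manifests against the resulting config fails.
        System.out.println(replaceServer(kubeconfig, "localhost:32771"));
    }
}
```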
Status for workflow
🎊 PR Preview 6bb2e94 has been successfully built and deployed to https://quarkus-pr-main-49415-preview.surge.sh/version/main/guides/
I think we should merge #48990 first. Also, there are upcoming changes in how dev services are handled, and this PR touches on things that we should probably change in the short term; see in particular quarkiverse/quarkus-neo4j#382 (comment). More specifically, I'm not sure that producing the item if the processor didn't start the dev service is a good idea.
I absolutely agree with merging #48990 first. Is there anything still blocking that? Like a second review or changes on my side?
As for producing the item if the processor didn't start the dev service: if this is intentional, I can maybe move the retrieval of a running dev service to the manifest application logic. That way the build step no longer relies on the setup producing a KubernetesDevServiceInfoBuildItem. However, it seemed logical to me that KubernetesDevServiceInfoBuildItem should be available if the dev service is available. What is the reasoning for why the Kubernetes dev service info should not be available when it connects to an already running dev service? But yeah, if you don't want to change the way the setup works, I can probably fix it in the manifest application only. I just thought that not producing the build item in the already-running code path was a bug in the Kubernetes extension, not an intentional decision.
I've synced the fork, so the previously merged PR has now been incorporated. @metacosm, how would you like to proceed? Should I make the setup produce the build item for existing dev services, or fix it in a way that is contained within the manifest application code only?
I admit that I'm not sure what's the best way to proceed. Refactoring to use the new architecture would probably solve the issue cleanly, but the upgrade path doesn't look super straightforward and might take some time to implement. On the other hand, like I said, I'm not sure that producing the build item is correct if the processor is reusing an existing instance of the dev service. What do you think, @holly-cummins?
I think moving to the new model would satisfy the requirement here (it's got a cleaner 'service reuse' flow), but it's a heavy lift, and I know @LarsSven has already done a lot of work, so I'm reluctant to suggest we throw that all away. :(
I can also rewrite the manifest application such that it does not depend on the build item anymore (by retrieving the kubeconfig itself). That would be a minimal solution to fix the bug without affecting anything else.
The amount of work for this PR is not that much; I don't mind rewriting it as something different, as long as the bug this PR tries to solve gets fixed.
Also, @metacosm, could you explain to me the intention/description of KubernetesDevServiceInfoBuildItem? My assumption was that this item just gives you all the necessary information about the Kubernetes dev service that is being used by the application in dev/test, but if it makes sense for it to be null when a dev service is already running and to only be set if the application creates the dev service, then there is something more going on that I don't understand.
I didn't work on that part, so I'm almost as clueless as you on this. It just seems, based on the changes to the architecture, that there were some issues with the semantics of that build item and the new architecture is supposed to clear that up.
So shall I make an alternative implementation that doesn't make any changes to the build item and instead decouples the manifest application from the build item altogether?
I would wait before trying that: I think we should be able to use the build item, because the "build info is null when the service is already running" problem will be fixed when I convert the service to the new model. I'm hoping to start tomorrow. But one thing that would be really helpful to me is some tests, so that I know I'm meeting the new behaviour requirements (or at least not going in totally the wrong direction and making things harder).
Okay, thank you. I did start trying this before I saw the message, and I can confidently say now that trying to get the container information from the manifest application code gets really ugly. It is definitely possible, but when this information is not provided by a previous build step, this build step suddenly has to deal with things like figuring out whether it's in a Compose environment, whether there is a shared network, etc., so the code for it really doesn't look pretty. I would definitely prefer a solution where the previous build step just provides the necessary details even if it connects to an existing service. I can provide you with some manual steps for testing; I wouldn't know how to set up an integration test that spawns multiple Quarkus applications within the Quarkus integration test framework, though. The project we work on that uses this feature is open source, so if you'd like, I can give you instructions on how to reproduce the issue this PR is trying to fix on our own project; that way you can test against it when rewriting this to see if the issue is fixed.
I've branched off our feature branch that started breaking, so that any changes we make to our own code in the next few weeks won't affect your testing. You can find it here: https://gitlab.com/rug-digitallab/products/themis/themis-core/-/tree/ls/quarkus-manifests-reproducer (branch ls/quarkus-manifests-reproducer). This branch is set to use 999-SNAPSHOT of Quarkus, so you need to build Quarkus to your local Maven repository to make this work (which I assume is what you would want for this test). You can reproduce the issue as follows:
That second step will hit this log message during startup: Line 188 in f530bd5
It hits this message because
If you do manage to fix it, you may still encounter manifest application errors, because the runtime-orchestrator tries to apply manifests that were already applied by the runtime-reporter. That is perfectly expected and not a problem, as long as it actually tries to apply the manifests when starting up the second application.
@holly-cummins are these tests enough for you or would you like any other reproducer/information? |
@holly-cummins @metacosm this bug is currently blocking the development of our application quite significantly, so I would really love to have this bug fixed in the current dev service architecture if we're not yet ready to migrate to the new dev service architecture. Is it possible to merge the current PR or a modified version of this so we can get this bug fixed in the short term? Thanks!
I'd also be fine with making more changes to this PR if you'd like it implemented differently than it currently is. What do you think?
@LarsSven sorry, I will loop back on this tomorrow. I've been busy with other stuff.
Absolutely no worries, hope I'm not too much of a bother.
@metacosm a gentle reminder about this.
I haven't forgotten this issue. Hopefully, I'll have something for you to test by the beginning of next week, @LarsSven.
Thank you, I really appreciate it.
Looks like it's going to take longer to move the dev service to the new architecture and this change looks OK to me.
Force-pushed from 57c2d6a to a7e5cab
Perfect! I cleaned up the commit log.
Force-pushed from f2ecadd to a7e5cab
Status for workflow
Merging. @LarsSven, is the impact of this fix big enough for you to need backporting to the 3.28 stream (we just branched it)? At least we know it will be fixed in 3.29, either by this PR or by #49944 if that lands before the release cutoff (which I'm hoping it will). I'd also like to see some tests for this scenario to make sure we don't accidentally regress it as we continue to work on the dev services.
I'm torn on how important it is. On the one hand, yes, it is very important for us that this works; on the other hand, we have very thorough review cycles, so given the complexity of our code that uses this, it's quite likely to take another month before it merges into production anyway. So I think the answer to that question is: we would like it to be in 3.28 if possible, but if it really is a big headache on your end, we can wait a release. As for tests, I was thinking of this, but to properly test this in integration-tests, you need to spin up multiple Quarkus dev instances and then test the manifest application on the non-primary instance, which I haven't figured out how to do in the integration-tests module.
Yeah, it's not easy. I think the pattern in https://github.com/quarkusio/quarkus/pull/48787/files#diff-6f2bb1a3ed479c9e91782edc6b6a2a9b7152dba3bfc3053f9de2ec0f6c936d27 should be generalisable. There's a
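A purely illustrative sketch of the multi-instance setup being discussed (not the pattern from #48787, which is cut off above): drive two dev-mode instances from plain Java and grep their logs. The module directories, the readiness marker, and the "Applied manifest" log line are all assumptions.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.InputStreamReader;
import java.time.Duration;
import java.time.Instant;

// Rough sketch: start one dev-mode app that creates the Kubernetes dev service, then a second
// one that should reuse it and still apply its manifests. Error handling and continuous
// draining of the first instance's output are omitted for brevity.
public class TwoDevModeInstancesSketch {

    public static void main(String[] args) throws Exception {
        Process first = startDevMode("app-that-starts-the-dev-service");
        waitForLogLine(first, "Listening on", Duration.ofMinutes(5));

        Process second = startDevMode("app-that-reuses-the-dev-service");
        // The behaviour under test: the second instance still applies its manifests even
        // though it only connected to an already running dev service.
        boolean applied = waitForLogLine(second, "Applied manifest", Duration.ofMinutes(5));
        System.out.println(applied ? "manifests applied on reused dev service" : "BUG: manifests not applied");

        second.destroy();
        first.destroy();
    }

    private static Process startDevMode(String moduleDir) throws Exception {
        return new ProcessBuilder("mvn", "quarkus:dev")
                .directory(new File(moduleDir))
                .redirectErrorStream(true)
                .start();
    }

    private static boolean waitForLogLine(Process process, String marker, Duration timeout) throws Exception {
        Instant deadline = Instant.now().plus(timeout);
        BufferedReader reader = new BufferedReader(new InputStreamReader(process.getInputStream()));
        String line;
        while (Instant.now().isBefore(deadline) && (line = reader.readLine()) != null) {
            if (line.contains(marker)) {
                return true;
            }
        }
        return false;
    }
}
```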
quarkus.kubernetes-client.devservices.manifests does not work well with multiple dev services #49381

It builds on top of #48990, which has not been merged yet, so that PR must be merged first before this becomes reviewable. This PR currently contains the code for both PRs.
Applying manifests onto the Kubernetes dev service does not work if the application connects to an existing dev service. I originally thought this was an issue in the manifests feature, but it turns out to be an issue in the Kubernetes dev service setup. Basically, the producer was never called in the code path where a dev service already exists, so KubernetesDevServiceInfoBuildItem would be null in dependent BuildSteps.

This is not code I originally contributed, so I did my best to use the features I could easily find. Definitely let me know if some of this code can be cleaned up.
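For readers skimming the thread, a rough sketch of the shape of the fix, using simplified stand-in types. The real KubernetesDevServiceInfoBuildItem, the processor, and the container discovery logic live in the Quarkus Kubernetes extension and look different; the names below are illustrative only.

```java
import java.util.Optional;

// Simplified stand-in for the real build item produced by the dev service processor.
record DevServiceInfo(String containerId, String kubeconfig) { }

class KubernetesDevServiceSetupSketch {

    // The shape of the fix: the info item is produced whether we start a new container or
    // reuse one that is already running, so downstream build steps (such as the manifest
    // application step) never see a missing item.
    DevServiceInfo setupKubernetesDevService() {
        Optional<RunningDevService> existing = discoverRunningDevService();
        if (existing.isPresent()) {
            RunningDevService running = existing.get();
            // Previously this branch returned without producing the item at all.
            return new DevServiceInfo(running.containerId(), running.kubeconfig());
        }
        RunningDevService started = startNewContainer();
        return new DevServiceInfo(started.containerId(), started.kubeconfig());
    }

    // Placeholders for the discovery / startup logic (shared networks, Compose, etc.).
    record RunningDevService(String containerId, String kubeconfig) { }

    Optional<RunningDevService> discoverRunningDevService() {
        return Optional.empty();
    }

    RunningDevService startNewContainer() {
        return new RunningDevService("container-id", "kubeconfig-yaml");
    }
}
```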
I also made a small change in the manifests application code I contributed earlier to make sure it wouldn't log "Applied manifest" if a failure occurred. It now only prints this if no failure has occurred.
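The logging change amounts to something like the following sketch, with assumed names rather than the actual extension code.

```java
import java.util.List;
import java.util.logging.Logger;

// Sketch of the logging guard: only report "Applied manifest" when applying it actually
// succeeded. The apply() call and class names are placeholders.
class ManifestApplierSketch {

    private static final Logger LOG = Logger.getLogger(ManifestApplierSketch.class.getName());

    void applyManifests(List<String> manifests) {
        for (String manifest : manifests) {
            boolean failed = false;
            try {
                apply(manifest); // stand-in for the real Kubernetes client call
            } catch (RuntimeException e) {
                failed = true;
                LOG.severe("Failed to apply manifest " + manifest + ": " + e.getMessage());
            }
            if (!failed) {
                // Previously this was logged unconditionally, even after a failure.
                LOG.info("Applied manifest " + manifest);
            }
        }
    }

    void apply(String manifest) {
        // placeholder for the actual manifest application
    }
}
```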