#827 get from db ports firstly #893
Open
BioQwer wants to merge 2 commits into jupyterhub:main from BioQwer:fix_827
+4
−1
This would not usually be correct. The persisted `server.port` is the connect port, whereas `self.port` sets the bind port (should almost never have a value other than the default in kubespawner).

Plus, access to the db from Spawners is deprecated and discouraged, so I don't think we should add this.

Can you share more about what problem this aims to solve? Maybe the answer is somewhere else.
@minrk
Yes, of course.
Context
We run Hadoop and Spark on YARN.
To simplify the deployment of a Spark driver in JupyterLab we set the following
`KubeSpawner` options:

Situation
When we use a static port, only one Lab instance can be started per node,
which is insufficient for our workload.
This issue first appeared when we had 5 nodes serving 15 Hub users.
To work around the limitation we introduced a pre‑spawn hook that selects a
random port in the range 9000‑9299:
The Hub remembers the chosen port and uses it for health‑check requests, so
the solution works well—provided the Hub process stays alive.
If the Hub pod restarts, it loses the mapping of Lab instances to their
assigned ports, and the health checks start failing.
We also tried falling back to the default port `8888` for the health check,
but when no response is received the Hub deletes the Lab that is running on
the random port, which is not the behavior we want.
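
For illustration, a pre-spawn hook of the kind described above could look roughly like the following. This is a minimal sketch, not the configuration used in this deployment: the function name `pick_random_port` is hypothetical, and it assumes the hook is registered through JupyterHub's `pre_spawn_hook` option and that assigning `spawner.port` is enough to move the single-user server to that port.

```python
import random

# Port range mentioned in the comment above (inclusive on both ends).
PORT_MIN = 9000
PORT_MAX = 9299


def pick_random_port(spawner):
    """Hypothetical pre-spawn hook: assign a random port to the spawner.

    Nothing here checks that the port is actually free on the target node.
    """
    spawner.port = random.randint(PORT_MIN, PORT_MAX)


# In jupyterhub_config.py this would be registered roughly as:
# c.KubeSpawner.pre_spawn_hook = pick_random_port
```

Because the port is chosen at spawn time rather than in static config, the Hub cannot recompute it from config after a restart, which is the failure described here.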
Key points
it in the Hub for health checks.
orphaned Lab pods or premature deletions.
Feel free to let me know if any part needs more detail!
Thanks for clarifying that this is about host networking. Do you see `url changed!` in your logs when the hub restarts?

I think you're right that it will do the wrong thing if `.port` is specified from anything other than static config, specifically here it assumes `get_pod_url()` is right (uses `self.port` by default) and the persisted db value in `self.server` is wrong, but in your case, it is the opposite.

That code is really there to deal with cluster networking changes, so maybe we should either remove or reverse the port logic, leaving only the host? Either that or persist/restore `self.port` in load_state/get_state. I doubt persisting self.port is right, though. I'll need to think through some cases to know which is right. Removing the port check is the simplest and usually probably the right thing to do.
Actually, I think I know what would be simplest and most correct: add `self.port` to the pod manifest in the annotation `hub.jupyter.org/port`, then retrieve that in `_get_pod_url` instead of using `self.port` unconditionally. `self.port` can then be a fallback if undefined (e.g. across an upgrade).

Do you want to have a go at tackling that? If not, I can probably do it.
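
A rough sketch of that idea, assuming the pod and its manifest are available as plain dicts shaped like the Kubernetes API object. The annotation key `hub.jupyter.org/port` is the one proposed above; the helper names `annotate_port` and `port_from_pod` are hypothetical and are not existing kubespawner functions.

```python
PORT_ANNOTATION = "hub.jupyter.org/port"


def annotate_port(manifest, port):
    """Record the port in the pod manifest's annotations.

    Kubernetes annotation values must be strings, so the port is stringified.
    """
    metadata = manifest.setdefault("metadata", {})
    annotations = metadata.get("annotations") or {}
    annotations[PORT_ANNOTATION] = str(port)
    metadata["annotations"] = annotations
    return manifest


def port_from_pod(pod, fallback_port):
    """Return the annotated port, or fallback_port if it is missing or invalid."""
    annotations = (pod.get("metadata") or {}).get("annotations") or {}
    value = annotations.get(PORT_ANNOTATION)
    try:
        return int(value)
    except (TypeError, ValueError):
        # No annotation (e.g. a pod created before an upgrade): use the config value.
        return fallback_port
```

With this shape, the value stored on the running pod takes precedence, and the configured port is only used when no annotation is present.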
Yes, it's persisted, but after the hub restarts,
the hub clears the port value.
We can't remove the port.
You need to know the port value for the liveness probe:
http://<k8s_ip>:/api/
Should I fix something so that the PR can be merged?
I debugged this situation.
before: 1234 and 8888, but the IP is OK
after: 1234 and 1234
It is only for getting the previous config of a pod that was started earlier.
I resolved the persisting in the db by changing it here.
Why should I get it from `get_pod_manifest` if the hub reads it from the db?
The problem is in get_pod_url using self.port instead of the actual port when the pod is running. Fixing that will fix the problem. Relying on deprecated db access will eventually break, and is not the right thing to do when the pod is not running. The fix is to persist the port in the pod manifest via the annotation, so get_pod_url gets the right value, and self.port config will still have the right effect rather than being overridden.
Another, smaller fix would be to replace the netloc check with only a hostname check, so that we don't rewrite the port. I'm not sure if there are situations where the port could change, but we know there are where the ip changes.
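
The hostname-only comparison could be sketched like this with the standard library. The function names and example URLs are hypothetical; this is not the existing kubespawner check, only an illustration of comparing `hostname` instead of `netloc` so that a port difference alone never triggers a rewrite.

```python
from urllib.parse import urlparse


def host_changed(stored_url, pod_url):
    """True if the two URLs differ in hostname; port differences are ignored."""
    return urlparse(stored_url).hostname != urlparse(pod_url).hostname


def with_new_host(stored_url, pod_url):
    """Return stored_url with the hostname from pod_url, keeping the stored port.

    IPv6 literals are not handled in this sketch.
    """
    stored = urlparse(stored_url)
    new_host = urlparse(pod_url).hostname
    netloc = new_host if stored.port is None else f"{new_host}:{stored.port}"
    return stored._replace(netloc=netloc).geturl()
```

For example, `with_new_host("http://10.0.0.5:9123", "http://10.0.0.7:8888")` keeps the stored port 9123 and only swaps in the new IP.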
Are you sure that it will fix it?
What will we do if it doesn't?
I believe it will
Keep working to fix it
@minrk I realize that I don't have time to refactor this.
It has been working in production for 2 months.
Many people are not investing time in JH because of this problem.