Split and move ssh logic to experimental/ssh #3544
Conversation
10 failing tests:
Force-pushed b3a35d3 to e86af23
Force-pushed e86af23 to 78e037d
Force-pushed 60fc8a3 to 8be1123
Force-pushed 78e037d to bf3373e
Force-pushed 8be1123 to 7ebfe7e
Force-pushed bf3373e to 2222cb9
Force-pushed 2222cb9 to 94c2ae2
I haven't looked at the unit tests closely, but I worry that they can run for a long time or be flaky. A good practice is to pass a context.Context everywhere and add a short-lived deadline so each test tears down everything it spawns.
err := client.RunClient(ctx, wsClient, opts)
if err != nil && proxy.IsNormalClosure(err) {
	return nil
}
The normal closure check was added here and removed from the server case.
Can they both be omitted (or hoisted into the "run" function)?
Improved in #3569
cmd.Flags().DurationVar(&shutdownDelay, "shutdown-delay", 10*time.Minute, "Delay before shutting down after no pings from clients")
cmd.Flags().StringVar(&secretsScope, "secrets-scope-name", "", "Databricks secrets scope name")
cmd.MarkFlagRequired("secrets-scope-name")
cmd.Flags().StringVar(&publicKeySecretName, "client-key-name", "", "Databricks client key name")
Is this an actual key name or the secret name in the secret scope?
Secret name in the scope - you have to combine it with secrets-scope-name to get the value. The server uses the scope not only for downloading the client key value, though, but also for managing its own key pair.
cmd.MarkFlagRequired("cluster")
cmd.Flags().IntVar(&maxClients, "max-clients", 10, "Maximum number of SSH clients")
cmd.Flags().DurationVar(&shutdownDelay, "shutdown-delay", 10*time.Minute, "Delay before shutting down after no pings from clients")
cmd.Flags().StringVar(&secretsScope, "secrets-scope-name", "", "Databricks secrets scope name")
The product name is "secret scope" (singular).
"server",
f"--cluster={ctx.clusterId}",
f"--secrets-scope-name={secrets_scope}",
f"--client-key-name={public_key_secret_name}",
I recommend being more verbose to make it unambiguous what's being referred to here. Idea: --keypair-secret-scope-name and --keypair-secret-key-name. Or pubkey if the secret scope only stores the public key.
The secret scope has 3 keys - public and private keys of the server, and the public key of the client (to be added to AuthorizedKeysFile). Maybe --keys-secret-scope-name and --authorized-key-secret-name?
Technically we don't even need to pass these options to the server, as it can re-create the names using the same constants and logic the client uses. Since the client enforces the same server version, it will be safe. It doesn't hurt to keep the server a bit more isolated though.
If the goal is to keep these isolated, then all of them can specify the scope name: --server-pubkey-secret-name, --server-privkey-secret-name, --client-pubkey-secret-name, with the format being scope/key, like https://docs.databricks.com/aws/en/security/secrets/secrets-spark-conf-env-var#reference-a-secret-with-a-spark-configuration-property. That removes all implicits.
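A sketch of the `scope/key` parsing this suggestion implies. The helper name and error text are hypothetical, not from the PR:

```go
package main

import (
	"fmt"
	"strings"
)

// parseSecretRef splits a "scope/key" reference so each flag names both the
// secret scope and the key inside it, with no implicit defaults.
func parseSecretRef(ref string) (scope, key string, err error) {
	scope, key, ok := strings.Cut(ref, "/")
	if !ok || scope == "" || key == "" {
		return "", "", fmt.Errorf("secret reference %q must have the form scope/key", ref)
	}
	return scope, key, nil
}

func main() {
	scope, key, err := parseSecretRef("ssh-scope/client-public-key")
	fmt.Println(scope, key, err) // ssh-scope client-public-key <nil>
}
```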
ClientPublicKeyName: defaultClientPublicKeyName,
ServerTimeout: serverTimeout,
}
err := client.RunClient(ctx, wsClient, opts)
Can be shortened to `client.Run`.
Changed it in #3569
PortRange: serverPortRange,
}
return err
return server.RunServer(ctx, client, opts)
Can be shortened to `server.Run`.
The `client` variable here is unfortunate. Common variable names for the workspace client include `w` or `wsc`.
Changed it in #3569
shutdownTimerMu sync.Mutex
connections map[string]*proxy.ProxyConnection
connectionsMu sync.Mutex
TimedOut chan bool
Why exported?
The `internal/server` package listens to this channel (and terminates the process).
Oh wait, it indeed doesn't make sense in this PR, but in the follow-up I've moved `connections` to `internal/proxy`, so the export is needed there.
libs/ssh/ssh-server-bootstrap.py (outdated)
Wasn't this implemented in the previous PR?
It was; git didn't register this as a rename for some reason. The main thing I've changed is removing the databricks_ssh_key env var, since the secrets logic was moved into the server itself.
SSHDConfigPath string
}

func newSSHHandler(ctx context.Context, connections *connectionsManager, sshDConfigPath string) *sshHandler {
Can be a separate PR:
There is mixed casing for ssh/sshd. You can stick to just lowercase and only upcase the first char when it's the second word in an unexported variable, or the first in an exported variable.
In the follow-up the pure http.Handler part of the logic was moved to `internal/proxy`, and the `sshD*` mixed case is fixed there. If you mean that I sometimes use ssh and sometimes SSH - it seems to be in line with Go camel case + acronyms style? And the real reason is because `Ssh` throws me off :-)
Also `sshD`, and `newSSHHandler` also looks weird because of the double H. They can be just `sshd` (referring to the process name), and `newSshHandler` to stick to Go patterns. Or alternatively, move everything into an `ssh` package and call it `handler`, `configPath`, etc.
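One way to reconcile the casing debate is to treat sshd (the process name) as the word and keep the acronym's case uniform within each identifier, which is the usual Go initialism convention. A hypothetical sketch, names illustrative only:

```go
package main

import "fmt"

// sshdHandler treats "sshd" as one word: lowercase when unexported,
// all-caps (SSHD) whenever any part of it would be capitalized.
type sshdHandler struct {
	configPath string // plain field names read fine inside an ssh-flavored type
}

// newSSHDHandler keeps the acronym's case uniform rather than mixing
// styles as in "sshD" or "newSshHandler".
func newSSHDHandler(configPath string) *sshdHandler {
	return &sshdHandler{configPath: configPath}
}

func main() {
	h := newSSHDHandler("/etc/ssh/sshd_config")
	fmt.Println(h.configPath)
}
```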
Move from flat structure to cmd and internal client/server/proxy/keys/workspace sub packages
Force-pushed 94c2ae2 to 8f69376
Moved the tests to use the new …
…3569)

Changes

Based on #3544. Move pure client and server logic to the proxy package and test it. Now `proxyConnection` is completely private, and the `proxy` package exposes only 2 public functions:
- `RunClientProxy` function
- `NewProxyServer` constructor

Meanwhile the internal logic doesn't depend on any ssh stuff and can be unit tested. `internal/client` and `internal/server` packages are now much smaller and contain mainly ssh/jobs/cluster glue. Also adding `bench.sh` until we have proper e2e tests.

Why

Tests

Unit and manual tests
Changes

Previous PRs in the stack:

- Move from flat structure to cmd and internal client/server/proxy/keys/workspace sub packages.
- Moved a bunch of hardcoded values to constants (some hardcoded stuff remains in the `keys` and `workspace` sub packages, will be improved in a follow up).
- Added unit tests for the server's connection manager logic and updated proxy tests.

The coverage isn't great yet; the goal is to avoid unit tests with excessive mocking and focus on integration tests from now on.

Why

`/experimental` already has the databricks-bundles python project, so it seems like a good place to put ssh stuff too.

Tests

Unit and manual