-
Notifications
You must be signed in to change notification settings - Fork 553
Add Kubernetes Deployer for Pipeline Deployments #4127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
Documentation Link Check Results✅ Absolute links check passed |
| sanitized_key = self._sanitize_secret_key(key, secret_key_map) | ||
| if sanitized_key != key: | ||
| logger.warning( | ||
| f"Secret key '{key}' sanitized to '{sanitized_key}'" |
Check failure
Code scanning / CodeQL
Clear-text logging of sensitive information High
sensitive data (secret)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (secret)
This expression logs
sensitive data (secret)
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 11 days ago
The best way to fix the problem is to avoid logging the actual value of key (from the secrets dictionary) in clear text. The log message currently exposes the original secret key and the sanitized version of it, which could leak both internal secret naming conventions and potentially sensitive user-supplied names.
To address this, the log message should be reworded so that neither the original nor sanitized secret key are included verbatim. Instead, you can log the fact that a secret key was sanitized, possibly including metadata such as an index, length, or even just a generic message without displaying the keys themselves.
Specific changes:
- Edit only the log statement on line 517 in
src/zenml/integrations/kubernetes/deployers/kubernetes_deployer.py. - Remove or obfuscate both
keyandsanitized_keyfrom the log output, or limit the output to a generic message such as: "A secret key was sanitized.". - No import or method changes are needed as the log level, logger, etc. remain unchanged.
-
Copy modified line R517
| @@ -514,7 +514,7 @@ | ||
| sanitized_key = self._sanitize_secret_key(key, secret_key_map) | ||
| if sanitized_key != key: | ||
| logger.warning( | ||
| f"Secret key '{key}' sanitized to '{sanitized_key}'" | ||
| "A secret key was sanitized to conform to Kubernetes naming requirements." | ||
| ) | ||
| sanitized[sanitized_key] = value | ||
|
|
stefannica
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only took a brief look at this PR, but I can already suggest a few major items:
-
please move the dry-run feature in a separate PR. 6k LOC is a big PR to have to review and this feature doesn't belong here. In addition to the scope creep, it doesn't seem like you put too much thought into its design.
-
parts of the code still feel like they've been written by an LLM that doesn't doesn't know how your Pydantic types are defined.
getattrandhasattrshould never appear in your code. Please review your own code and make sure that it's not vibe-coded before you open it for review.
| """Yield single resources from possibly 'List' wrappers. | ||
| Args: | ||
| objs: Iterable of resource dicts (already normalized). Supports 'kind: List'. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Supports 'kind: List'. This is not clear at all what it does
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated docstring to explain this.
Kubernetes allows manifests that wrap multiple resources in a single object
kind: List
items:
- kind: Deployment
...
- kind: Service
...
| ) | ||
| return host | ||
|
|
||
| def _svc_lb_host(self, service_obj: Any) -> Optional[str]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why Any
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we are using kubernetes client which also get resource type as any and basically it could return:
• a raw dict coming from to_dict() or some other JSON-ish source, or
• a Kubernetes client / dynamic client object (e.g. ResourceInstance) that has a .to_dict() method but doesn’t have a nice static type in the Python client.
| raise ValueError( | ||
| f"Expected dict after serialization, got {type(d)}" | ||
| ) | ||
| if "api_version" in d and "apiVersion" not in d: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that a bug in the Kubernetes package that they don't convert this one exact key?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
technically kubernetes only accepted apiVersion but the api_version is usually the variable name in python client so this is just extra step to not fail on that naming otherwise this will happen because user provided wrong key and we can also just fail
| from typing import Any, Dict, List | ||
|
|
||
| import yaml | ||
| from jinja2 import Environment, StrictUndefined, TemplateError, Undefined |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My question still stands: Where is jinja getting installed from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have just added it to k8s integration requirements
Describe changes
This PR introduces a new Kubernetes Deployer that enables deploying ZenML pipelines as long-running services on any Kubernetes cluster.
Pre-requisites
Please ensure you have done the following:
developand the open PR is targetingdevelop. If your branch wasn't based on develop read Contribution guide on rebasing branch to develop.Types of changes