Skip to content

Conversation

@safoinme
Copy link
Contributor

@safoinme safoinme commented Nov 4, 2025

Describe changes

This PR introduces a new Kubernetes Deployer that enables deploying ZenML pipelines as long-running services on any Kubernetes cluster.

Pre-requisites

Please ensure you have done the following:

  • I have read the CONTRIBUTING.md document.
  • I have added tests to cover my changes.
  • I have based my new branch on develop and the open PR is targeting develop. If your branch wasn't based on develop read Contribution guide on rebasing branch to develop.
  • IMPORTANT: I made sure that my changes are reflected properly in the following resources:
    • ZenML Docs
    • Dashboard: Needs to be communicated to the frontend team.
    • Templates: Might need adjustments (that are not reflected in the template tests) in case of non-breaking changes and deprecations.
    • Projects: Depending on the version dependencies, different projects might get affected.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Other (add details above)

@github-actions github-actions bot added internal To filter out internal PRs and issues enhancement New feature or request labels Nov 4, 2025
@github-actions
Copy link
Contributor

github-actions bot commented Nov 4, 2025

Documentation Link Check Results

Absolute links check passed
Relative links check passed
Last checked: 2025-11-14 11:36:54 UTC

sanitized_key = self._sanitize_secret_key(key, secret_key_map)
if sanitized_key != key:
logger.warning(
f"Secret key '{key}' sanitized to '{sanitized_key}'"

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information High

This expression logs
sensitive data (secret)
as clear text.
This expression logs
sensitive data (secret)
as clear text.
This expression logs
sensitive data (secret)
as clear text.
This expression logs
sensitive data (secret)
as clear text.

Copilot Autofix

AI 11 days ago

The best way to fix the problem is to avoid logging the actual value of key (from the secrets dictionary) in clear text. The log message currently exposes the original secret key and the sanitized version of it, which could leak both internal secret naming conventions and potentially sensitive user-supplied names.

To address this, the log message should be reworded so that neither the original nor sanitized secret key are included verbatim. Instead, you can log the fact that a secret key was sanitized, possibly including metadata such as an index, length, or even just a generic message without displaying the keys themselves.

Specific changes:

  • Edit only the log statement on line 517 in src/zenml/integrations/kubernetes/deployers/kubernetes_deployer.py.
  • Remove or obfuscate both key and sanitized_key from the log output, or limit the output to a generic message such as: "A secret key was sanitized.".
  • No import or method changes are needed as the log level, logger, etc. remain unchanged.

Suggested changeset 1
src/zenml/integrations/kubernetes/deployers/kubernetes_deployer.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/src/zenml/integrations/kubernetes/deployers/kubernetes_deployer.py b/src/zenml/integrations/kubernetes/deployers/kubernetes_deployer.py
--- a/src/zenml/integrations/kubernetes/deployers/kubernetes_deployer.py
+++ b/src/zenml/integrations/kubernetes/deployers/kubernetes_deployer.py
@@ -514,7 +514,7 @@
             sanitized_key = self._sanitize_secret_key(key, secret_key_map)
             if sanitized_key != key:
                 logger.warning(
-                    f"Secret key '{key}' sanitized to '{sanitized_key}'"
+                    "A secret key was sanitized to conform to Kubernetes naming requirements."
                 )
             sanitized[sanitized_key] = value
 
EOF
@@ -514,7 +514,7 @@
sanitized_key = self._sanitize_secret_key(key, secret_key_map)
if sanitized_key != key:
logger.warning(
f"Secret key '{key}' sanitized to '{sanitized_key}'"
"A secret key was sanitized to conform to Kubernetes naming requirements."
)
sanitized[sanitized_key] = value

Copilot is powered by AI and may make mistakes. Always verify output.
Unable to commit as this autofix suggestion is now outdated
Copy link
Contributor

@stefannica stefannica left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only took a brief look at this PR, but I can already suggest a few major items:

  1. please move the dry-run feature in a separate PR. 6k LOC is a big PR to have to review and this feature doesn't belong here. In addition to the scope creep, it doesn't seem like you put too much thought into its design.

  2. parts of the code still feel like they've been written by an LLM that doesn't doesn't know how your Pydantic types are defined. getattr and hasattr should never appear in your code. Please review your own code and make sure that it's not vibe-coded before you open it for review.

@stefannica stefannica added the run-slow-ci Tag that is used to trigger the slow-ci label Nov 12, 2025
"""Yield single resources from possibly 'List' wrappers.
Args:
objs: Iterable of resource dicts (already normalized). Supports 'kind: List'.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Supports 'kind: List'. This is not clear at all what it does

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated docstring to explain this.

Kubernetes allows manifests that wrap multiple resources in a single object

kind: List
items:
  - kind: Deployment
    ...
  - kind: Service
    ...

)
return host

def _svc_lb_host(self, service_obj: Any) -> Optional[str]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why Any

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we are using kubernetes client which also get resource type as any and basically it could return:
• a raw dict coming from to_dict() or some other JSON-ish source, or
• a Kubernetes client / dynamic client object (e.g. ResourceInstance) that has a .to_dict() method but doesn’t have a nice static type in the Python client.

raise ValueError(
f"Expected dict after serialization, got {type(d)}"
)
if "api_version" in d and "apiVersion" not in d:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that a bug in the Kubernetes package that they don't convert this one exact key?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

technically kubernetes only accepted apiVersion but the api_version is usually the variable name in python client so this is just extra step to not fail on that naming otherwise this will happen because user provided wrong key and we can also just fail

from typing import Any, Dict, List

import yaml
from jinja2 import Environment, StrictUndefined, TemplateError, Undefined
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My question still stands: Where is jinja getting installed from?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have just added it to k8s integration requirements

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request internal To filter out internal PRs and issues run-slow-ci Tag that is used to trigger the slow-ci

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants