Conversation

sanchezl
Contributor

@sanchezl sanchezl commented Feb 11, 2025

In some instances, the controller would re-queue an image pull secret in such a way that the check for an expiring token would only occur after the token had expired. This would result in a gap of time in which the pull secret contents would be invalid.

@sanchezl sanchezl changed the title from "correct calculated refresh time" to "OCPBUGS-50507: Intermittent authentication issues when accessing OpenShift registry" Feb 11, 2025
@openshift-ci-robot openshift-ci-robot added labels Feb 11, 2025:
  • jira/severity-important: Referenced Jira bug's severity is important for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • jira/invalid-bug: Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.
@openshift-ci-robot
Contributor

@sanchezl: This pull request references Jira Issue OCPBUGS-50507, which is invalid:

  • expected the bug to target the "4.19.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 11, 2025
@sanchezl
Contributor Author

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Feb 11, 2025
@openshift-ci-robot
Contributor

@sanchezl: This pull request references Jira Issue OCPBUGS-50507, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.19.0) matches configured target version for branch (4.19.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

No GitHub users were found matching the public email listed for the QA contact in Jira ([email protected]), skipping review request.

In response to this:

/jira refresh

Contributor

@ricardomaraschini ricardomaraschini left a comment

This seems to move us to a better place! Thanks for that. I would like to see some unit tests for registryAuthenticationFileValid() and refreshThresholdTime(), if possible.

func refreshThresholdTime(nbf, exp time.Time) time.Time {
	// calculate the time at which only 40% of the valid duration would be left
	validDuration := exp.Sub(nbf)
	return exp.Add(-time.Duration(int64(float64(validDuration) * 0.4)))
Contributor

@ricardomaraschini ricardomaraschini Feb 12, 2025

Are we guaranteed to always get exp > nbf here?

Contributor Author

Getting exp <= nbf would be very unlikely, since it would mean k8s had issued an invalid token. I added a check anyway.

}

- func (c *imagePullSecretController) registryAuthenticationFileValid(imagePullSecret *corev1.Secret, urls, kids []string) (bool, time.Time) {
+ func (c *imagePullSecretController) registryAuthenticationFileValid(imagePullSecret *corev1.Secret, urls, kids []string, now time.Time) (bool, *time.Time) {
Contributor

Wouldn't it be a good idea to move parts of this function to isSecretRefreshNeeded() instead? This function does some validation (whether the secret is of the right Type, for instance) while also checking whether the tokens in a valid secret have expired. I see these as two distinct things: in one case the secret is invalid, while in the other the tokens are invalid. It is up to you, but I believe that splitting them would make things easier to reason about.

As things stand, the only purpose of isSecretRefreshNeeded() is to invert the value of valid.

Member

Can we have some unit tests to test the logic as well?

Contributor

openshift-ci bot commented Mar 26, 2025

@sanchezl: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/security | 43e8fd9 | link | false | /test security |
| ci/prow/okd-scos-e2e-aws-ovn | 43e8fd9 | link | false | /test okd-scos-e2e-aws-ovn |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Member

@vrutkovs vrutkovs left a comment

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 26, 2025
Contributor

openshift-ci bot commented Mar 26, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sanchezl, vrutkovs

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sanchezl
Contributor Author

/label acknowledge-critical-fixes-only

@openshift-ci openshift-ci bot added the acknowledge-critical-fixes-only Indicates if the issuer of the label is OK with the policy. label Mar 26, 2025
@openshift-merge-bot openshift-merge-bot bot merged commit 49eb6f8 into openshift:master Mar 26, 2025
11 of 13 checks passed
@openshift-ci-robot
Contributor

@sanchezl: Jira Issue OCPBUGS-50507: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-50507 has been moved to the MODIFIED state.

In response to this:

In some instances, the controller would re-queue an image pull secret in such a way that the check for an expiring token would only occur after the token had expired. This would result in a gap of time in which the pull secret contents would be invalid.

@openshift-bot
Contributor

[ART PR BUILD NOTIFIER]

Distgit: ose-openshift-controller-manager
This PR has been included in build ose-openshift-controller-manager-container-v4.20.0-202503262140.p0.g49eb6f8.assembly.stream.el9.
All builds following this will include this PR.

@sanchezl
Contributor Author

/cherry-pick release-4.18

@openshift-cherrypick-robot

@sanchezl: new pull request created: #369

In response to this:

/cherry-pick release-4.18

Labels

  • acknowledge-critical-fixes-only: Indicates if the issuer of the label is OK with the policy.
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • jira/severity-important: Referenced Jira bug's severity is important for the branch this PR is targeting.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
