RHOAIENG-9935: feat(nbcs): get ose-oauth-proxy image with PullIfNeeded policy and use digest to specify the image #374

@jiridanek jiridanek commented Aug 1, 2024

https://issues.redhat.com/browse/RHOAIENG-9935

Revival of @shalberd's PR, now targeted against v1.7-branch.

I used https://github.com/opendatahub-io/kubeflow/pull/119.patch to get a patch that I applied against v1.7-branch, to avoid having to rebase the original PR branch.

I also incorporated a change to the manifests, to be in sync with https://github.com/red-hat-data-services/kubeflow/blob/master/components/odh-notebook-controller/config/manager/manager.yaml#L28 and with the operator annotations at https://github.com/red-hat-data-services/RHOAI-Build-Config/blob/main/catalog/v4.13/rhods-operator/catalog.yaml#L340

  • TODO: create a follow-up issue to change the pull policy for the controller image itself. This looks safe, because we always reference it either by digest or by a tag that includes the commit hash. The GitHub workflow (the sed) needs to change along with it.

Motivation

Currently, the ose-oauth-proxy sidecar container of a running notebook, handled by the odh notebook controller, always pulls its image from an external location. The current ose-oauth-proxy image is v4.10.
Always pulling the image is a stability risk: should connectivity to the external registry ever cease, no image can be pulled when a notebook pod starts.

Also, providing the image in tag format is not compatible with disconnected cluster installs that use mirroring and soft-linking from source to target. Those on-prem installs need image references to be in sha256 digest format.

Another reason for changing to sha256 digest format: the tag ose-oauth-proxy:v4.10 is a moving tag whose content changes over time. If imagePullPolicy is set to IfNotPresent, all that matters is whether some image with that tag is already cached on a cluster node, so a stale build can keep being used. To pin a unique image version and build of ose-oauth-proxy, we therefore need the sha256 digest format as well.
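To make the tag-versus-digest distinction concrete, here is a minimal Go sketch that checks whether an image reference is pinned by a sha256 digest. The helper name `isPinnedByDigest` and the regular expression are illustrative assumptions, not code from this PR or from the notebook controller.

```go
package main

import (
	"fmt"
	"regexp"
)

// sha256Ref matches references that end in "@sha256:" followed by 64 hex digits.
// This is a simplified check for illustration, not a full image reference parser.
var sha256Ref = regexp.MustCompile(`@sha256:[a-f0-9]{64}$`)

// isPinnedByDigest (hypothetical helper) reports whether the image reference
// identifies exactly one image build, rather than a tag that may move.
func isPinnedByDigest(image string) bool {
	return sha256Ref.MatchString(image)
}

func main() {
	refs := []string{
		"registry.redhat.io/openshift4/ose-oauth-proxy:v4.10",
		"registry.redhat.io/openshift4/ose-oauth-proxy@sha256:4bef31eb993feb6f1096b51b4876c65a6fb1f4401fee97fa4f4542b6b7c9bc46",
	}
	for _, ref := range refs {
		fmt.Printf("pinned=%t  %s\n", isPinnedByDigest(ref), ref)
	}
}
```

A check along these lines could run in a unit test or CI step to catch a tag-based reference slipping back into the manifests.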

Description

Changed the image reference from tag format to digest format in the odh notebook controller webhook that adds an ose-oauth-proxy sidecar to a Notebook CR.

Manifest list digest of v4.10 from July 6 2023:

ab112105ac37352a2a4916a39d6736f5db6ab4c29bad4467de8d613e80e9bb33

https://catalog.redhat.com/software/containers/openshift4/ose-oauth-proxy/5cdb2133bed8bd5717d5ae64?architecture=amd64&tag=v4.10.0-202306170106.p0.g799d414.assembly.stream&push_date=1688610772000&container-tabs=gti

Added comments noting the actual tag used and where to look for details. Changed imagePullPolicy from Always to IfNotPresent.

Related PR in odh-manifests:

opendatahub-io/odh-manifests#868

How Has This Been Tested?

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

@openshift-ci openshift-ci bot requested review from atheo89 and jstourac August 1, 2024 10:42
@jiridanek jiridanek force-pushed the injected_oauth_imagepullpolicy_change_and_sha256_digest branch from d0284b3 to 940b362 Compare August 1, 2024 11:11
@jiridanek

/cherrypick stable

@openshift-cherrypick-robot

@jiridanek: once the present PR merges, I will cherry-pick it on top of stable in a new PR and assign it to you.

In response to this:

/cherrypick stable

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

// OAuthProxyImage uses sha256 manifest list digest value of v4.8 image for AMD64 as default to be compatible with imagePullPolicy: IfNotPresent, overridable
// taken from https://catalog.redhat.com/software/containers/openshift4/ose-oauth-proxy/5cdb2133bed8bd5717d5ae64?image=6306f12280cc9b3291272668&architecture=amd64&container-tabs=overview
// and kept in sync with the manifests here and in ClusterServiceVersion metadata of opendatahub operator
OAuthProxyImage = "registry.redhat.io/openshift4/ose-oauth-proxy@sha256:4bef31eb993feb6f1096b51b4876c65a6fb1f4401fee97fa4f4542b6b7c9bc46"
Member Author

I'm unsure whether to leave it as :latest here and only set hash through the manifest. It seems sensible to me to leave :latest in the code.

@shalberd shalberd Aug 2, 2024

Agreed. Basically, having the code here point to a floating tag like latest is best for a development environment.

Since the value is overridden in the manager.yaml manifest by the custom argument containing the hash, we can leave it as "latest" here.

That is what I did, too. I did not change the notebook controller code; I only put the latest digest based on tag v4.14 in my custom manifests / manager.yaml, and I can confirm it gets injected as specified in manager.yaml / overridden in the workbenches.
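The override described in this comment can be sketched as a command-line flag whose default is the compiled-in constant. This is only an illustration: the flag name `oauth-proxy-image`, the helper `parseProxyImage`, and the override value are assumptions, and the real controller's argument handling may differ.

```go
package main

import (
	"flag"
	"fmt"
)

// defaultOAuthProxyImage stands in for the OAuthProxyImage constant under review.
const defaultOAuthProxyImage = "registry.redhat.io/openshift4/ose-oauth-proxy@sha256:4bef31eb993feb6f1096b51b4876c65a6fb1f4401fee97fa4f4542b6b7c9bc46"

// parseProxyImage (hypothetical helper) resolves the sidecar image from the
// manager's arguments and falls back to the compiled-in default.
func parseProxyImage(args []string) (string, error) {
	fs := flag.NewFlagSet("manager", flag.ContinueOnError)
	// The flag name is an assumption made for this sketch.
	image := fs.String("oauth-proxy-image", defaultOAuthProxyImage, "image of the OAuth proxy sidecar")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *image, nil
}

func main() {
	// No argument: the value compiled into the controller wins.
	img, err := parseProxyImage(nil)
	fmt.Println(img, err)

	// Argument set, as a manifest would do under the container's args.
	img, err = parseProxyImage([]string{"--oauth-proxy-image=example.test/proxy:dev"})
	fmt.Println(img, err)
}
```

With this shape, the value in code only matters when the manifest does not pass the argument, which is why the thread treats the default in code and the value in manager.yaml as separate decisions.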

Member Author

I've changed my mind. I feel more comfortable with having the digest here, and also hardcoding the PullIfNotPresent. There's not much of real value in testing with :latest, anyways. Seems to me that using :latest oauth proxy image is almost always an error.

(It does make some sense to have the notebook controller image in :latest, but that's a different question)

For now, I'll keep the PR without making the changes I considered here.

I agree with @jiridanek on this, i.e. having the digest here as well, and also hardcoding the PullIfNotPresent

Member Author

We'll be going with 4.14 version of the image, #386

yup, nice

@@ -75,7 +75,7 @@ func InjectOAuthProxy(notebook *nbv1.Notebook, oauth OAuthConfig) error {
 	proxyContainer := corev1.Container{
 		Name: "oauth-proxy",
 		Image: oauth.ProxyImage,
-		ImagePullPolicy: corev1.PullAlways,
+		ImagePullPolicy: corev1.PullIfNotPresent,
Member Author

If we did not set this, the default would be Always when the image tag is :latest and IfNotPresent otherwise, which seems very sensible to me. So maybe it would be best to leave ImagePullPolicy unset?

https://kubernetes.io/docs/concepts/containers/images#updating-images

Interesting, I did not know this. Makes sense, though. Yeah, we can leave it out in that case.
I have not tried this behavior on an OpenShift cluster yet, though. If true, it makes the whole thing a lot easier, especially with regard to "latest".

Default image pull policy
When you (or a controller) submit a new Pod to the API server, your cluster sets the imagePullPolicy field when specific conditions are met:

  • if you omit the imagePullPolicy field, and you specify the digest for the container image, the imagePullPolicy is automatically set to IfNotPresent.
  • if you omit the imagePullPolicy field, and the tag for the container image is :latest, imagePullPolicy is automatically set to Always;
  • if you omit the imagePullPolicy field, and you don't specify the tag for the container image, imagePullPolicy is automatically set to Always;
  • if you omit the imagePullPolicy field, and you specify the tag for the container image that isn't :latest, the imagePullPolicy is automatically set to IfNotPresent.
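The four rules quoted above can be written as a small Go function. This is a sketch of the documented behavior only, not the Kubernetes implementation; the name `defaultPullPolicy` and the simplified string parsing are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultPullPolicy (hypothetical helper) returns the policy the documentation
// says is applied when imagePullPolicy is omitted from a container spec.
func defaultPullPolicy(image string) string {
	// A digest reference (name@sha256:...) defaults to IfNotPresent.
	if strings.Contains(image, "@") {
		return "IfNotPresent"
	}
	// Look for a tag only in the last path segment, so that a registry
	// port such as localhost:5000/app is not mistaken for a tag.
	lastSegment := image[strings.LastIndex(image, "/")+1:]
	idx := strings.LastIndex(lastSegment, ":")
	if idx < 0 {
		return "Always" // no tag specified
	}
	if lastSegment[idx+1:] == "latest" {
		return "Always"
	}
	return "IfNotPresent"
}

func main() {
	images := []string{
		"registry.redhat.io/openshift4/ose-oauth-proxy:latest",
		"registry.redhat.io/openshift4/ose-oauth-proxy",
		"registry.redhat.io/openshift4/ose-oauth-proxy:v4.10",
		"registry.redhat.io/openshift4/ose-oauth-proxy@sha256:4bef31eb993feb6f1096b51b4876c65a6fb1f4401fee97fa4f4542b6b7c9bc46",
	}
	for _, image := range images {
		fmt.Printf("%s -> %s\n", defaultPullPolicy(image), image)
	}
}
```

Under these rules, a digest reference already gets IfNotPresent without any explicit setting, which is the basis for the suggestion to leave ImagePullPolicy unset.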

Member

I don't mind keeping it here so that it is explicitly visible. However, if we keep it and for any reason go back to the :latest tag in the future (hopefully not), this setting will override the default ImagePullPolicy, so the image will be downloaded only the first time. So maybe removing this line completely truly makes sense?

Member Author

Let's discuss this further in https://issues.redhat.com/browse/RHOAIENG-10058.

@jiridanek

The images are available in

  • quay.io/opendatahub/odh-notebook-controller:pr-374
  • quay.io/opendatahub/kubeflow-notebook-controller:pr-374

This is recorded in the openshift-ci logs that are available under the "Artifacts" link, for example at https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/opendatahub-io_kubeflow/374/pull-ci-opendatahub-io-kubeflow-v1.7-branch-odh-notebook-controller-pr-image-mirror/1818967802866831360/artifacts/odh-notebook-controller-pr-image-mirror/opendatahub-io-ci-image-mirror/artifacts/mirror.log

@jiridanek

/unhold
I played with the :pr-374 images enough that I feel this is ready for review and comments

@openshift-ci openshift-ci bot removed the do-not-merge/hold Do not merge this PR label Aug 5, 2024
@jstourac

jstourac commented Aug 7, 2024

Thank you for these changes. I like them in general, though I feel like there are multiple different things being handled in one commit here:

  1. sync with the ose-oauth-proxy images in downstream -> effectively means downgrade in all cases here
  2. change in the pulling policy

Would be nice to have it done separately, but no big issue.

Regarding point 1: since we are touching this anyway, I wonder whether we shouldn't update the sha to the final ose-oauth-proxy version that is agreed on, so we don't need to change this again in the near future. WDYT?

@jiridanek

jiridanek commented Aug 7, 2024

Would be nice to have it done separately, but no big issue.

Guess I can still do that. @harshad16 wanted to do the pull policy change separately, so I'll let him do it. And having hash with Always policy temporarily until both PRs come in is weird but it does not break anything.

shouldn't update the sha to the final ose-oauth-proxy version that is agreed on

It's not agreed on. We'd also need to pick a version that handles FIPS, and so on. It's not a quick and low-risk job, whereas this PR pretty much is all that. And I already have a separate ticket for the update. More tickets, more velocity. Corporate FTW

…d policy and use digest to specify the image
@jiridanek jiridanek force-pushed the injected_oauth_imagepullpolicy_change_and_sha256_digest branch from 940b362 to 8223fbb Compare August 7, 2024 10:19
@harshad16

Guess I can still do that. harshad16 wanted to do the pull policy change separately, so I'll let him do it. And having hash with Always policy temporarily until both PRs come in is weird but it does not break anything.

Thanks for this, we can have it in a separate PR.

@harshad16 harshad16 left a comment

/lgtm
/approve

thanks 👍

openshift-ci bot commented Aug 7, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: harshad16

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-cherrypick-robot

@jiridanek: new pull request created: #377

In response to this:

/cherrypick stable


@jiridanek jiridanek deleted the injected_oauth_imagepullpolicy_change_and_sha256_digest branch August 9, 2024 12:44