Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Conformance]: Adds GatewayClass SupportedVersion Test #3368

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

danehans
Copy link
Contributor

What type of PR is this?
/kind test
/area conformance

What this PR does / why we need it:
Adds a test to ensure implementations conform to the GatewayClass SupportedVersion status condition.

Which issue(s) this PR fixes:
Fixes #3367

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/test do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. area/conformance cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 27, 2024
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 27, 2024
@youngnick
Copy link
Contributor

The change basically lgtm, but you'll need to remove the "fixes ..." from the commit message @danehans, Prow doesn't let us do that in commit messages.

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. label Sep 30, 2024
@danehans
Copy link
Contributor Author

Commit b40de38 removes "Fixes..." from the commit message. Thanks for the review @youngnick.

@arkodg
Copy link
Contributor

arkodg commented Sep 30, 2024

thanks @danehans, does this test pass on any implementations ?

@danehans
Copy link
Contributor Author

danehans commented Oct 1, 2024

@arkodg thanks for the review. Yes, this PR was tested with solo-io/gloo#10140 for Gloo Gateway:

go test ./conformance -run TestConformance -args \
    --gateway-class=gloo-gateway \
    --supported-features=Gateway \
    --run-test=GatewayClassSupportedVersionCondition
ok  	sigs.k8s.io/gateway-api/conformance	18.789s

Copy link
Member

@robscott robscott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this @danehans!

Adds a test to ensure implementations conform to the GatewayClass
SupportedVersion status condition.

Signed-off-by: Daneyon Hansen <[email protected]>
@danehans
Copy link
Contributor Author

@arkodg @robscott I have addressed your comments. PTAL when you have a moment.

Copy link
Contributor

@arkodg arkodg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks !
/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: arkodg, danehans

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 19, 2024
Copy link
Member

@robscott robscott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @danehans!

}

// Ensure the SupportedVersion is false
kubernetes.GWCMustHaveSupportedVersionConditionFalse(t, s.Client, s.TimeoutConfig, gwc.Name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some implementations like GKE will ~immediately overwrite this CRD with the original version, can you add that as an acceptable outcome of this test?

Features: []features.FeatureName{
features.SupportGateway,
},
Description: "A GatewayClass should set the SupportedVersion condition based on the presence and version of Gateway API CRDs in the cluster",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this makes sense.

Consider I build an implementation that supports v1.1.

v1.2 comes out. Per the spec, even if I am willing to support it, I must set the condition to "false"

This means I cannot pass this conformance test for 1.2 until I upgrade to 1.2 which doesn't actually seem desired?

}

// Ensure the SupportedVersion status condition is false
kubernetes.GWCMustHaveSupportedVersionConditionFalse(t, s.Client, s.TimeoutConfig, gwc.Name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really expect implementations to have to watch CRDs and constantly reconcile the Gatewayclass? I get how we got to this point, but it feels like we are building APIs in isolation from considerations about user needs and implementation complexity.

Users are unlikely to ever have an incompatibility and somehow realize to check a random object (that they don't even create, so why would they know to look at it?).

The cost to implement this, however, is extremely high...

Copy link
Contributor

@howardjohn howardjohn Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW the api-machinery folks have, historically, advised against an implementation reading CRDs at all or even assuming an API is defined by a CRD (vs another mechanism).

I realize this has not much to do with the test... but I missed the PR adding the condition

Copy link
Member

@mlavacca mlavacca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to agree with @howardjohn about this feature: it looks like we are tightly coupling the implementation's internal dependency with the installed Gateway API version. In case Implementation X uses Gateway API 1.3 as a dependency, and Gateway API 1.4 gets released, if a user performs an upgrade of the API before the implementation adoption of that specific version, the GatewayClass gets marked as not accepted anymore.

@arkodg
Copy link
Contributor

arkodg commented Nov 19, 2024

this test is adding conformance for this section https://gateway-api.sigs.k8s.io/guides/api-design/?h=supportedversi#supported-api-versions, if this is undesirable, then the section should be modified

@mlavacca
Copy link
Member

this test is adding conformance for this section https://gateway-api.sigs.k8s.io/guides/api-design/?h=supportedversi#supported-api-versions, if this is undesirable, then the section should be modified

Yes, my concern is mostly related to the feature itself, not to this specific conformance test

@youngnick
Copy link
Contributor

The docs don't say that the GatewayClass needs to not be accepted, just that the unsupportedVersion Condition needs to be added.

The intent here is to give implementations a way to say "hey, you're running with a version I don't support, maybe check that before logging a support issue". That is, it's intended to be a feature that serves both users and maintainers by giving users a way to check that they are doing supported things.

@howardjohn I'm certainly ready to hear feedback on how you think that could be achieved without doing the things documented in the test. Or if you think that the goal is not worthwhile, I'd be interested to hear why.

@howardjohn
Copy link
Contributor

The intent here is to give implementations a way to say "hey, you're running with a version I don't support, maybe check that before logging a support issue". That is, it's intended to be a feature that serves both users and maintainers by giving users a way to check that they are doing supported things.

The intent makes sense at a high level, I am just not sure it makes sense in practice. Its very easy to come up with ideas to put an ever-increasing list of bits of information into an objects status. But we should do so with extreme caution -- each one adds real risk, complexity, and scale limitations. A slippery-slope strawman is we start forcing implementations to start putting their entire documentation in the status (which is not far off from the SupportFeatures field!). If you look at Kubernetes core, you can see they are very conservative in ensure each status is only critical information.

The user journeys for this field don't make sense. Which user is going to know to look at a GatewayClass to find this field?

The cost to implement this is high for an implementation. API Machinery folks have consistently strongly recommended against controllers assuming an API is created by a CustomResourceDefinition, and instead relying on API server discovery and the standardized REST APIs to consume APIs; this conformance test strictly enforces reading CRDs.

@kflynn
Copy link
Contributor

kflynn commented Nov 21, 2024

Apologies in advance: I'm intending this as an actual question, but I expect that it may sound overly snarky, which isn't my intent.

That said -- what's the intended user experience here, and which role are we designing for? I can think of a couple of possibilities:

  1. Chihiro sets up a cluster with Gateway API v1.x and the AwesomeGatewayController, which supports v1.x. Later, they upgrade the version of Gateway API to v1.y, which AwesomeGatewayController does not support.

    I think that expecting Chihiro to look at the Gateway's status when weird things start happening is pretty reasonable. OTOH I'm a little unsure why Chihiro wouldn't just hold off on the upgrade until AwesomeGatewayController supports v1.y, since they're the one who installed AwesomeGatewayController as well?

  2. Ana wants to use the incredibleNewFeature stanza in her HTTPRoutes, but that's not supported in Gateway API v1.x, so she installs v1.y and then things break.

    I don't think we'd expect Ana to know to check the Gateway, but then... we also don't expect Ana to get to upgrade Gateway API herself, right?

So... overall, I feel like neither of these hold together all that well, which makes me wonder what I'm missing. 🙂

@howardjohn
Copy link
Contributor

So... overall, I feel like neither of these hold together all that well, which makes me wonder what I'm missing. 🙂

Its slightly worse than your examples actually -- the status is on GatewayClass not Gateway, which I feel is less likely to be looked at

@kflynn
Copy link
Contributor

kflynn commented Nov 21, 2024

Its slightly worse than your examples actually -- the status is on GatewayClass

😂 🤦‍♂️ and I even read the chunk of the doc before writing those out... clearly time for me to knock off work for the day!

But you're right: that makes things weirder still.

@youngnick
Copy link
Contributor

The idea behind all of this is to teach users to check the GatewayClass to find out what their controller supports (this is the point of the status.supportedFeatures field after all). As an extension of that, it seemed like it would be nice for the controller to have a way to signal that it didn't fully support the installed version of the CRD (which we are, again, using in a way that's not really recommended by upstream because upstream does not support what we need - a way to say "this is the released version of the schema including all compatible additions". apiVersion is only useful for information about breaking changes, not about feature additions.)

The original purpose of all of this is to at least include some information about relevant things that didn't require going out to the implementation's docs site to find the details. @howardjohn, it seems like you don't agree with that goal? Or are you worried about the asymptotic case of having too much information in the GatewayClass status?

It seems to me that our options here are as follows:

  • We stop doing anything here and remove any mention of this Condition. This means that implementations now don't need to care about looking at the installed version of CRDs, and users can enjoy the experience of heading off to the implementation's documentation page (which is not listed anywhere in the system) to try and determine the version that's running and what versions of Gateway API it supports. Hope everyone is maintaining a feature matrix!
  • We loosen this requirement to a may, and make this an extended feature, allowing implementations that feel comfortable with this to implement it as they wish. This would mean that implementations can opt-in to this Condition, but that we have conformance for it if so.
  • We keep the spec as it is and include this test also as-is. This does not seem viable, given the level of discussion on this issue already.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/conformance cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/test release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a GatewayClass SupportedVersion Status Condition Test
8 participants