
@squeed commented Sep 11, 2025

This seemingly minor change means that a full policy tier can be packed into a 16-bit integer (because 500 * 100 < 2^16). As written, the spec could allow up to 100,000 rules per tier, which unfortunately needs 17 bits in the worst case.
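The arithmetic above can be sketched as follows. This is a hypothetical illustration, not code from any implementation; the constant and function names are ours, assuming a "priority times rules-per-policy plus rule index" packing:

```go
package main

import "fmt"

// Illustrative caps, assuming the change proposed in this PR.
const (
	maxPriority       = 500 // proposed cap on priority values (down from 1000)
	maxRulesPerPolicy = 100 // existing per-policy rule cap
)

// effectivePriority packs (policy priority, rule index) into one number.
// With 500 priorities and 100 rules per policy, the largest value is
// 499*100 + 99 = 49999, which fits in a uint16 (max 65535).
func effectivePriority(policyPriority, ruleIndex int) uint16 {
	return uint16(policyPriority*maxRulesPerPolicy + ruleIndex)
}

func main() {
	// Worst case: last rule of the lowest-priority policy.
	fmt.Println(effectivePriority(maxPriority-1, maxRulesPerPolicy-1))
	// Does the whole tier fit in 16 bits?
	fmt.Println(maxPriority*maxRulesPerPolicy < 1<<16)
}
```

With the pre-PR cap of 1000 priorities, the same product would be 100,000, which overflows a uint16 and forces implementations up to a wider type.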


netlify bot commented Sep 11, 2025

Deploy Preview for kubernetes-sigs-network-policy-api ready!

- 🔨 Latest commit: 3a4866d
- 🔍 Latest deploy log: https://app.netlify.com/projects/kubernetes-sigs-network-policy-api/deploys/68c2860fbc30eb0008bf2afb
- 😎 Deploy Preview: https://deploy-preview-319--kubernetes-sigs-network-policy-api.netlify.app

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Sep 11, 2025
@k8s-ci-robot (Contributor)

Welcome @squeed!

It looks like this is your first PR to kubernetes-sigs/network-policy-api 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/network-policy-api has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: squeed
Once this PR has been reviewed and has the lgtm label, please assign danwinship for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Sep 11, 2025
@k8s-ci-robot (Contributor)

Hi @squeed. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Sep 11, 2025
This seemingly-minor change means that a full policy tier can be packed
into a 16-bit integer (because 500 * 100 < 2^16). As written, the spec
could allow up to 100,000 rules per tier, which unfortunately needs 17
bits at a worst case.

Signed-off-by: Casey Callendrello <[email protected]>
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Sep 11, 2025
@tssurya (Contributor) commented Sep 13, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 13, 2025
@aojea (Contributor) commented Sep 13, 2025

LGTM

/assign @danwinship @bowei

@danwinship (Contributor) commented Sep 14, 2025

/hold

So, from my perspective as "someone who has not actually implemented ANP myself, and is not currently responsible for maintaining an ANP implementation"...

This is wrong. The numerical limits are not hard API guarantees that implementations can be designed around. The goal is "ensure that the total number of rules is bounded", not "ensure that the total number of rules has a specific bound".

That said, ovn-kubernetes also currently implements prioritization this way (by multiplying "max policies" times "max rules per policy") and in fact, it forces priority values to be from 0-99 rather than 0-999, because it needs to fit everything into an even smaller numeric range than you do (because it needs to share the range of OVN ACL priorities with other non-ANP rules). (FTR, ovn-k's restriction of priority to the range 0-99 is absolutely, unambiguously non-conformant, and we need a conformance test saying so.)

So, what is the argument for why 2^16 total rules is reasonable, but 2^14 is too few and 2^17 is too many? Or, why is 2^16 rules per-tier the right tradeoff, as opposed to 2^16 total? (Indeed, ovn-k's implementation actually only allows for 2^14 total, not 2^14 per tier... not sure what they're planning to do with BANP priorities... let alone the possibility of DNP in the future...)

Anyway, if we want to say that implementations are allowed to assume that "total number of policies" times "total number of rules per policy" times "total number of tiers" is less than a specific value, and that it is reasonable for implementations to be designed in such a way that they would need a complete rearchitecting and rewrite if we increased that value, then that means we need to decide now what the total maximum number of tiers will be in the future (eg, including "DNP", etc). Because once we hit GA, we can't change validation in a way that would reject previously-valid objects, so we can't lower the range of valid priority levels or the maximum length of the rule arrays.

My impression is that we did not actually intend to say that implementations must support 100,000 rules. Rather, the "1000 policies" and "100 rules per policy" were limits that independently made sense, but we assumed that the 1000*100 space would be filled in sparsely.

Or alternatively, if we did intend to say that implementations must support 100,000 rules (and no more), then we should just have that be the requirement, and not try to enforce this in terms of a specific division of max policies / max rules-per-policy. (And thus, say that implementations can assume it's possible to assign a unique priority from 0 to 99,999 to each rule, but that you can't do that just by multiplying.)

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 14, 2025
@aojea (Contributor) commented Sep 14, 2025

My impression is that we did not actually intend to say that implementations must support 100,000 rules. Rather, the "1000 policies" and "100 rules per policy" were limits that independently made sense, but we assumed that the 1000*100 space would be filled in sparsely.

Oh, I understood this differently: I assumed that implementations would benefit from having a range of at most 500 priorities, because that allows encoding the priority in a 16-bit integer. If that is the case, it sounds reasonable to move from 1000 to 500; AFAIK 1000 looks like an arbitrary number, maybe picked from the existing Endpoints cap? I did not participate, so I assumed a lot of things here... and I agree with Dan that the limits are per field, and this sounds like part of the API contract too. Dan, as you comment later, you cannot make validation stricter later, since you'd invalidate existing things.

@danwinship (Contributor)

OK, consensus in yesterday's meeting was that we expect the "rule space" to be very sparse (that is, we assume that most clusters will not use all 1000 priority values, and most policies will not have 100 rules, and most rules will not have 100 peers, and most Networks peers will not have 25 CIDRs, etc).

The priority range is API (since if we lowered it in a stable API version it would invalidate some existing policies, while if we raised it in a stable API version, it would change the semantics of some existing policies, since a priority: 1000 policy would no longer be the final fallback).

However, the MaxItems values in the CRDs are not API (or at least, not bidirectionally; we can't lower them in a stable release, but we can raise them). The existing values are arbitrary, and we mostly agreed that they're probably too high, and will be lowering them. But we will reserve the right to raise them in the future, so implementations can't assume they are permanent maximums.

We also generally agreed that the max priority should probably remain a power of 10, and 100 feels too small, so we plan to stick with 1000.

@danwinship danwinship closed this Oct 8, 2025
@danwinship (Contributor)

we expect the "rule space" to be very sparse

(so implementations that need to have distinct priority numbers for every policy/rule/peer/whatever should dynamically map them into whatever range they need, and not try to statically map them)
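This dynamic-mapping suggestion can be sketched as follows. Everything here is a hypothetical illustration (the function name, the input values, and the approach of sorting-then-ranking are ours, not from any ANP implementation): rather than computing priority*100 + index statically, collect the sparse effective priorities actually in use and remap them onto a dense range.

```go
package main

import (
	"fmt"
	"sort"
)

// densify remaps a sparse set of distinct priority values onto a dense
// 0..n-1 range, preserving their relative order. A backend with a small
// numeric priority space (like OVN ACL priorities) then only needs as
// many slots as there are rules actually in use.
func densify(sparse []int) map[int]int {
	sorted := append([]int(nil), sparse...)
	sort.Ints(sorted)
	dense := make(map[int]int, len(sorted))
	for i, p := range sorted {
		dense[p] = i // rank in sorted order
	}
	return dense
}

func main() {
	// A cluster using only four of the 1000 possible priority values
	// needs only four slots in the backend's priority range.
	fmt.Println(densify([]int{7, 999, 42, 500}))
}
```

The tradeoff is that the mapping must be recomputed (and backend rules possibly renumbered) when policies are added or removed, which is the cost of not assuming a permanent static maximum.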
