@XuanYang-cn XuanYang-cn commented Nov 28, 2025

Adds KMS key state monitoring and coordinated key rotation to prevent message queue consumption failures during encryption key updates.

Key Changes:

  • Add KeyManager in RootCoord for periodic KMS state polling
  • Integrate KeyManager with QuotaCenter for access denial
  • Implement revocation checks in Proxy SimpleLimiter
  • Add rotation callback coordination via AlterDatabase broadcast
  • Drop internal properties before metadata persistence
  • Add GetStates() and InvalidateCipherCache() to hookutil

Access Denial:

  • Revoked keys: Release collections + deny DML/DQL (DDL still allowed)
  • Check performed on every request at proxy layer
  • Manual LoadCollection required after key recovery

Key Rotation Flow:

  1. CipherPlugin rotates key, writes to etcd
  2. Plugin invokes onKeyRotated callback
  3. KeyManager broadcasts AlterDatabase with internal property
  4. StreamingNode receives message and reloads cipher
  5. ACK callback invalidates the Proxy database cache and refreshes the key
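
Steps 2 to 5 of the flow above, together with dropping internal properties before persistence, can be sketched roughly as follows. Everything named here is hypothetical: the `_internal.` prefix, `AlterDatabaseMsg`, `Broadcaster`, `Coordinator`, and the in-memory bus are illustration only, and persisting inside the ACK callback is an assumption of this sketch rather than something the description states.

```go
package main

import "strings"

// internalPrefix is a hypothetical marker for properties that must not be persisted.
const internalPrefix = "_internal."

// AlterDatabaseMsg is a simplified stand-in for the broadcast message.
type AlterDatabaseMsg struct {
	DBName     string
	Properties map[string]string
}

// Broadcaster delivers msg to all consumers and calls onAck once they have acknowledged.
type Broadcaster interface {
	Broadcast(msg AlterDatabaseMsg, onAck func())
}

// inMemoryBus is a synchronous test double for the message queue.
type inMemoryBus struct {
	consumers []func(AlterDatabaseMsg)
}

func (b *inMemoryBus) Broadcast(msg AlterDatabaseMsg, onAck func()) {
	for _, consume := range b.consumers {
		consume(msg) // e.g. a StreamingNode reloading its cipher
	}
	onAck()
}

// DropInternal returns a copy of props without internal-only entries.
func DropInternal(props map[string]string) map[string]string {
	out := make(map[string]string, len(props))
	for k, v := range props {
		if !strings.HasPrefix(k, internalPrefix) {
			out[k] = v
		}
	}
	return out
}

// Coordinator wires the rotation callback to the broadcast and the ACK handling.
type Coordinator struct {
	bus                  Broadcaster
	invalidateProxyCache func(db string)
	persist              func(db string, props map[string]string)
}

// OnKeyRotated is what the plugin's rotation callback would invoke in this sketch.
func (c *Coordinator) OnKeyRotated(db, keyVersion string) {
	msg := AlterDatabaseMsg{
		DBName:     db,
		Properties: map[string]string{internalPrefix + "key_rotated": keyVersion},
	}
	c.bus.Broadcast(msg, func() {
		// Assumed ordering: strip internal properties, persist, then refresh the proxy.
		c.persist(db, DropInternal(msg.Properties))
		c.invalidateProxyCache(db)
	})
}
```

The point of the sketch is the ordering guarantee: the proxy cache is only invalidated after every consumer has seen the rotation message, so no consumer is handed data encrypted with a key it has not yet loaded.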

See also: #45117, #44981, #45242

@sre-ci-robot sre-ci-robot added the size/XL Denotes a PR that changes 500-999 lines. label Nov 28, 2025
@mergify mergify bot added dco-passed DCO check passed. kind/feature Issues related to feature request from users labels Nov 28, 2025
codecov bot commented Nov 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.73%. Comparing base (4f080bd) to head (db6c871).
⚠️ Report is 5 commits behind head on master.

@@             Coverage Diff             @@
##           master   #45936       +/-   ##
===========================================
+ Coverage   76.08%   82.73%    +6.64%     
===========================================
  Files        1884      524     -1360     
  Lines      294531    82326   -212205     
===========================================
- Hits       224097    68111   -155986     
+ Misses      63028    14215    -48813     
+ Partials     7406        0     -7406     
Components   Coverage Δ
Client       ∅ <ø> (∅)
Core         82.73% <ø> (ø)
Go           ∅ <ø> (∅)
see 1360 files with indirect coverage changes

Signed-off-by: yangxuan <[email protected]>