-
Notifications
You must be signed in to change notification settings - Fork 3.2k
fix: fixing exclude region inclusion #43602
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes a bug where customer-provided excluded regions were not being honored during transient failures when the SDK calls the global endpoint, which could point to an excluded region. It also adds logging enhancements to provide proper context when regions are marked unavailable.
Key changes:
- Modified region selection logic to properly respect both user-excluded and circuit-breaker-excluded locations
- Added context parameters to endpoint unavailability tracking methods for better diagnostics
- Updated retry policy to calculate retries based on applicable (non-excluded) regions
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| test_service_request_retry_policy_async.py | New async test verifying write failover respects excluded regions during service request errors |
| test_service_request_retry_policy.py | New sync test verifying write failover respects excluded regions during service request errors |
| test_location_cache.py | Added tests for excluded region handling, regional fallback behavior, and global endpoint fallback scenarios |
| _global_endpoint_manager_async.py | Added context parameter to endpoint unavailability tracking methods and their call sites |
| _service_request_retry_policy.py | Updated retry count calculation to use applicable (non-excluded) regional routing contexts and added context to mark_endpoint_unavailable calls |
| _location_cache.py | Refactored region filtering logic to properly separate user-excluded from circuit-breaker-excluded locations and respect exclusions in global endpoint resolution |
| _global_endpoint_manager.py | Added context parameter to endpoint unavailability tracking methods and their call sites |
| _endpoint_discovery_retry_policy.py | Added context parameter when marking endpoints unavailable |
sdk/cosmos/azure-cosmos/tests/test_service_request_retry_policy_async.py
Outdated
Show resolved
Hide resolved
sdk/cosmos/azure-cosmos/tests/test_service_request_retry_policy.py
Outdated
Show resolved
Hide resolved
sdk/cosmos/azure-cosmos/tests/test_service_request_retry_policy_async.py
Outdated
Show resolved
Hide resolved
|
/azp run python - cosmos - tests |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run python - cosmos - tests |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run python - cosmos - tests |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This PR addresses scenarios where customer provided excluded region was not being honored during certain transient failures as the SDK ends up calling global endpoint which can point to a region that is part of excluded region list.
It also adds logging enhancements to ensure when a region is marked unavailable we have the proper context.