integrate circuitbreaker for region calls #1543
PTAL @rleungx
/cc @okJiang
@rleungx: GitHub didn't allow me to request PR reviews from the following users: okJiang. Note that only tikv members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
FYI, I have updated the pd client version through #1542
Please fix CI
Signed-off-by: artem_danilov <[email protected]>
7dde728 to 01fadf5
Signed-off-by: artem_danilov <[email protected]>
Signed-off-by: artem_danilov <[email protected]>
/retest
@Tema: Cannot trigger testing until a trusted user reviews the PR and leaves an In response to this:
/ok-to-test
@Tema: Cannot trigger testing until a trusted user reviews the PR and leaves an In response to this:
internal/locate/region_cache.go
Outdated
@@ -133,6 +134,20 @@ func nextTTL(ts int64) int64 {
	return ts + regionCacheTTLSec + jitter
}

var pdRegionMetaCircuitBreaker = circuitbreaker.NewCircuitBreaker(
Shall we move it to config? @MyonKeminta
Fine to me.
But if you are not going to add any fields to the config file, I think maybe it would be better to add another single file in the internal/locate directory for holding this variable.
I extracted it into the TiKVClient config. Tests and integration for TiKVClient toml files are in pingcap/tidb#58737.
> I think maybe it would be better to add another single file in the internal/locate directory for holding this variable.
Can you elaborate a bit more on that? Would it be a Go file with a single line like:
var pdRegionMetaCircuitBreaker = circuitbreaker.NewCircuitBreaker("region-meta", circuitbreaker.AlwaysClosedSettings)
or what do you mean?
Never mind. I just feel it still makes me kind of uncomfortable if it's put in the config package while nothing of it is configurable by the config file. Now it looks fine to me.
Signed-off-by: artem_danilov <[email protected]>
Signed-off-by: artem_danilov <[email protected]>
config/client.go
Outdated
@@ -149,6 +151,23 @@ type CoprocessorCache struct {
	AdmissionMinProcessMs uint64 `toml:"admission-min-process-ms" json:"-"`
}

// CircuitBreakerSettings is the config for default circuit breaker settings excluding the error rate
type CircuitBreakerSettings struct {
How about referring to config/retry/backoff.go? We can use NewCircuitBreakerWithVars or something like that and only add a var for the error rate threshold. The other configs can be hard coded and we don't need to support changing them through the config file. @Tema @MyonKeminta @okJiang WDYT?
It is problematic. Backoff is a request-scoped object, so we can create it from vars every time; CircuitBreaker, however, is instance-scoped: it is created on startup and aggregates stats across all requests. Are there any concerns with propagating the error rate directly from system variables, as pingcap/tidb#58737 proposes?
As for all the other configs, I think it still makes sense to keep them configurable, as e.g. MinQPSToOpen could depend on the cluster size and might need tuning per workload. But I don't feel too strongly about that and can roll back the last change. Otherwise, I will move them to the PDClient section instead of TiKVClient, as they belong there.
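The scope argument above can be illustrated with a small sketch. It is not the PD client implementation: `sharedBreaker`, `Record`, `ErrorRatePct`, and `simulate` are hypothetical names, and the only point is that an error rate is meaningful when one long-lived instance sees every request, while an object rebuilt per request sees a single sample.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// sharedBreaker is a hypothetical instance-scoped object: one value lives for
// the whole process and every request records its outcome into it.
type sharedBreaker struct {
	mu     sync.Mutex
	total  int
	failed int
}

// Record adds one request outcome to the aggregate counters.
func (b *sharedBreaker) Record(err error) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.total++
	if err != nil {
		b.failed++
	}
}

// ErrorRatePct returns the failure percentage over everything recorded so far.
func (b *sharedBreaker) ErrorRatePct() int {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.total == 0 {
		return 0
	}
	return b.failed * 100 / b.total
}

// simulate records n requests, of which the first `failures` fail.
func simulate(b *sharedBreaker, n, failures int) {
	for i := 0; i < n; i++ {
		var err error
		if i < failures {
			err = errors.New("pd unavailable")
		}
		b.Record(err)
	}
}

func main() {
	// One instance shared by ten requests sees the real failure ratio.
	shared := &sharedBreaker{}
	simulate(shared, 10, 4)
	fmt.Println("shared breaker error rate:", shared.ErrorRatePct())

	// An instance rebuilt for a single request only ever sees one sample.
	perRequest := &sharedBreaker{}
	simulate(perRequest, 1, 1)
	fmt.Println("per-request breaker error rate:", perRequest.ErrorRatePct())
}
```

This is why a per-request constructor in the style of the backoff helpers does not fit here: settings have to be pushed into the existing instance rather than used to build a new one.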
How about propagating the error rate instead of the whole Settings? As for the other configs, as we discussed before, we can make them hard-coded temporarily. If we do need to change them, then we can make them configurable. Right now, it's better to hide them to reduce the complexity.
I don't propagate the full settings. Circuit breaker has a callback to modify any part of the settings:
func (cb *CircuitBreaker) ChangeSettings(apply func(config *Settings)) {
	apply(cb.config)
}
so in pingcap/tidb#58737 I propagate only error rate through system variables:
func (do *Domain) changePDMetadataCircuitBreakerErrorRateThresholdPct(errorRatePct uint32) {
	tikv.ChangePdRegionMetaCircuitBreakerSettings(func(config *circuitbreaker.Settings) {
		config.ErrorRateThresholdPct = errorRatePct
	})
}
I would like to keep this API because it makes it easy to propagate any other part of the settings if needed, without requiring it.
I will move everything from the TiKVClient config back to hardcoded values and keep propagating only the error rate system variable, as it was in the original version of this PR. Let me know if there is still any concern with that.
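As a rough, self-contained sketch of the callback-style API described in this comment: the `Settings` struct below is a stand-in with only the two fields named in this thread (their types are assumed), and the mutex and the `Snapshot` helper are additions for the example, not something the quoted code shows.

```go
package main

import (
	"fmt"
	"sync"
)

// Settings is a stand-in for circuitbreaker.Settings. Only the field names
// come from the discussion; the field types are assumptions.
type Settings struct {
	ErrorRateThresholdPct uint32
	MinQPSToOpen          uint32
}

// CircuitBreaker holds one long-lived Settings value shared by all requests.
type CircuitBreaker struct {
	mu     sync.Mutex // assumption: guards config against concurrent changes
	config *Settings
}

// NewCircuitBreaker copies the initial settings so later changes do not
// affect the caller's value.
func NewCircuitBreaker(initial Settings) *CircuitBreaker {
	return &CircuitBreaker{config: &initial}
}

// ChangeSettings runs apply against the live settings, so a caller can touch
// one field and leave the rest as they are.
func (cb *CircuitBreaker) ChangeSettings(apply func(config *Settings)) {
	cb.mu.Lock()
	defer cb.mu.Unlock()
	apply(cb.config)
}

// Snapshot returns a copy of the current settings (hypothetical helper).
func (cb *CircuitBreaker) Snapshot() Settings {
	cb.mu.Lock()
	defer cb.mu.Unlock()
	return *cb.config
}

func main() {
	cb := NewCircuitBreaker(Settings{ErrorRateThresholdPct: 0, MinQPSToOpen: 10})

	// Propagate only the error rate, as a system variable hook might do.
	cb.ChangeSettings(func(config *Settings) {
		config.ErrorRateThresholdPct = 50
	})

	fmt.Printf("%+v\n", cb.Snapshot())
}
```

The callback form keeps the public surface to a single method while still letting the caller decide which fields to change.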
LGTM, /cc @okJiang
LGTM
Ok, I removed all configurations except sysvar from this PR and pingcap/tidb#58737
Seems you forgot to update L2218 and L2283?
Originally I thought it was not needed, as they are low QPS and I didn't want to affect GC unnecessarily, but it is probably better to throttle them as well. Just added them.
Signed-off-by: artem_danilov <[email protected]>
The rest LGTM
internal/locate/region_cache.go
Outdated
}

// ChangePdRegionMetaCircuitBreakerSettings changes circuit breaker changes for region metadata calls
func ChangePdRegionMetaCircuitBreakerSettings(apply func(config *circuitbreaker.Settings)) {
-func ChangePdRegionMetaCircuitBreakerSettings(apply func(config *circuitbreaker.Settings)) {
+func ChangePDRegionMetaCircuitBreakerSettings(apply func(config *circuitbreaker.Settings)) {
renamed
tikv/region.go
Outdated
@@ -197,6 +198,11 @@ func SetRegionCacheTTLSec(t int64) {
	locate.SetRegionCacheTTLSec(t)
}

// ChangePdRegionMetaCircuitBreakerSettings changes circuit breaker settings for region metadata calls
func ChangePdRegionMetaCircuitBreakerSettings(apply func(config *circuitbreaker.Settings)) {
	locate.ChangePdRegionMetaCircuitBreakerSettings(apply)
ditto
Signed-off-by: artem_danilov <[email protected]>
Signed-off-by: artem_danilov <[email protected]>
@okJiang: adding LGTM is restricted to approvers and reviewers in OWNERS files.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: MyonKeminta, okJiang, rleungx
What problem does this PR solve?
Issue Number: ref tikv/pd#8678
What is changed and how does it work?
Add a global circuit breaker to the context of each call to get region data from PD.
The default circuit breaker is in the disabled state and has no effect.
This PR also exposes a method to tweak the circuit breaker settings, which can be used from the TiDB layer to enable and configure it.