Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

schedule: impl balance range scheduler #9005

Merged
merged 25 commits into from
Mar 5, 2025

Conversation

bufferflies
Copy link
Contributor

@bufferflies bufferflies commented Jan 17, 2025

What problem does this PR solve?

Issue Number: Close #9006

What is changed and how does it work?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)

Code changes

Side effects

Related changes

Release note

None.

Signed-off-by: 童剑 <[email protected]>
Signed-off-by: 童剑 <[email protected]>
Signed-off-by: 童剑 <[email protected]>
Signed-off-by: 童剑 <[email protected]>
Signed-off-by: 童剑 <[email protected]>
Signed-off-by: 童剑 <[email protected]>
Copy link
Contributor

ti-chi-bot bot commented Jan 17, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. dco-signoff: yes Indicates the PR's author has signed the dco. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed do-not-merge/needs-linked-issue labels Jan 17, 2025
@ti-chi-bot ti-chi-bot bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jan 17, 2025
@bufferflies bufferflies force-pushed the balance_key_range_scheduler branch 2 times, most recently from 50d905f to 7f0ed67 Compare February 13, 2025 08:33
@bufferflies
Copy link
Contributor Author

/test pull-integration-realcluster-test

Copy link
Contributor

ti-chi-bot bot commented Feb 17, 2025

@bufferflies: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

/test build
/test pull-integration-realcluster-test

The following commands are available to trigger optional jobs:

/debug pull-unit-test
/test pull-integration-copr-test

Use /test all to run the following jobs that were automatically triggered:

tikv/pd/ghpr_build
tikv/pd/pull_integration_realcluster_test

In response to this:

/test pull-integration-realcluster-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Signed-off-by: 童剑 <[email protected]>
@bufferflies bufferflies force-pushed the balance_key_range_scheduler branch from 120a2bf to 2854acd Compare February 17, 2025 08:54
}

opInfluence := s.OpController.GetOpInfluence(cluster.GetBasicCluster(), operator.WithRangeOption(job.Ranges))
plan, err := s.prepare(cluster, opInfluence, job)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For every time we schedule, it will scan the range from the beginning?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the info on the regions and stores in the given ranges may be updated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it affect the other requests like get region or heartbeat if the range is large?

Copy link
Contributor Author

@bufferflies bufferflies Feb 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the operator of scanning regions acquires Rlock which shares with the region heartbeat. We could reuse the previous distribution by interval duration, not every scheduler.

@bufferflies bufferflies force-pushed the balance_key_range_scheduler branch from 1628d72 to 40f0f56 Compare February 18, 2025 09:44
Signed-off-by: 童剑 <[email protected]>
@bufferflies bufferflies force-pushed the balance_key_range_scheduler branch from 40f0f56 to 246aaa6 Compare February 19, 2025 02:20
@bufferflies bufferflies requested a review from nolouch February 19, 2025 02:49
balanceRangeCounter.Inc()
job := s.conf.peek()
if job == nil {
balanceRangeNoJobCounter.Inc()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems we need not to count this. It is enough that set it true when there is no job.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will count if all the jobs are finished.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about using RunningJobCounter rather than NoJobCounter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if conf.peek() is nil, it indicates that all the jobs in the configuration are finished, so the metrics just tell DBA that there is no unable job in this scheduler.

now := time.Now()
job.Start = &now
job.Status = running
if err := conf.save(); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not recommended that

  1. modify the input param
  2. call conf.save() to persist the modified input struct

I prefer more explicit implement. For example, we can use jobs map[JobID]balanceRangeSchedulerJob and input JobID here rather than Job.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rleungx the functions must ensure that jobs list must contains the given job.


targetInfluence := p.opInfluence.GetStoreInfluence(p.targetStoreID())
targetInf := p.job.Role.getStoreInfluence(targetInfluence)
if targetInf < 0 {

This comment was marked as resolved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, if the store has some remove-peer operators

This comment was marked as resolved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must consider the scenario involving removing-peer , so the targeting is negative, the target score is small. This causes too many peers to be added to this store.

}
targetScore := p.targetScore + targetInf + p.tolerate

shouldBalance := sourceScore >= targetScore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some comments to explan the principle? Why is the score calculated this way? What is the score difference from the balanceRegionScheduler? How are conflicts between different balance schedulers handled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The balanceRegionScheduler and balanceLeaderScheduler only consider the global regions not the given key ranges, so the store status can calculate their score. But the balance-range-scheduler is different, the given key ranges should set it

@bufferflies bufferflies force-pushed the balance_key_range_scheduler branch 3 times, most recently from ccaa591 to 2e61a74 Compare February 26, 2025 07:26
@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Feb 26, 2025
@bufferflies bufferflies force-pushed the balance_key_range_scheduler branch from 2e61a74 to 132d854 Compare March 4, 2025 08:21
func (conf *balanceRangeSchedulerConfig) begin(index int) *balanceRangeSchedulerJob {
conf.Lock()
defer conf.Unlock()
job := conf.jobs[index]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems we have already had the job, why do we need to get it again?

func (conf *balanceRangeSchedulerConfig) finish(index int) *balanceRangeSchedulerJob {
conf.Lock()
defer conf.Unlock()
job := conf.jobs[index]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

switch r {
case leader:
return influence.LeaderCount
case follower:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it conflict with balance region?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes,the next MR will solve it

Signed-off-by: 童剑 <[email protected]>
@bufferflies bufferflies force-pushed the balance_key_range_scheduler branch from 132d854 to 5795f44 Compare March 5, 2025 06:52
Signed-off-by: 童剑 <[email protected]>
Copy link
Member

@rleungx rleungx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM

@@ -463,7 +463,7 @@ func (s *balanceRangeScheduler) prepare(cluster sche.SchedulerCluster, opInfluen
averageScore: averageScore,
job: job,
opInfluence: opInfluence,
tolerate: tolerate,
tolerate: tolerantSizeRatio,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tolerantSizeRatio

@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Mar 5, 2025
Copy link
Contributor

ti-chi-bot bot commented Mar 5, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: okJiang, rleungx

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Contributor

ti-chi-bot bot commented Mar 5, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-02-26 09:39:20.184004847 +0000 UTC m=+435108.137163113: ☑️ agreed by okJiang.
  • 2025-03-05 08:18:08.724748243 +0000 UTC m=+429601.853667984: ☑️ agreed by rleungx.

@ti-chi-bot ti-chi-bot bot added the approved label Mar 5, 2025
@ti-chi-bot ti-chi-bot bot merged commit bdd857e into tikv:master Mar 5, 2025
23 of 25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved dco-signoff: yes Indicates the PR's author has signed the dco. lgtm release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

scheduler: implement balance range scheduler algorithm
3 participants