Bug: TSO proxy will cause pd follower OOM #9004

Closed
guoxiangCN opened this issue Jan 16, 2025 · 8 comments · Fixed by #9009
Labels
affects-5.4: This bug affects the 5.4.x (LTS) versions.
affects-6.1: This bug affects the 6.1.x (LTS) versions.
affects-6.5: This bug affects the 6.5.x (LTS) versions.
affects-7.1: This bug affects the 7.1.x (LTS) versions.
affects-7.5: This bug affects the 7.5.x (LTS) versions.
affects-8.1: This bug affects the 8.1.x (LTS) versions.
affects-8.5: This bug affects the 8.5.x (LTS) versions.
impact/oom
severity/major
type/bug: The issue is confirmed as a bug.

Comments

@guoxiangCN

Bug Report

What did you do?

Enable TSO proxy. The PD follower runs out of memory (OOM) within several days.

What did you expect to see?

The PD follower works well.

What did you see instead?

The PD follower OOMs within several days (2 to 5).

What version of PD are you using (pd-server -V)?

PD 5.4.3, but master also has this problem.

@guoxiangCN guoxiangCN added the type/bug The issue is confirmed as a bug. label Jan 16, 2025

guoxiangCN commented Jan 16, 2025

The problematic code is here: https://github.com/tikv/pd/blob/master/pkg/utils/tsoutil/tso_dispatcher.go#L72

val, loaded := s.dispatchChs.LoadOrStore(req.getForwardedHost(), make(chan Request, maxMergeRequests))

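Every forwarded TSO request makes a channel, which needs 80KB! Go evaluates the arguments before calling LoadOrStore, so the channel is allocated even when an entry for that host already exists.
Maybe we should use an RWMutex+map, or a map implementation with LoadOrCompute, instead of sync.Map.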
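
To illustrate the suggestion above, here is a minimal sketch, not the actual PD fix. It compares the eager `LoadOrStore` pattern with an RWMutex+map lookup that only allocates on a miss. The `Request` struct, the two dispatcher types, the `allocs` counters, and the value of `maxMergeRequests` are all assumptions made up for this demo.

```go
package main

import (
	"fmt"
	"sync"
)

// Request is a hypothetical stand-in for the forwarded TSO request type.
type Request struct{ host string }

// maxMergeRequests is an assumed buffer size, chosen only for illustration.
const maxMergeRequests = 10000

// eagerDispatcher mirrors the quoted pattern: the channel is created
// before LoadOrStore runs, whether or not the key already exists.
type eagerDispatcher struct {
	chs    sync.Map
	allocs int // counts make() calls; demo only, not goroutine safe
}

func (d *eagerDispatcher) channelFor(host string) chan Request {
	d.allocs++
	val, _ := d.chs.LoadOrStore(host, make(chan Request, maxMergeRequests))
	return val.(chan Request)
}

// lazyDispatcher uses an RWMutex-protected map and allocates only on a miss.
type lazyDispatcher struct {
	mu     sync.RWMutex
	chs    map[string]chan Request
	allocs int // only modified while holding the write lock
}

func newLazyDispatcher() *lazyDispatcher {
	return &lazyDispatcher{chs: make(map[string]chan Request)}
}

func (d *lazyDispatcher) channelFor(host string) chan Request {
	// Fast path: shared lock, no allocation.
	d.mu.RLock()
	ch, ok := d.chs[host]
	d.mu.RUnlock()
	if ok {
		return ch
	}
	// Slow path: re-check under the write lock, then allocate once.
	d.mu.Lock()
	defer d.mu.Unlock()
	if ch, ok := d.chs[host]; ok {
		return ch
	}
	d.allocs++
	ch = make(chan Request, maxMergeRequests)
	d.chs[host] = ch
	return ch
}

func main() {
	eager := &eagerDispatcher{}
	lazy := newLazyDispatcher()
	for i := 0; i < 1000; i++ {
		eager.channelFor("pd-leader:2379")
		lazy.channelFor("pd-leader:2379")
	}
	fmt.Println("eager allocations:", eager.allocs)
	fmt.Println("lazy allocations:", lazy.allocs)
}
```

For 1000 requests to the same host, the eager variant calls `make` 1000 times and discards 999 of the channels, while the lazy variant calls it once. A plain `Load` before `LoadOrStore` on the existing `sync.Map` would be another way to skip the allocation on the common path.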


guoxiangCN commented Jan 16, 2025

PTAL @rleungx @disksing

@guoxiangCN

[Three images attached]


rleungx commented Jan 17, 2025

Introduced by #4085. The affected version starts from v5.3.0.

@guoxiangCN

Introduced by #4085. The affected version starts from v5.3.0.

Thanks for your reply! Is there a plan to fix this? Is the 5.x version still under maintenance, and will further patch versions be released in the future?


rleungx commented Jan 17, 2025

@guoxiangCN I have raised a PR to fix it. For TiDB EOL, please see https://www.pingcap.com/tidb-release-support-policy/ for detailed information. Once the PR is merged, we may cherry-pick the fix to other release branches, but possibly not to 5.x.

@guoxiangCN

@rleungx It seems that v5.4 is supported until 2025-02-15. If an official fixed version could be released, that would be great!


rleungx commented Jan 17, 2025

@rleungx It seems that v5.4 is supported until 2025-02-15. If an official fixed version could be released, that would be great!

We recommend upgrading to a newer version for better support.

@rleungx rleungx added affects-5.4 This bug affects the 5.4.x(LTS) versions. affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. and removed may-affects-5.4 may-affects-6.1 may-affects-6.5 may-affects-7.1 may-affects-7.5 may-affects-8.1 may-affects-8.5 labels Jan 17, 2025
@rleungx rleungx added affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-8.1 This bug affects the 8.1.x(LTS) versions. affects-8.5 This bug affects the 8.5.x(LTS) versions. labels Jan 17, 2025
ti-chi-bot bot added a commit that referenced this issue Jan 17, 2025
close #9004

Signed-off-by: Ryan Leung <[email protected]>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>