The explanation of "region-concurrency"'s effect on speed for TiDB Lightning (Part 2 §2.2.1) is misleading #915

Open
kennytm opened this issue Aug 20, 2021 · 0 comments

Comments

@kennytm
Contributor

kennytm commented Aug 20, 2021

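* For example, if one encoding operation (一次编码处理) takes 50 ms, only 20 encodings can be performed per second. If `block-size` is 64 KB, a single CPU core can encode at most 1.28 MB of data per second. When `region-concurrency` is set to 60, the overall encoding speed is capped at about 75 MB per second.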

Lightning's encoder operates on rows, not read blocks. Mentioning `block-size` here misleads readers into thinking that increasing `read-block-size` improves encoding performance. In fact, the block size cancels out against the factor hidden in "一次编码处理" ("one encoding operation"): a larger block holds more rows and takes proportionally longer to encode.

It would be better to drop `block-size` from the explanation and phrase it like this:

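  • For example, if encoding one row takes 0.2 ms, only 1 ÷ 0.2 ms = 5000 rows can be encoded per second. If the average row size is 250 bytes, a single CPU core can encode at most 250 × 5000 = 1.25 MB of data per second. When region-concurrency is set to 60, the overall encoding speed is capped at about 1.25 MB × 60 = 75 MB per second.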
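
The arithmetic in the proposed wording can be checked with a small sketch. This is not Lightning code: the function names are hypothetical, and `block_based_limit` assumes that the time to encode a block grows linearly with the number of rows it contains, which is the "hidden factor" described above. Under that assumption the block size drops out of the result.

```python
def encode_limit_bytes_per_sec(row_encode_ms, avg_row_bytes, region_concurrency):
    """Upper bound on encoding throughput, computed per row (hypothetical helper)."""
    rows_per_sec_per_core = 1000.0 / row_encode_ms
    return rows_per_sec_per_core * avg_row_bytes * region_concurrency


def block_based_limit(block_bytes, avg_row_bytes, row_encode_ms, region_concurrency):
    """Same bound computed per block.

    Assumption: encoding a block costs (rows in block) x (time per row),
    so a bigger block takes proportionally longer to encode.
    """
    rows_per_block = block_bytes / avg_row_bytes
    block_encode_ms = rows_per_block * row_encode_ms
    blocks_per_sec_per_core = 1000.0 / block_encode_ms
    return blocks_per_sec_per_core * block_bytes * region_concurrency


if __name__ == "__main__":
    per_row = encode_limit_bytes_per_sec(0.2, 250, 60)
    print(f"per-row estimate: {per_row / 1e6:.1f} MB/s")
    # Changing the block size does not change the per-block estimate.
    for block_kb in (64, 256, 1024):
        per_block = block_based_limit(block_kb * 1024, 250, 0.2, 60)
        print(f"block size {block_kb} KB: {per_block / 1e6:.1f} MB/s")
```

With the numbers from the example (0.2 ms per row, 250-byte rows, `region-concurrency` of 60), both functions give the same limit for every block size, which is why the per-row phrasing is the less misleading one.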