-
Notifications
You must be signed in to change notification settings - Fork 441
feat: blocks backup / restore #4950
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
Signed-off-by: Norman <[email protected]>
🛠 PR Checks Summary🔴 Pending initial approval by a review team member, or review from tech-staff Manual Checks (for Reviewers):
Read More🤖 This bot helps streamline PR reviews by verifying automated checks and providing guidance for contributors and reviewers. ✅ Automated Checks (for Contributors):🟢 Maintainers must be able to edit this pull request (more info) ☑️ Contributor Actions:
☑️ Reviewer Actions:
📚 Resources:Debug
|
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
| for height := startHeight; height <= endHeight; height++ { | ||
| block := b.store.LoadBlock(height) | ||
| if block == nil { | ||
| return fmt.Errorf("block store returned nil block for height %d", height) | ||
| } | ||
|
|
||
| data, err := amino.Marshal(block) | ||
| if err != nil { | ||
| return err | ||
| } | ||
|
|
||
| if err := stream.Send(&backuppb.StreamBlocksResponse{Data: data}); err != nil { | ||
| return err | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ajnavarro coment from #3946 (comment)
Some things to comment on this:
- Fetching blocks one by one from the K/V storage is extremely slow. We need to use iterators for that.
- We are getting byte arrays from the K/V storage, the LoadBlock method is unmarshalling that to a Block struct, after that we marshal again blocks to a byte array. After that, we unmarshal a StreamBlocksResponse into protobuf... That's a lot of unnecessary processing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After my first tests:
current implementation:
go run main.go -o blocks-backup -remote http://localhost:4242 6.60s user 1.93s system 67% cpu 12.581 total
go run main.go -o blocks-backup -remote http://localhost:4242 7.14s user 2.08s system 68% cpu 13.543 total (over 78.84k blocks)
I will try to improve it with your suggestions, I'll keep you updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On commit 3382675
We removed the one pair of marshal/unmarshal
and we got:
go run main.go -o blocks-backup -remote http://localhost:4242 3.57s user 2.01s system 40% cpu 13.702 total
Around 2x faster
| if !skipVerification { | ||
| if err := state.Validators.VerifyCommit( | ||
| chainID, firstID, first.Height, second.LastCommit); err != nil { | ||
| return fmt.Errorf("invalid commit (%d:%X): %w", first.Height, first.Hash(), err) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from @ajnavarro comment #3946 (comment)
Can we add a flag to skip commit verification? If we trust the backup, it can improve import speed.
| } | ||
| } | ||
|
|
||
| bcR.store.SaveBlock(first, firstParts, second.LastCommit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from @ajnavarro comment #3946 (comment):
We have to use insert Batches here. It will be an order of magnitude faster.
Working on top of #3946
Fixes #1827
Description (thanks @n0izn0iz )
It has a single method StreamBlocks that take a start and end height. If end height is 0 it will stream to the latest height. It is disabled by default and require enabling it in the config.toml
The tar format was chosen to bundle blocks since it's widely supported and efficient. The zstandard format was chosen for compression because it's fast, has a good compression ratio and is widely supported.
It will start at the current node height + 1.
The restore command can only restore at backupEndHeight-1 because I did not figure a way to commit block n without block n+1. I'd gladly take ideas on how to do that.
The backup is fast enough for now IMO (< 20min for test5 on my macbook) but can be optimized because it's not parallelized.
The restore bottleneck seems to be the gnovm currently but I would need to profile to be sure.
How to create a backup
cd contribs/tm2backup tm2backup -o blocks-backup -remote http://localhost:4242How to create a node from a backup