-
Notifications
You must be signed in to change notification settings - Fork 844
feat(evm): add database with separate storage for block data #4485
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
234838d to
03fde43
Compare
ef6921e to
d8ee531
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces a new blockdb.Database implementation that separates EVM block data (headers, bodies, receipts) from the shared KV store into dedicated height-indexed databases. This optimization aims to reduce KV store size and compaction overhead.
Key changes:
- Implements routing logic to store block data at/above a minimum height in height-indexed databases while maintaining backward compatibility via fallback reads
- Adds background migration to move existing canonical block data from the KV store to height-indexed databases
- Provides deferred initialization support for scenarios like state sync where the minimum block height isn't initially known
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| vms/evm/database/blockdb/database.go | Core database implementation with key routing, read/write operations, and batch support |
| vms/evm/database/blockdb/migrator.go | Background migration logic with progress tracking, compaction, and pause/resume capabilities |
| vms/evm/database/blockdb/database_test.go | Comprehensive tests for database operations, initialization scenarios, and edge cases |
| vms/evm/database/blockdb/migration_test.go | Tests for migration completion, resumption, and data integrity during migration |
| vms/evm/database/blockdb/helpers_test.go | Test utilities for creating blocks, comparing data, and controlling migration |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
d8ee531 to
10b2fbe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
10b2fbe to
7d7cb8b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
vms/evm/database/blockdb/database_test.go:1
- [nitpick] The comment uses 'blocks 1-3' but the code implements i >= 1 && i <= 3 without parentheses. While operator precedence makes this correct, adding parentheses would improve clarity:
migrated := (i >= 1) && (i <= 3).
// Copyright (C) 2019-2025, Ava Labs, Inc. All rights reserved.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
3cad123 to
2c6713d
Compare
| stateDB database.Database, | ||
| evmDB ethdb.Database, | ||
| dbPath string, | ||
| allowDeferredInit bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why might someone want to defer? Why would the consumer know the minimum block height later, but not now? I'm thinking that this could instead be defaultMinimBlockHeight uint64, which you use instead of 1 when the current argument is false.
The InitBlockDBs method could then be unexported and the heightsDBReady bool field (which feels like a race-condition waiting to happen) removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So for context, I added this to address the state sync use case.
Currently, evm database is created on vm initialization. At that point, if state sync is enabled, we don't know what the height of the first block we will be fetching until we get the state sync summary. While we can just set min height to 1, this allows us to init the block dbs once we know the first block height we need to store is and call InitBlockDBs manually (See the an example here)
I'm open to alternative solutions. The main goal here is to being able to set the min height of the dbs to the first non genesis block we need when state sync is enabled to reduce the size of the db index file. Alternatively, we can also not do that and just use min height = 1. This will lead to the db index files being few GBs larger though.
vms/evm/database/blockdb/database.go
Outdated
| _ ethdb.Database = (*Database)(nil) | ||
| _ ethdb.Batch = (*batch)(nil) | ||
|
|
||
| migratorDBPrefix = []byte("migrator") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to prefix our prefixes to guarantee that they never clash with a geth value? Like "ava-migrator"?
Is there a downside to doing that? FWIW they use b for block bodies and blt- for something else so we don't need to worry about a substring clash and starting with a is (at first glance) safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the prefix on the stateDB, which is different than the evmDB, which contains the evm data. The evmDB currently has the ethedb prefix and clients should be passing a different prefixed db when creating the blockdb.Database (See a draft usage of this database usage here)
Current convention seems to be having the caller creating the prefixdb. But alternatively, we can also just wrap the stateDB with prefix db in New if you think that might be clearer.
vms/evm/database/blockdb/database.go
Outdated
| // Since the prefixes should never be changed, we can avoid libevm changes by | ||
| // duplicating them here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
Just noting in case any other reviewer asks me about libevm.
| customtypes.Register() | ||
| params.RegisterExtras() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are these needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is needed by the core.GenerateChainWithGenesis call in createBlocksToAddr in helpers_test.go.
Without them we get a nil pointer error on coreth/params.GetExtra and coreth/plugin/evm/customtypes.GetHeaderExtra
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated block creation to not use core.GenerateChainWithGenesis so we can remove this and also not import from graft/coreeth.
2561499
| "github.com/ava-labs/avalanchego/database/prefixdb" | ||
| "github.com/ava-labs/avalanchego/utils/logging" | ||
|
|
||
| heightindexdb "github.com/ava-labs/avalanchego/x/blockdb" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for reviewers: x/blockdb is planned to be moved to database/heightindexdb (see #4520).
I renamed it here to avoid confusing this evm database package (blockdb) with the current x/blockdb package.
2c6713d to
e4d9473
Compare
b570207 to
080b1a5
Compare
Why this should be merged
Adds a
ethdb.Databaseimplementation that moves EVM block data out of the shared KV store into dedicated height-indexed databases optimized for block storage, reducing KV store size and compaction cost.heightindexdb; all other data remains in the underlying KV store.How this works
The
blockdb.Databasewrapsethdb.Databaseand routes block data by key prefix, while leaving non-block data in the KV store.allowDeferredInit, initialization can wait until the minimum block height is known (e.g., when state sync is enabled). Database min height is persisted and cannot be changed without recreating the block databases.Migration
Separated to #4771
How this was tested
blockdbenabled running for >4 weeks.Need to be documented in RELEASES.md?
No