-
Notifications
You must be signed in to change notification settings - Fork 138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Combine domain payloads and provide on-the-fly migration #3664
Combine domain payloads and provide on-the-fly migration #3664
Conversation
This commit: 1. Combines all domain (non-atree) payloads into one account (non-atree) payload per account. 2. Combines all domain (atree) storage maps into one account (atree) storage map per account. 3. Uses on-the-fly (OTF) migration when an account is modified (write ops). Currently, accounts store data on-chain under pre-defined domains, such as "storage". Each domain requires domain payload (8-byte non-atree payload) and domain storage map payload (atree payload). Also, each payload requires ~2 mtrie nodes (~2x96 byte overhead). New domains were added in Cadence 1.0 and domain payloads count increased to 150 million on Sept. 4 (was 80 million pre-spork). Nearly 25% of total payloads on-chain are 8-byte domain payloads. Each account on mainnet has an average of ~4 domain payloads and ~4 domain storage maps. This commit creates 1 account (non-atree) payload and 1 account (atree) storage map per account, eliminating all domain (non-atree) payloads and domain storage maps for that given account. Based on preliminary estimates using Sept. 17, 2024 mainnet state, this approach can: - eliminate mtrie nodes: -425 million (-28.5%) - reduce payload count: -174 million (also -28.5%) This commit also includes on-the-fly migration so we can see improvements to accounts that have write activity without requiring downtime. Given this, we won't see the full benefits/impact until all accounts (including idle accounts) are eventually migrated (e.g. using full migration or other means).
Cadence Benchstat comparisonThis branch with compared with the base branch onflow:feature/combine-domain-payloads-and-domain-storage-maps commit a56e521 Collapsed results for better readability
|
This is for completeness.
To test the impact of this PR, I created a full migration program yesterday and ran it on recent mainnet data (Nov 1, 2024 state). The full migration program on a test vm produced expected results for reducing number of payloads and mtrie nodes by 28.7% each. 🎉 The plan is to deploy this PR as HCU and use its on-the-fly migration, so we won't see full impact until all accounts are eventually migrated (including idle and read-only accounts). Also ran storage health check and diff-states after migration on Nov 4, 2024 with no problems detected. |
@turbolent and Cadence team, many thanks for code review session today! Next steps (as discussed):
I will also cleanup and share the private gist I presented in the meeting. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work! I think we can maybe merge this into the feature branch to start with, and then we can open follow-up PRs as discussed, and finally have a review of the final feature branch when merging it into master
b13fdba
into
feature/combine-domain-payloads-and-domain-storage-maps
Closes #3584
Description
This PR:
NOTE: This requires HCU to deploy. Full impact (e.g. eliminating 425+ million mtrie nodes, etc.) won't be seen until all accounts (including idle accounts with no write ops) are eventually migrated.
Context
Currently, accounts store data on-chain under pre-defined domains, such as "storage". Each domain requires domain payload (8-byte non-atree payload) and domain storage map payload (atree payload). Also, each payload requires ~2 mtrie nodes (~2x96 byte overhead).
New domains were added in Cadence 1.0 and domain payloads count increased to 150 million on Sept. 4 (was 80 million pre-spork). Nearly 25% of total payloads on-chain are 8-byte domain payloads. Each account on mainnet has an average of ~4 domain payloads and ~4 domain storage maps.
Solution
This PR creates 1 account (non-atree) payload and 1 account (atree) storage map per account, eliminating all domain (non-atree) payloads and domain storage maps for that given account.
Based on preliminary estimates using Sept. 17, 2024 mainnet state, this approach can:
This commit also includes on-the-fly migration so we can see improvements to accounts that have write activity without requiring downtime. Given this, we won't see the full benefits/impact until all accounts (including idle accounts) are eventually migrated (e.g. using full migration or other means).
Full Migration Test Results (Nov 4, 2024)
Full migration (on test vm) using this PR produced expected results:
Storage health check and diff-states on the migrated state did not detect any problems. 🎉
NOTE: On-the-fly migration will not produce full impact until all accounts (including idle and read-only accounts) are eventually migrated.
Initial Code Review Session (Nov 5, 2024)
We agreed on-the-fly migration is feasible because migrating each account avoids unbounded cost: accounts average 4.3 domains each and we can only migrate a max of 9 domains.
No bugs or showstoppers were found in the PR (yet) but the code can be refactored (e.g. more decoupling to reduce complexity). I used up ~12 out of 15 devdays to open PR and run full migration test.
Great suggestion by Bastian: it would be OK to replace domain name string with integer.
Bastian also suggested looking into possibility of avoiding HCU by using feature flag.
We agreed to create feature branch for this PR and open additional PRs to feature branch.
TODO
Next:
master
branchFiles changed
in the Github PR explorer