runtime/Optimize accounts loading (one pass)#1157
runtime/Optimize accounts loading (one pass)#1157HaoranYi wants to merge 3 commits intoanza-xyz:masterfrom
Conversation
21b2fd7 to
985c590
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1157 +/- ##
=========================================
- Coverage 82.7% 82.7% -0.1%
=========================================
Files 876 876
Lines 371130 371181 +51
=========================================
+ Hits 307273 307314 +41
- Misses 63857 63867 +10 |
62a6488 to
e15a1a9
Compare
| // programs in top-level instructions and not passed as instruction accounts. | ||
| let program_account = account_shared_data_from_program(&program); | ||
| (program.account_size, program_account, 0) | ||
| } else { |
There was a problem hiding this comment.
This optimization, which set program account data to empty, is not applicable
any more because we no longer prime program cache before loading the accounts.
There is also a change on the reported execution errors when a transaction is
called for uninitialized or buffer program accounts. The original code will
throw "InvalidAccountData" error, while this PR will throw
"InvalidProgramForExecution" error. The reason is that original code will mark
the program account executable and failed on a later check when execute the
program. This PR will fail on executable check in this function early on.
Fortunately, this change won't affect consensus because program errors are not
part of the consensus.
7658cc1 to
e0d2ab3
Compare
|
I think the massive increase in account load time can be explained by the fact that we are still loading missing program accounts twice. And now the second read is not in the read cache combined with the fact that program accounts are big cause this to be a lot. However, this would only make sense if it is roughly proportional to the program cache misses. Can you add program cache misses to the account load time plot so we can see if it correlates? |
|
Hm, problem is that the account load time has a very high base line while the program cache misses have a very low base line. So it can't explain it completely and there must be some other effect going on. |
|
In master, we are doing a filtering pass on the accounts before actually loading the account. The filtering pass could be "prefetch" the accounts into memory by os, which benefit the loading. The filtering timing is counted into prog_cache stats in master, but no longer exists for this PR. That's why we see a big decrease in prog_cache time. In this PR, we are not filtering and loading the accounts directly. We don't have the benefit of prefetching. Probably, that's why aaccount load_us is much higher than the base line. |
|
So, in other words it is the same time spent, just measured / counted in a different category. Makes sense. |
svm/src/account_loader.rs
Outdated
| if message.is_writable(i) { | ||
| .load_account_with(key, |account| { | ||
| // only cache account that is not writable and not having program owners. | ||
| !is_account_writable && !program_owners.contains(account.owner()) |
There was a problem hiding this comment.
Can you test how much performance changes if you remove the !is_account_writable &&? I suspect it would be barely measurable, as program management instructions are quite rare. Also, the writeability should not determine if it goes into the read cache or not.
There was a problem hiding this comment.
Can you run one more test with .load_account_with(key, |account| false), effectively disabling the read caching altogether?
There was a problem hiding this comment.
So total replay just gets a little more variance with the read caching enabled.
But you did not disable it in both places where load_account_with() is used and missing program loading still does the read cached load right?
There was a problem hiding this comment.
Both places are disabled.
The missing program loading already have the callback to return false.
There was a problem hiding this comment.
So where else is the read cache populated then? The blue read cache hits graph is not zero.
There was a problem hiding this comment.
There are many other places that load accounts and populate read cache besides transaction processor.
- programdata account
- program owner account
- sysvar account
- vote account
- feature account
- buildin account
- nounce account
- fee account
- withdraw/deposit
The following commit lists several places that we populate read cache.
cc2bacc
Some of them should be optimized to skip populating read cache obviously. For example, vote account, and sysvar accounts and fee paying accounts. And some may need more context, i.e. nounce accounts?
I think this change is out of scope of this PR. Probably, we need another project to review all the places that we load accounts and decide whether it is valuable to store the accounts into the cache.
It is unfortunate that we choose the default to put the account into read cache when loading them. Probably, we should have defaulted to not putting account into read cache.
There was a problem hiding this comment.
#1405 will avoid inserting the accounts into read ache for items 8/9.
|
@HaoranYi ping? are you going to attempt #1408 (comment)? |
| if message.is_writable(i) { | ||
| .load_account_with(key, |_account| { | ||
| // cache the account in read-only cache | ||
| true |
Yes, this is addressed. |
cdc896b to
81bf453
Compare
| assert_eq!(account.lamports(), 0); | ||
| } | ||
|
|
||
| #[test] |
There was a problem hiding this comment.
The following tests are adapted from corresponding tests in transaction_processor.rs, and moved here, since we get rid of fn filter_program_accounts() from transaction_processor.rs.
dae3892 to
67c17ea
Compare
|
Rebase the PR on top of master to pick up recent svm changes. |
|
I'm profiling this now, ideally I'd like to get it in before we branch 2.0 |
8391d59 to
d977050
Compare
|
rebase to master to pick up audit fixes. |
d977050 to
5764c5f
Compare
|
rebase again. |
053166b to
8098948
Compare
8098948 to
50d4708
Compare
|
Rebase on master. |
50d4708 to
0ac1a61
Compare
|
#6036 is what we are currently aiming for |
|
Closing as #6036 has landed. |






Problem
During transaction execution, we load accounts twice - first during filtering program accounts, and second for loading the actual accounts before execution. And by default, loading program accounts will insert the loaded program accounts into read cache, which pollutes read cache unnecessarily. This is inefficient and can be optimized.
This is an alternative idea for #1056 and #1180
Summary of Changes
load_with(), which takes a callback to indicate whether to insert the loaded account intoread_cache.read_cacheafter loading.Performance Comparision
Account loading time increased, program cache time decreased. The decrease of program cache time is more than the increase of accounts loading time. Total replay time decreased.
Fixes #