Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add BatchPreload to decode slabs in parallel and cache #404

Merged
merged 2 commits into from
May 14, 2024

Conversation

fxamacker
Copy link
Member

Updates #394

This PR adds BatchPreload which decodes slabs in parallel and stores decoded slabs in cache for later retrieval.

Migration benchmark (not available yet) will use https://github.com/onflow/flow-go/tree/feature/atree-inlining-cadence-v0.42

Casual microbenchmark (on my busy desktop):

                           │  before.txt  │              after.txt               │
                           │    sec/op    │    sec/op     vs base                │
StorageRetrieve/10-12         36.23µ ± 3%   35.33µ ±  4%        ~ (p=0.075 n=10)
StorageRetrieve/100-12        469.6µ ± 8%   124.3µ ±  0%  -73.52% (p=0.000 n=10)
StorageRetrieve/1000-12       6.678m ± 7%   2.303m ± 20%  -65.51% (p=0.000 n=10)
StorageRetrieve/10000-12      29.81m ± 2%   12.26m ±  5%  -58.86% (p=0.000 n=10)
StorageRetrieve/100000-12    303.33m ± 1%   88.40m ±  1%  -70.86% (p=0.000 n=10)
StorageRetrieve/1000000-12     3.442 ± 1%    1.137 ±  3%  -66.96% (p=0.000 n=10)
geomean                       12.34m        4.816m        -60.98%

                           │  before.txt  │              after.txt              │
                           │     B/op     │     B/op      vs base               │
StorageRetrieve/10-12        21.59Ki ± 0%   21.59Ki ± 0%       ~ (p=1.000 n=10)
StorageRetrieve/100-12       219.8Ki ± 0%   224.7Ki ± 0%  +2.24% (p=0.000 n=10)
StorageRetrieve/1000-12      2.266Mi ± 0%   2.272Mi ± 0%  +0.27% (p=0.000 n=10)
StorageRetrieve/10000-12     21.94Mi ± 0%   22.14Mi ± 0%  +0.91% (p=0.000 n=10)
StorageRetrieve/100000-12    215.3Mi ± 0%   218.5Mi ± 0%  +1.50% (p=0.000 n=10)
StorageRetrieve/1000000-12   2.211Gi ± 0%   2.212Gi ± 0%  +0.05% (p=0.000 n=10)
geomean                      6.919Mi        6.976Mi       +0.82%

                           │ before.txt  │              after.txt               │
                           │  allocs/op  │  allocs/op   vs base                 │
StorageRetrieve/10-12         76.00 ± 0%    76.00 ± 0%       ~ (p=1.000 n=10) ¹
StorageRetrieve/100-12        745.0 ± 0%    759.0 ± 0%  +1.88% (p=0.000 n=10)
StorageRetrieve/1000-12      7.161k ± 0%   7.153k ± 0%  -0.11% (p=0.000 n=10)
StorageRetrieve/10000-12     70.77k ± 0%   70.58k ± 0%  -0.27% (p=0.000 n=10)
StorageRetrieve/100000-12    711.9k ± 0%   709.7k ± 0%  -0.31% (p=0.000 n=10)
StorageRetrieve/1000000-12   7.115M ± 0%   7.077M ± 0%  -0.54% (p=0.000 n=10)
geomean                      22.93k        22.95k       +0.11%

  • Targeted PR against main branch
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work
  • Code follows the standards mentioned here
  • Updated relevant documentation
  • Re-reviewed Files changed in the Github PR explorer
  • Added appropriate labels

@fxamacker fxamacker added enhancement New feature or request performance labels May 9, 2024
@fxamacker fxamacker self-assigned this May 9, 2024
@fxamacker fxamacker force-pushed the fxamacker/add-nondeterministic-fast-commit branch from cfb364c to 7162eab Compare May 9, 2024 15:01
@fxamacker fxamacker force-pushed the fxamacker/add-batch-preload branch 2 times, most recently from 9b5be88 to d6baae9 Compare May 9, 2024 17:33
The intended use for BatchPreload is to speedup migrations.

BatchPreload decodes slabs in parallel and stores decoded
slabs in cache for later retrieval.  This is useful for
migration program when most or all slabs are expected to
be migrated.

                           │  before.txt  │              after.txt               │
                           │    sec/op    │    sec/op     vs base                │
StorageRetrieve/10-12         36.23µ ± 3%   35.33µ ±  4%        ~ (p=0.075 n=10)
StorageRetrieve/100-12        469.6µ ± 8%   124.3µ ±  0%  -73.52% (p=0.000 n=10)
StorageRetrieve/1000-12       6.678m ± 7%   2.303m ± 20%  -65.51% (p=0.000 n=10)
StorageRetrieve/10000-12      29.81m ± 2%   12.26m ±  5%  -58.86% (p=0.000 n=10)
StorageRetrieve/100000-12    303.33m ± 1%   88.40m ±  1%  -70.86% (p=0.000 n=10)
StorageRetrieve/1000000-12     3.442 ± 1%    1.137 ±  3%  -66.96% (p=0.000 n=10)
geomean                       12.34m        4.816m        -60.98%

                           │  before.txt  │              after.txt              │
                           │     B/op     │     B/op      vs base               │
StorageRetrieve/10-12        21.59Ki ± 0%   21.59Ki ± 0%       ~ (p=1.000 n=10)
StorageRetrieve/100-12       219.8Ki ± 0%   224.7Ki ± 0%  +2.24% (p=0.000 n=10)
StorageRetrieve/1000-12      2.266Mi ± 0%   2.272Mi ± 0%  +0.27% (p=0.000 n=10)
StorageRetrieve/10000-12     21.94Mi ± 0%   22.14Mi ± 0%  +0.91% (p=0.000 n=10)
StorageRetrieve/100000-12    215.3Mi ± 0%   218.5Mi ± 0%  +1.50% (p=0.000 n=10)
StorageRetrieve/1000000-12   2.211Gi ± 0%   2.212Gi ± 0%  +0.05% (p=0.000 n=10)
geomean                      6.919Mi        6.976Mi       +0.82%

                           │ before.txt  │              after.txt               │
                           │  allocs/op  │  allocs/op   vs base                 │
StorageRetrieve/10-12         76.00 ± 0%    76.00 ± 0%       ~ (p=1.000 n=10) ¹
StorageRetrieve/100-12        745.0 ± 0%    759.0 ± 0%  +1.88% (p=0.000 n=10)
StorageRetrieve/1000-12      7.161k ± 0%   7.153k ± 0%  -0.11% (p=0.000 n=10)
StorageRetrieve/10000-12     70.77k ± 0%   70.58k ± 0%  -0.27% (p=0.000 n=10)
StorageRetrieve/100000-12    711.9k ± 0%   709.7k ± 0%  -0.31% (p=0.000 n=10)
StorageRetrieve/1000000-12   7.115M ± 0%   7.077M ± 0%  -0.54% (p=0.000 n=10)
geomean                      22.93k        22.95k       +0.11%
@fxamacker fxamacker force-pushed the fxamacker/add-batch-preload branch from d6baae9 to a459d96 Compare May 9, 2024 18:11
@fxamacker
Copy link
Member Author

Looks like validation and storage health checks passed tonight with this PR added to atree migration program at:

Need to take another look at logs to confirm.

@fxamacker
Copy link
Member Author

@ramtinms @turbolent PTAL 🙏

storage.go Outdated Show resolved Hide resolved
Copy link
Contributor

@ramtinms ramtinms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, only consideration is maybe just add some comments or warning about when to use this method.

@fxamacker fxamacker force-pushed the fxamacker/add-batch-preload branch from f521009 to aa2ee90 Compare May 13, 2024 20:37
Copy link
Member

@turbolent turbolent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea and nice work! 👏

@fxamacker fxamacker changed the base branch from fxamacker/add-nondeterministic-fast-commit to feature/array-map-inlining May 14, 2024 14:17
@fxamacker fxamacker merged commit 73e00ec into feature/array-map-inlining May 14, 2024
3 checks passed
@turbolent
Copy link
Member

@fxamacker Does this also have to get ported to the 1.0 branch?

@fxamacker
Copy link
Member Author

@fxamacker Does this also have to get ported to the 1.0 branch?

@turbolent I will port it for completeness, but it is only needed if we use the optimization for Cadence 1.0 migration without atree inlining.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants