Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[repostore] Retrieve empty blocks #513

Merged
merged 3 commits into from
Aug 21, 2023
Merged

Conversation

emizzle
Copy link
Contributor

@emizzle emizzle commented Aug 14, 2023

Empty blocks were not retrievable by the RepoStore, and this was causing integration tests to fail when K > 1, as empty blocks were used for padding protected manifests. The failures, however, were only being witnessed on ARM64 (Apple silicon M2's and likely M1's, though untested), and not on AMD64, for reasons that are still unknown. @dryajov was looking into checking that the erasure coding was indeed being done correctly.

As CacheStore is not used in the node, update the Datastore used in the erasure coding tests to be a RepoStore. This ensures that the K > 1 cases are being tested, where they will produce empty padding blocks in the erasure-coded manifests.
Also added tests for all empty block handling blockstore operations. This showed there was an ambiguous identifier present for `hasBlock`, so one of the two `hasBlock` definitions was removed in `repostore`.
@emizzle
Copy link
Contributor Author

emizzle commented Aug 17, 2023

Once I added the empty block tests, which calls hasBlock, there was an ambiguous call:

Error: ambiguous call; both repostore.hasBlock(self: RepoStore, cid: Cid) [proc declared in /Users/egonat/repos/status-im/nim-codex/codex/stores/repostore.nim(339, 6)] and repostore.hasBlock(self: RepoStore, cid: Cid) [method declared in /Users/egonat/repos/status-im/nim-codex/codex/stores/repostore.nim(244, 8)] match for: (RepoStore, Cid)

It appears that hasBlock was duplicated in repostore but was not being called anywhere else in the tests, or maybe even the codebase, so adding the hasBlock test for empty blocks was a good thing.

@emizzle emizzle merged commit 9cecb68 into master Aug 21, 2023
8 checks passed
@emizzle emizzle deleted the fix/empty-blocks-repostore branch August 21, 2023 02:51
@veaceslavdoina
Copy link
Contributor

@emizzle, in case you need in the future other ARM64/AMD64, we can continue our discussion here - https://github.com/codex-storage/infra-codex/issues/14.

@dryajov
Copy link
Contributor

dryajov commented Aug 21, 2023

For context, in addition to this fix, there is an additional erasure coding issue:

Currently we don't support erasure coding a single block, there has to be at least two blocks to be able to erasure code them.

Leopard will not encode a single buffer, it will simply copy it back. Blocks are treated as a list of symbols and each block occupies one buffer. So, for all intents and purposes, when using one block, we're sending a vector of L*K elements, but K=1 in this case, so leopard simply copies it back.

The options are:

Split the block into two and erasure code the two pieces together
Erasure code with a dummy block
???

In short, it's not a big deal, its a corner case that is probably not worth addressing right now, and when we need to address it, the fix should be relatively straightforward.

For your tests to work, just generate at least two random blocks, other than that it should works as expected.

@dryajov
Copy link
Contributor

dryajov commented Aug 21, 2023

btw, is there an issue to track this already, I couldn't find it.

@emizzle
Copy link
Contributor Author

emizzle commented Aug 21, 2023

Erasure code with a dummy block

@dryajov, isn't the K=1 case already being padded with an empty block, hence the issue arising where empty blocks were not retrievable from the RepoStore? Or by "dummy" block, do you mean something different than an empty block?

The issue that seems out of place is that AMD CPUs are not padding with an empty block in the K=1 case (and therefore no retrieval of empty blocks needed), whereas this issue was present on ARM silicon.

Would the fix for the K=1 case be limited to AMD CPUs?

@dryajov
Copy link
Contributor

dryajov commented Aug 21, 2023

Or by "dummy" block, do you mean something different than an empty block?

Other than empty block. Yeah, I'm guessing it was a combination of several factors, but I've been able to reproduce it with a single block, when there are more than one blocks, K=2, it works as expected, at least with my experiments.

@emizzle
Copy link
Contributor Author

emizzle commented Aug 22, 2023

Issue created: #522

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants