Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: generate sharded DAG index on client and invoke w index/add #1451

Merged
merged 15 commits into from
May 15, 2024

Conversation

alanshaw
Copy link
Member

This PR generates index data for blocks as CAR shards are constructed. Once all shards have been successfully sent, a sharded DAG index is encoded and stored (using blob/add) and then index/add is invoked with the CID of the index as a parameter.

@@ -7,7 +7,7 @@ import varint from 'varint'
*/

/** Byte length of a CBOR encoded CAR header with zero roots. */
const NO_ROOTS_HEADER_LENGTH = 17
const NO_ROOTS_HEADER_LENGTH = 18
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦 FML this was wrong. Either it was wrong the whole time or @ipld/car started encoding the header differently.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would be great to just export it from @ipld/car

@@ -19,7 +19,7 @@ export async function randomBytes(size) {
} else {
crypto.getRandomValues(chunk)
}
size -= bytes.length
size -= chunk.length
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FML we were creating random data in tests where the last 65,536 bytes were random but the rest was ZEROs.

for (const slice of slices.values()) {
slice[0] += diff
}
controller.enqueue(await encodeCAR(blocks, slices, rootCID))
Copy link
Member Author

@alanshaw alanshaw May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting a root CID in the last CAR shard is a massive PITA. I wish we didn't do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, I thought we were putting it in the first shard.

Let's put it as a separate thing that is not in the set of shards. It would be easier to change that now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I would like all shards to be rootless and track the DAG root in upload/add and/or content claims.

@alanshaw alanshaw force-pushed the feat/generate-sharded-dag-index-in-client branch from f8fc6d5 to 6d691de Compare May 14, 2024 16:13
Copy link
Contributor

@gammazero gammazero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments. I am not sure about the placement of the DAG block. Maybe better to change the index to keep it separate.

Review of test.index.js was not done as I still need to understand how that code works.


// add the CAR shard itself to the slices
meta.slices.set(meta.cid.multihash, [0, meta.size])
shardIndexes.push(meta.slices)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check with ShardedDAGIndex, but I thought this was supposed to be the first slice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are sorted when encoded anyway so it doesn't matter where you put it?

{ message: 'failed index/add invocation' }
)
})
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will any of these tests fail if the DAG block is in the wrong place or missing? If no, then that would be good. Or, change the index to put the DAGblock in a different place than the others.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but there probably should be a test that asserts the additional slice that spans the whole blob is added...

@@ -7,7 +7,7 @@ import varint from 'varint'
*/

/** Byte length of a CBOR encoded CAR header with zero roots. */
const NO_ROOTS_HEADER_LENGTH = 17
const NO_ROOTS_HEADER_LENGTH = 18
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would be great to just export it from @ipld/car

packages/upload-client/src/index.js Outdated Show resolved Hide resolved
Copy link
Contributor

@vasco-santos vasco-santos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

some suggestions for follow ups in review, + filecoin/offer things I believe MUST be added

@@ -183,5 +195,24 @@ async function uploadBlockStream(
if (!root) throw new Error('missing root CID')

await Upload.add(conf, root, shards, options)

const index = ShardedDAGIndex.create(root)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should we have an option to not index? I mean, let's do this by default, but we wanted this to be optional, perhaps just an issue where user can tweak their indexing intentions would be good enough for now

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

}

// Store the index in the space
const indexDigest = await Blob.add(conf, indexBytes.ok, options)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this is great to store in the space, but we will need to look into deleting this on remove from space to not have its usage. Maybe we can just fill in issue now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

packages/upload-client/README.md Outdated Show resolved Hide resolved
packages/upload-client/README.md Outdated Show resolved Hide resolved
packages/upload-client/README.md Outdated Show resolved Hide resolved
packages/upload-client/src/index.js Outdated Show resolved Hide resolved
packages/w3up-client/README.md Outdated Show resolved Hide resolved
packages/w3up-client/README.md Outdated Show resolved Hide resolved
@alanshaw alanshaw merged commit a6d9026 into main May 15, 2024
15 checks passed
@alanshaw alanshaw deleted the feat/generate-sharded-dag-index-in-client branch May 15, 2024 15:11
alanshaw pushed a commit that referenced this pull request May 15, 2024
🤖 I have created a release *beep* *boop*
---


##
[15.0.0](upload-api-v14.0.0...upload-api-v15.0.0)
(2024-05-15)


### ⚠ BREAKING CHANGES

* delegated capabilities required to use `uploadFile`, `uploadDirectory`
and `uploadCAR` have changed. In order to use these methods your agent
will now need to be delegated `blob/add`, `index/add`, `filecoin/offer`
and `upload/add` capabilities. Note: no code changes are required.

### Features

* generate sharded DAG index on client and invoke w `index/add`
([#1451](#1451))
([a6d9026](a6d9026))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
alanshaw pushed a commit that referenced this pull request May 15, 2024
🤖 I have created a release *beep* *boop*
---


##
[19.0.0](access-v18.4.0...access-v19.0.0)
(2024-05-15)


### ⚠ BREAKING CHANGES

* delegated capabilities required to use `uploadFile`, `uploadDirectory`
and `uploadCAR` have changed. In order to use these methods your agent
will now need to be delegated `blob/add`, `index/add`, `filecoin/offer`
and `upload/add` capabilities. Note: no code changes are required.

### Features

* generate sharded DAG index on client and invoke w `index/add`
([#1451](#1451))
([a6d9026](a6d9026))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
alanshaw added a commit that referenced this pull request May 15, 2024
🤖 I have created a release *beep* *boop*
---


##
[15.0.0](upload-client-v14.1.1...upload-client-v15.0.0)
(2024-05-15)


### ⚠ BREAKING CHANGES

* delegated capabilities required to use `uploadFile`, `uploadDirectory`
and `uploadCAR` have changed. In order to use these methods your agent
will now need to be delegated `blob/add`, `index/add`, `filecoin/offer`
and `upload/add` capabilities. Note: no code changes are required.

### Features

* generate sharded DAG index on client and invoke w `index/add`
([#1451](#1451))
([a6d9026](a6d9026))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Alan Shaw <[email protected]>
vasco-santos added a commit that referenced this pull request Jun 4, 2024
🤖 I have created a release *beep* *boop*
---


##
[14.0.0](w3up-client-v13.1.1...w3up-client-v14.0.0)
(2024-06-04)


### ⚠ BREAKING CHANGES

* **upload-api:** integrate agent store for idempotence &
invocation/receipt persistence
([#1444](#1444))
* delegated capabilities required to use `uploadFile`, `uploadDirectory`
and `uploadCAR` have changed. In order to use these methods your agent
will now need to be delegated `blob/add`, `index/add`, `filecoin/offer`
and `upload/add` capabilities. Note: no code changes are required.

### Features

* generate sharded DAG index on client and invoke w `index/add`
([#1451](#1451))
([a6d9026](a6d9026))
* **upload-api:** integrate agent store for idempotence &
invocation/receipt persistence
([#1444](#1444))
([c9bf33e](c9bf33e))


### Fixes

* check for blob/accept receipts before blob/add is concluded
([#1459](#1459))
([462518c](462518c))
* export blob client
([#1485](#1485))
([7944077](7944077))
* rename blob and index client capabilities
([#1478](#1478))
([17e3a31](17e3a31))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Vasco Santos <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants