Commit 58b62bf

Add baseName to ExternalContent (#71)
* Update rsa modulus endianess to match most protocols
* Whoops, fix `low-endian` -> `little-endian`
* Typo
* Remove `namefilter.md`
* Write verification algorithm
* Remove `TODO` from the allowed words list...
* Change from SHA3 to Blake3, more domain separation
* More domain separation strings.
* Fix `hashToPrime` usages
* Spelling
* Switch from AES-GCM to XChaCha20-Poly1305
* Fix constant
* Remove `blockCount` restriction
* Update rationale
* Woords
* Add `baseName` to `ExternalContent`
* Small clarification
* Expand on Section 3.1.4
* Improve references to `baseName` and `name`
* Use "its" instead of "the".
* Add "kiB" as a valid word
* Try using "KB" instead of "kB"
* Add "KiB" as word

1 parent e597c02 commit 58b62bf

2 files changed

+18
-8
lines changed

.github/workflows/words-to-ignore.txt

+1
@@ -81,6 +81,7 @@ ethereum
 exponentiate
 extractable
 golang
+KiB
 idempotence like omnipotence
 inline like outline
 little-endian

spec/private-wnfs.md

+17-8
@@ -171,6 +171,7 @@ type InlineContent = {
 type ExternalContent = {
   "external": {
     key: Key
+    baseName: NameAccumulator
     blockSize: Uint64 // in bytes, at max 262,104
     blockCount: Uint64
   }
@@ -207,11 +208,19 @@ If the `previous` links contain more than one element, then some CIDs MAY refer

 ### 3.1.4 Private File

-Private file content has two variants: inlined or externalized. Externalized content is held as a separate node in the bucket. Inlined content is kept alongside (and thus is decrypted with) the header.
+Private file content has two variants: inlined or externalized. Externalized content stored in separate blocks from the private file block. Inlined content is kept alongside (and thus is decrypted with) the private file block itself.
+
+This makes inline content only suitable for small files, when the content size is much smaller than the IPLD maximum block size (256KiB).
+
+The advantage of inline content is that there's no need for computing `NameAccumulator`s for external content blocks, but the downside is that upon copying a file, you also need to copy the inline content and re-encrypt it with a new key.
+
+It is a sensible default to make use of inline content for file sizes below a certain size threshold, e.g. 10KB.

 #### 3.1.4.1 Externalized Content

-Since external content blocks are separate from the header, they MUST have a unique `NameAccumulator` derived from a random key (to avoid forcing lookups to go through the header). If the key were derived from the header's key, then the file would be re-encrypted e.g. every time the metadata changed. See [sharded file content access] algorithm for more detail.
+Since external content blocks are separate from its header, they each MUST have a `NameAccumulator` that is different than the file's `name` from its header. We allow these names to have an arbitrary `baseName`. For the normal case, the `baseName` is RECOMMENDED to be the file's `name` from its header with the externalized content's encryption `key`, hashed to a prime, added to it as a name segment.
+However, the `baseName` is allowed to be anything else, for instance to support copying or moving a file to a different location without having to re-encrypt all of its data.
+The [sharded file content access] algorithm contains more information about how to derive each externalized block's name from this `baseName`.

 The block size MUST be at least 1 and at maximum $2^{18} - 40 = 262,104$ bytes, as the maximum block size for IPLD is usually $2^{18}$, but 24 initialization vector bytes and 16 authentication tag bytes need to be added to each ciphertext. It is RECOMMENDED to use the maximum block size. An externalized content block is laid out like this:

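Reviewer note on the Section 3.1.4 text in this hunk: the size rules can be sketched in a few lines of TypeScript. This is only an illustration, not code from the spec. `planContent`, `ContentPlan` and `INLINE_THRESHOLD` are hypothetical names, the threshold is the "e.g. 10KB" suggestion read as 10,000 bytes, and computing `blockCount` as the rounded-up quotient of content length and `blockSize` is an assumption the hunk does not state.

```typescript
// Constants taken from the spec text: 2^18 bytes per IPLD block, minus
// 24 initialization vector bytes and 16 authentication tag bytes.
const MAX_IPLD_BLOCK_SIZE = Math.pow(2, 18); // 262,144
const IV_SIZE = 24;
const AUTH_TAG_SIZE = 16;
const MAX_BLOCK_SIZE = MAX_IPLD_BLOCK_SIZE - IV_SIZE - AUTH_TAG_SIZE; // 262,104

// Hypothetical threshold: the spec only suggests "e.g. 10KB".
const INLINE_THRESHOLD = 10 * 1000;

type ContentPlan =
  | { kind: "inline" }
  | { kind: "external"; blockSize: number; blockCount: number };

// Decide between inline and externalized content for a file of
// `byteLength` plaintext bytes.
function planContent(byteLength: number, blockSize: number = MAX_BLOCK_SIZE): ContentPlan {
  if (blockSize < 1 || blockSize > MAX_BLOCK_SIZE) {
    throw new RangeError("blockSize must be between 1 and " + MAX_BLOCK_SIZE);
  }
  if (byteLength < INLINE_THRESHOLD) {
    return { kind: "inline" };
  }
  // Assumption: blocks are filled completely except possibly the last one.
  return { kind: "external", blockSize: blockSize, blockCount: Math.ceil(byteLength / blockSize) };
}
```

With the recommended maximum block size, a file of exactly 262,104 bytes fits in one external block and one more byte requires a second block.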
@@ -262,7 +271,7 @@ However, developers should be aware that such operations wouldn't check the inva

 #### 3.1.6.1 Temporal Key

-Temporal keys give temporal read access to a certain node and its descendants. It MUST be derived from the skip ratchet for that node, incremented to the relevant revision number. This limits the reader to reading from a their earliest ratchet and forward, but never earlier revisions than that. The derivation algorithm MUST be the skip ratchet [key derivation algorithm][/spec/skip-ratchet.md#21-Key-Derivation] with the domain separation string `wnfs/1.0/revision segment derivation from ratchet`.
+Temporal keys give temporal read access to a certain node and its descendants. It MUST be derived from the skip ratchet for that node, incremented to the relevant revision number. This limits the reader to reading from a their earliest ratchet and forward, but never earlier revisions than that. The derivation algorithm MUST be the skip ratchet [key derivation algorithm][skip ratchet key derivation] with the domain separation string `wnfs/1.0/revision segment derivation from ratchet`.

 When added to a private directory, it MUST be encrypted with [AES-KWP] and the private directory's temporal key. This prevents readers with only a snapshot key from gaining revision read access.

@@ -457,19 +466,19 @@ Consider the following diagram. An agent may only have access to some nodes, but

 `getShards : PrivateFile -> Array<NameAccumulator>`

-To calculate the array of HAMT labels for [externalized content], add `key` and `concat(key, encode(i))` for each block index `i` of external content to the file's name like so:
+To calculate the array of HAMT labels for [externalized content], add `concat(key, encode(i))` for each block index `i` of external content to the external file content's `baseName` like so:

 ```ts
-function* shardLabels(key: Key, count: Uint64, name: NameAccumulator): Iterable<NameAccumulator> {
-  for (let i = 0; i < count; i++) {
+function* shardLabels(key: Key, blockCount: Uint64, baseName: NameAccumulator): Iterable<NameAccumulator> {
+  for (let i = 0; i < blockCount; i++) {
     // add returns `name` with the parameter added as a name segment
-    yield name.add(hashToPrime("wnfs/1.0/segment derivation for file block", concat(key, encode(i)), 32))
+    yield baseName.add(hashToPrime("wnfs/1.0/segment derivation for file block", concat([key, encode(i)]), 32))
   }
 }
 ```

+- `key`, `blockCount` and `baseName` are fetched from the `PrivateFile`'s external file content record,
 - `concat` denotes byte array concatenation,
-- `name` is the `NameAccumulator` from the private file's header,
 - `encode` is a function that maps a block index to a little-endian byte array encoding of a 64-bit unsigned integer.

 ## 4.5 Merge
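
Reviewer note on the `shardLabels` hunk: the two helpers the bullet list describes, `concat` and `encode`, are fully determined by the text, so they can be sketched concretely. The sketch below only builds the byte strings that would be handed to `hashToPrime`. It does not implement `hashToPrime` or `NameAccumulator.add`, which are defined elsewhere in the spec. `shardPreimages` is a hypothetical helper name, and using a plain `number` for the block index (limited to safe integers) is a simplification of `Uint64`.

```typescript
// Little-endian encoding of a block index as a 64-bit unsigned integer.
// Simplification: only indices up to 2^53 - 1 are supported here.
function encode(index: number): Uint8Array {
  if (Math.floor(index) !== index || index < 0 || index > 9007199254740991) {
    throw new RangeError("index must be a non-negative safe integer");
  }
  const out = new Uint8Array(8);
  let rest = index;
  for (let b = 0; b < 8; b++) {
    out[b] = rest % 256; // least significant byte first
    rest = Math.floor(rest / 256);
  }
  return out;
}

// Byte array concatenation, taking an array as in the updated hunk.
function concat(parts: Uint8Array[]): Uint8Array {
  let total = 0;
  for (let p = 0; p < parts.length; p++) total += parts[p].length;
  const out = new Uint8Array(total);
  let offset = 0;
  for (let p = 0; p < parts.length; p++) {
    out.set(parts[p], offset);
    offset += parts[p].length;
  }
  return out;
}

// Hypothetical helper: the input to hashToPrime for each block index.
function shardPreimages(key: Uint8Array, blockCount: number): Uint8Array[] {
  const preimages: Uint8Array[] = [];
  for (let i = 0; i < blockCount; i++) {
    preimages.push(concat([key, encode(i)]));
  }
  return preimages;
}
```

Each preimage is the key followed by eight index bytes, so preimages for different block indices under the same key always differ in their last eight bytes.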
