@rvagg rvagg commented Jan 9, 2021

BREAKING CHANGE!

See cborg for additional details.

  • Will be much more strict with data types on encode and decode, only
    IPLD Data Model values will be accepted, all others will be rejected
    (including undefined). No other tags than 42 will be involved in encode
    or decode.
  • Will strictly reject decode of data where integers are encoded using more
    bytes than are required, as per deterministic DAG-CBOR rules.
  • Strictly only allow tag 42 and no others. Also no exotic types like Simple Values.
  • Indefinite length items will not be accepted for decode.
  • BigInt will be used for integers decoded outside of the safe integer range.
    They are also accepted for encodes, within the 64-bit range. The exotic
    bignumber type from borc is no longer supported.
  • Floats are now always encoded as 64-bit, to match the updated DAG-CBOR spec
    and the Go implementation.
  • Uint8Arrays are native through the stack now; Buffer isn't required for
    browser bundling.
  • Dependency tree is greatly simplified, which should result in a smaller
    bundle (although I haven't measured that).
  • Code here is greatly simplified: we're able to remove the object decode
    preparation, hook into the object tokenisation process in cborg and
    piggy-back on that.
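To illustrate the shortest-form integer rule from the second bullet, here's a hypothetical sketch in plain JavaScript of how a deterministic decoder can decide whether a CBOR unsigned integer was encoded with more bytes than required. The names `minimalIntBytes` and `isMinimalEncoding` are illustrative, not cborg's actual API:

```javascript
// CBOR unsigned ints: values 0..23 pack into the initial byte; 24..255 take
// one extra byte (prefix 0x18); then 2 bytes (0x19), 4 bytes (0x1a), and
// 8 bytes (0x1b). Deterministic DAG-CBOR requires the smallest of these.
function minimalIntBytes (n) {
  if (n < 24) return 0            // packed into the initial byte
  if (n <= 0xff) return 1         // 0x18 + 1 byte
  if (n <= 0xffff) return 2       // 0x19 + 2 bytes
  if (n <= 0xffffffff) return 4   // 0x1a + 4 bytes
  return 8                        // 0x1b + 8 bytes
}

// A strict decoder rejects any encoding wider than the minimal form,
// e.g. the value 10 encoded behind a 0x18 prefix.
function isMinimalEncoding (value, extraBytesUsed) {
  return extraBytesUsed === minimalIntBytes(value)
}
```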

I will follow up with a comment about performance characteristics.

We may want to hold off on merging this PR until ipld/specs#342 is resolved: if rejecting the IEEE 754 specials is the way we're going to go, it would be good to get that strictness in here too, since that will be a breaking change as well. I already have flags in cborg to handle this on decode, and encode can be handled here.

@rvagg rvagg requested a review from mikeal January 9, 2021 08:49
rvagg commented Jan 9, 2021

Performance

It's been very hard to pin down a consistent performance story. Because borc and cborg take such different approaches to both encode and decode, results depend heavily on object shape and size, and even on the way the codec is used.

The majority of performance differences come down to byte array handling in JavaScript and Node.js: allocations, copies, and UTF-8 handling. borc's main performance advantage is being able to lean on Node.js' Buffer. This is most visible on decode: borc does an up-front Node.js Buffer heap allocation, memcpys the incoming byte array into it, and decodes from there, which gives it all the performance benefits of Buffer and its various utilities even when you feed it a Uint8Array. So UTF-8 decoding, number reading (DataView is the browser alternative and it sucks) and slicing all come into play here.
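The number-reading difference above can be seen in a tiny self-contained sketch: reading the same big-endian uint32 via the browser-portable DataView route versus Buffer's native reader (these are standard platform APIs; the comparison is illustrative, not borc or cborg code):

```javascript
const bytes = new Uint8Array([0x00, 0x01, 0x02, 0x03])

// Browser-portable path: wrap the bytes in a DataView and read big-endian.
const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength)
const viaDataView = view.getUint32(0) // 0x00010203

// Node.js path: Buffer.from over the same ArrayBuffer is zero-copy, and
// readUInt32BE is typically faster than the DataView equivalent.
const buf = Buffer.from(bytes.buffer, bytes.byteOffset, bytes.byteLength)
const viaBuffer = buf.readUInt32BE(0)
```

Both paths return the same value; the difference is purely in how fast the engine executes them.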

But this comes at a cost, because the "heap" size needs to be known up-front. This is where some of our headaches with borc have come from, and why we added configureEncoder() and its auto-grow feature a couple of years ago to handle larger objects. The heap persists for as long as the codec instance does: decode a very large object and a large-enough heap will persist in memory afterward. Growing the heap has costs too, because you discard the old Buffer and allocate a new one.

cborg doesn't use this approach; it just assumes a Uint8Array and parses through it. It will lean on Buffer utilities where available and where it can get away with it, but not in a way that forces Buffer into the browser: it only tries when !process.browser, and goes via global.Buffer to avoid triggering auto-bundling. Even then it can only make limited use of Buffer, because it has to assume the incoming bytes are simply a Uint8Array.
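A minimal sketch of that kind of guarded feature-detection (simplified for illustration, not cborg's actual code): use Buffer only when it's already present globally, so bundlers never see a bare `require('buffer')` and pull in a polyfill. `toHex` is a made-up example utility:

```javascript
// Detect a usable native Buffer without ever importing it, so browser
// bundlers don't auto-include a Buffer shim.
const useBuffer =
  typeof process !== 'undefined' &&
  !process.browser &&
  typeof globalThis.Buffer === 'function'

function toHex (bytes) {
  if (useBuffer) {
    // Fast path: zero-copy Buffer view over the same memory.
    return globalThis.Buffer
      .from(bytes.buffer, bytes.byteOffset, bytes.byteLength)
      .toString('hex')
  }
  // Portable fallback that works in any environment.
  return Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('')
}
```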

For encode the approach is totally different: borc allocates a plain array, puts bytes in it, and converts it to a Uint8Array at the end. cborg allocates small chunks, fills them up, slices and dices, manages a list of chunks, and concatenates them at the end (mostly incurring memcpys, occasionally just a slice). This is like a write-side version of bl and pretty handy for this type of process.
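The chunked-output strategy can be sketched in a few lines. This is a deliberately simplified illustration of the idea (fixed-size chunks, single final concatenation), not cborg's implementation; `ChunkedWriter` is a made-up name:

```javascript
// Fill fixed-size chunks, keep a list of completed ones, and do a single
// allocation + copy at the end instead of repeatedly growing one buffer.
class ChunkedWriter {
  constructor (chunkSize = 256) {
    this.chunkSize = chunkSize
    this.chunks = []                          // completed chunks
    this.current = new Uint8Array(chunkSize)  // chunk being filled
    this.pos = 0
    this.length = 0
  }

  push (bytes) {
    for (const b of bytes) { // byte-at-a-time for clarity, not speed
      if (this.pos === this.chunkSize) {
        this.chunks.push(this.current)
        this.current = new Uint8Array(this.chunkSize)
        this.pos = 0
      }
      this.current[this.pos++] = b
      this.length++
    }
  }

  concat () {
    // One allocation, then a memcpy per chunk.
    const out = new Uint8Array(this.length)
    let offset = 0
    for (const chunk of this.chunks) {
      out.set(chunk, offset)
      offset += chunk.length
    }
    out.set(this.current.subarray(0, this.pos), offset)
    return out
  }
}
```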

In narrow benchmark situations, configured with enough heap, borc is difficult to beat head-to-head for different object types. I've formed a suite of benchmarks that test against randomised IPLD-style data and the results vary a lot. For one-offs, borc is very slow: its up-front needs are very large and its "fast" mode (which we use here) requires persisting a Decoder instance. I've spent a lot of time chasing minor improvements; that's basically what the last month with cborg has been.

But by the time we consume it here, borc becomes a lot slower. The need to deal strictly with Uint8Arrays means some costly conversions along the way, for both encode and decode. We also do circular reference checking and replace CIDs with Tagged objects for encode, so we end up walking an entire object twice before we even hand it off to borc. cborg does circular reference checking itself, and I created hooks in cborg to handle everything we need to do to prepare and check objects, so there's a single object walk for the whole encode process. Auto-growing the heap also imposes a cost.
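Folding circular-reference detection into a single recursive walk can be sketched as follows. This is a minimal illustration of the technique, not the hook API cborg exposes; `walk` is a made-up name:

```javascript
// Track the ancestor chain in a Set and throw if a container re-appears
// within its own subtree. Removing a node after visiting its children means
// shared (DAG-style) references between siblings are still allowed.
function walk (obj, visit, ancestors = new Set()) {
  if (obj !== null && typeof obj === 'object') {
    if (ancestors.has(obj)) throw new Error('circular reference detected')
    ancestors.add(obj)
    const values = Array.isArray(obj) ? obj : Object.values(obj)
    for (const v of values) walk(v, visit, ancestors)
    ancestors.delete(obj)
  }
  visit(obj)
}
```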

The attempt to remove Buffer from borc and make it friendly for browser bundling more than halved both decode and encode speed by borc's own benchmarks. I think if we ran similar benchmarks at the @ipld/dag-cbor level the differences might be smaller, because this codec forces Uint8Array from the outside anyway. Some of my benchmarking against the branch in that PR seemed to indicate this.

These are some benchmarks consuming @ipld/dag-cbor with borc vs cborg, various sizes of random objects that I'm interested in (using ipld-garbage):

```
rnd-100 @ 1,000
        encode (avg 42 b): cborg @ 145,438 op/s / borc @ 115,065 op/s = 126.4 %
        decode (avg 42 b): cborg @ 227,360 op/s / borc @ 206,573 op/s = 110.1 %
rnd-300 @ 1,000
        encode (avg 106 b): cborg @ 122,396 op/s / borc @ 82,172 op/s = 149 %
        decode (avg 106 b): cborg @ 146,816 op/s / borc @ 160,064 op/s = 91.7 %
rnd-nomap-300 @ 1,000
        encode (avg 84 b): cborg @ 202,189 op/s / borc @ 113,858 op/s = 177.6 %
        decode (avg 84 b): cborg @ 214,355 op/s / borc @ 201,394 op/s = 106.4 %
rnd-nofloat-300 @ 1,000
        encode (avg 117 b): cborg @ 132,985 op/s / borc @ 85,362 op/s = 155.8 %
        decode (avg 117 b): cborg @ 158,400 op/s / borc @ 151,470 op/s = 104.6 %
rnd-nostr-300 @ 1,000
        encode (avg 93 b): cborg @ 63,444 op/s / borc @ 38,413 op/s = 165.2 %
        decode (avg 93 b): cborg @ 71,598 op/s / borc @ 72,840 op/s = 98.3 %
rnd-nostrbyts-300 @ 1,000
        encode (avg 99 b): cborg @ 109,446 op/s / borc @ 55,426 op/s = 197.5 %
        decode (avg 99 b): cborg @ 111,956 op/s / borc @ 117,498 op/s = 95.3 %
rnd-1000 @ 1,000
        encode (avg 296 b): cborg @ 105,936 op/s / borc @ 54,780 op/s = 193.4 %
        decode (avg 296 b): cborg @ 110,225 op/s / borc @ 116,299 op/s = 94.8 %
rnd-2000 @ 1,000
        encode (avg 572 b): cborg @ 99,403 op/s / borc @ 41,749 op/s = 238.1 %
        decode (avg 572 b): cborg @ 92,907 op/s / borc @ 93,719 op/s = 99.1 %
rnd-fil-100 @ 1,000
        encode (avg 39 b): cborg @ 237,360 op/s / borc @ 156,106 op/s = 152.1 %
        decode (avg 39 b): cborg @ 274,942 op/s / borc @ 258,343 op/s = 106.4 %
rnd-fil-300 @ 1,000
        encode (avg 91 b): cborg @ 187,871 op/s / borc @ 106,255 op/s = 176.8 %
        decode (avg 91 b): cborg @ 199,600 op/s / borc @ 186,876 op/s = 106.8 %
rnd-fil-500 @ 1,000
        encode (avg 144 b): cborg @ 169,976 op/s / borc @ 92,259 op/s = 184.2 %
        decode (avg 144 b): cborg @ 163,685 op/s / borc @ 159,083 op/s = 102.9 %
rnd-fil-1000 @ 1,000
        encode (avg 251 b): cborg @ 165,503 op/s / borc @ 75,471 op/s = 219.3 %
        decode (avg 251 b): cborg @ 140,718 op/s / borc @ 135,864 op/s = 103.6 %
rnd-fil-2000 @ 1,000
        encode (avg 459 b): cborg @ 157,214 op/s / borc @ 57,143 op/s = 275.1 %
        decode (avg 459 b): cborg @ 111,776 op/s / borc @ 109,453 op/s = 102.1 %
```

So you can see the wide variability there. I really wanted decode to be much faster than that, but it's difficult to squeeze out any more without improvements in JS TypedArray performance. There are some possibilities for flattening the decode process a little more, but they would cost in other ways, so I'm avoiding that for now.

We also have real-world data to test against. I have a 3.5 GB CAR file containing all the blocks that form a single Filecoin headers graph. I can test streaming read/decode time with both borc and cborg versions of @ipld/dag-cbor, and also a read/decode + encode/write pass, and compare total times. There's a lot of I/O, async and CAR file overhead in these numbers, but the size of the file helps drown that out.

On my machine I get numbers like this:

```
read-only
borc took 698.5 seconds
cborg took 494.9 seconds
difference: 70.8% / 1.41 x

read/write
borc took 4,297.6 seconds
cborg took 894.5 seconds
difference: 20.8% / 4.8 x
```

Both result in the same output CAR file, which helps verify consistent encode. Unfortunately it's not the same CAR file as the input! That's because Filecoin encodes non-UTF-8 data into CBOR strings, and JavaScript can't deal with that. But that's a separate problem to be solved; I have some thoughts about allowing a way to optionally reach in behind during decode, when you know the block you're decoding has invalid UTF-8 strings, so you can at least access the data.
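The UTF-8 problem is easy to demonstrate with the standard TextDecoder API: in fatal mode, invalid bytes cause an outright throw, while the default lenient mode silently substitutes U+FFFD, so a lossless round-trip of such blocks is impossible through JavaScript strings:

```javascript
// "hi" followed by 0xff, which is never valid in UTF-8.
const invalid = new Uint8Array([0x68, 0x69, 0xff])

// Strict decode throws on the invalid byte.
let strictFailed = false
try {
  new TextDecoder('utf-8', { fatal: true }).decode(invalid)
} catch (e) {
  strictFailed = true
}

// Lenient decode "succeeds" but corrupts the data with a replacement
// character, so re-encoding won't reproduce the original bytes.
const lossy = new TextDecoder('utf-8').decode(invalid)
```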

rvagg commented Jan 11, 2021

I've added a commit that disallows NaN, Infinity and -Infinity; this is pending acceptance of ipld/specs#344.
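The check itself can be a single guard, since `Number.isFinite` is false for all three specials. A hypothetical sketch of the idea (`assertEncodableNumber` is an illustrative name, not the actual API):

```javascript
// Reject the IEEE 754 specials on encode: NaN, Infinity and -Infinity all
// fail Number.isFinite, so one predicate covers all three while letting
// ordinary ints and floats through.
function assertEncodableNumber (n) {
  if (typeof n === 'number' && !Number.isFinite(n)) {
    throw new Error(`NaN and Infinity are not encodable: ${n}`)
  }
  return n
}
```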

@rvagg rvagg merged commit 966dd4e into master Jan 19, 2021
@rvagg rvagg deleted the rvagg/cborg branch January 19, 2021 03:06
@matheus23 matheus23 mentioned this pull request Jun 15, 2021
matheus23 added a commit to oddsdk/ts-odd that referenced this pull request Jun 22, 2021
This PR's changes are mostly centered around the `package.json` file:
* Switch from borc to cborg. This gives us more precise typescript types for cbor. ([Here is why js-ipfs switched](ipld/js-dag-cbor#13))
* Remove rollup. It would only apply to the `index.[es5/cjs/umd].js` files; all other modules, e.g. `fs/protocol/private/namefilter.js` (I need that for account recovery), would be unchanged from `tsc`'s output. However, we can just remove rollup by:
  - Adding `"browser": { crypto: false }` to signal to bundlers to ignore `require("crypto")` in dependencies (noble-ec25519 has these requires, but is careful not to actually use them in browser contexts)
  - Making dependencies always `import { Buffer } from 'buffer'`.
  - Creating our `index.min.js` (previously `index.umd.js`) file with something else
* Introduce `esbuild` for creating a convenient browser bundle at `dist/index.min.js`. It's fast. Like, <1 second. (Rollup took >20 secs for just the index.umd.js)
* Move library files to `lib/`. We keep `dist/index.min.js`, though. I think this distinction is nice.
* Introduce export maps. `import "webnative/fs/protocol/private/namefilter"` is now possible instead of `import "webnative/dist/fs/protocol/private/namefilter"`.
* Switch to github actions from travis CI
* Add some utilities for better logging of errors in puppeteer. We now get console errors and unhandled exceptions that happen *in the browser into our jest output*.
* Don't transpile "newer" features like async/await during `tsc`. We require node version >=15, which supports newer features like that anyway.