
Improve integration testing client (CodexClient) and json serialization #514

Merged — 3 commits merged into master from feat/integration/improve-codex-client on Sep 1, 2023

Conversation

@emizzle (Contributor) commented Aug 14, 2023

The current client used for integration testing against the REST endpoints for Codex accepts and passes primitive types. This caused a hard-to-diagnose bug where a `uint` was not being deserialized correctly.

In addition, the json de/serializing done between the CodexClient and REST client was not easy to read and was not tested.

These changes bring non-primitive types to most of the CodexClient functions, allowing us to lean on the compiler to ensure we're providing correct types when doing integration testing. More importantly, a json de/serialization util was created as a drop-in replacement for the std/json lib, with two main differences: field serialization is opt-in (instead of opt-out, as in json_serialization), and serialization errors are captured and logged, making serialization issues much easier to debug.

Note

This PR is based on #513
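
To make the opt-in idea concrete, here is a minimal sketch of one way field-level opt-in serialization can be written in Nim. The `serialize` pragma, the `toJson` name, and the example type are illustrative assumptions, not the exact API introduced in this PR:

```nim
import std/[json, macros]

# A custom pragma: only fields explicitly marked {.serialize.} are serialized.
template serialize*() {.pragma.}

type StorageRequest = object
  duration {.serialize.}: uint64
  reward {.serialize.}: uint64
  internalState: string  # unmarked, so never serialized

proc toJson*[T: object](obj: T): JsonNode =
  result = newJObject()
  for name, value in obj.fieldPairs:
    when value.hasCustomPragma(serialize):
      result[name] = %value

echo toJson(StorageRequest(duration: 100, reward: 42))
# -> {"duration":100,"reward":42}
```

The opt-out alternative would serialize every field unless excluded, which is what makes accidental leakage of internal fields possible.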

Review comment on this snippet:

```nim
) =
  expectJsonKind(expectedType, {expectedKind}, json)

proc fromJson*(
```
Contributor commented:
Why not use nim-json-serialization?

@emizzle (Author) commented Aug 14, 2023:

It was my first choice so that we could keep consistency across the code base, but I opted out of it for a few reasons. nim-json-serialization is so hard to debug that I consistently spend far too much time trying to implement anything. The errors almost always get swallowed, and the application crashes in inconspicuous, hard-to-debug ways. It also doesn't help that most of its guts are written in macros, so even attempting to quickly output the serialization error just to see what it was takes ages, because it's hard to figure out where that is even possible (and even then it's just not that easy). Having a serialization utility inside codex (it could even be its own lib) that is not overtly magical and not written entirely in macros lets us debug, avoid swallowing errors, and do things like opt-in field serialization (versus opt-out, like nim-json-serialization). That was generally the reasoning behind not using it. More discussion around these topics can be found here: https://discord.com/channels/613988663034118151/616299964242460682/789378950484852756.
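
As a concrete illustration of the error-capture point, a minimal sketch (assuming questionable's `?!` Result type, which codex already uses; the names are illustrative, not the actual util from this PR):

```nim
import std/json
import pkg/chronicles
import pkg/questionable/results

# Instead of the error being swallowed deep inside macro-generated code,
# log it and surface it to the caller as a Result.
proc fromJson(_: type int, json: JsonNode): ?!int =
  if json.kind != JInt:
    error "json deserialization failed", expected = "JInt", actual = $json.kind
    return failure("expected JInt but got " & $json.kind)
  success(json.getInt)
```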

Additionally, for the life of me, I could not override serialization of UInt256 (so that it was not hex-encoded) using nim-json-serialization inside of codex/utils/json or codex/rest/json. I could override deserialization (readValue) but not serialization (writeValue), which I really think may boil down to a dependency issue. I'd like to set up some time with you to see if we can work through that, as it was really not obvious and honestly quite frustrating.
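
For reference, the kind of `writeValue` override being described would look roughly like the following. This is a sketch assuming nim-json-serialization's hook signature and stint's decimal `$`, not code verified against the dependency versions in question:

```nim
import pkg/json_serialization
import pkg/stint

# Serialize UInt256 as a decimal string instead of hex-encoding it.
proc writeValue*(writer: var JsonWriter, value: UInt256) =
  writer.writeValue($value)
```

If a competing `writeValue` for UInt256 is pulled in by another dependency, the compiler may resolve to that one instead, which would match the "dependency issue" theory.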

Contributor replied:

Thanks! That's really helpful context.

Ah, yah, I can see that. I ran into some of the difficulty of debugging nim-json-serialization (and the TOML one) in the conf.nim module, and that was a very basic change. There was a lot of complicated magic, and it took me a while to work out what was even causing the issue, which was a pain.

> Additionally, for the life of me, I could not override serialization of UInt256 (so that it was not hex-encoded) using nim-json-serialization inside of codex/utils/json or codex/rest/json. I could override deserialization (readValue) but not serialization (writeValue), which I really think may boil down to a dependency issue.

That's a pain. 🤔 I thought I saw some UInt256 serialization stuff in stint or somewhere. That could explain some of those issues.

> I'd like to set up some time with you to see if we can work through that, as it was really not obvious and honestly quite frustrating.

Sure! I'd be up for that sometime; I like figuring out puzzles. It'd be nice to chat with you too; I don't think we've had the chance to yet. :)

I'll take a closer look at your setup as well to see if I can spot anything. It looked pretty straightforward.

P.S. My favorite serde library in Nim is msgpack4nim. It can serialize directly into or from an object, which avoids allocations, making it nearly the fastest one I've benchmarked. It rivaled flatbuffers, which is famed for its no-copy speed!

@emizzle (Author) replied:

> I thought I saw some UInt256 serialization stuff in stint or somewhere. That could explain some of those issues.

It could have been https://github.com/status-im/nim-eth/blob/master/eth/common/eth_types_json_serialization.nim#L26-L30. Importing these was definitely something I tried, but I could not get it working.

> Sure! I'd be up for that sometime; I like figuring out puzzles. It'd be nice to chat with you too; I don't think we've had the chance to yet. :)

Let's do it 💪 DM me in discord and we can set up a date/time.

> P.S. My favorite serde library in Nim is msgpack4nim. It can serialize directly into or from an object, which avoids allocations, making it nearly the fastest one I've benchmarked. It rivaled flatbuffers, which is famed for its no-copy speed!

msgpack4nim looks pretty cool! After looking at all the places we use serialization in codex, I do think a universal solution should be implemented at some point. The issue I see right now is that nim-chronicles uses nim-json-serialization due to our use of the json sink. I don't think it's possible to use a different serializer for the json sink, unfortunately. We could possibly remove the json sink (using formatIt instead, as sketched below), but I imagine we'll probably need the json log output for consumption in e.g. ElasticSearch or Loggly at some point.
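
The `formatIt` route would look roughly like this (editor's sketch; `logReward` is a made-up example). Note that `formatIt` only affects chronicles' text sinks, which is exactly why the json sink stays tied to nim-json-serialization:

```nim
import pkg/chronicles
import pkg/stint

# Render UInt256 as a decimal string in the textlines/textblocks sinks.
chronicles.formatIt(UInt256): $it

proc logReward(reward: UInt256) =
  info "received reward", reward
```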

Base automatically changed from fix/empty-blocks-repostore to master August 21, 2023 02:51
@emizzle force-pushed the feat/integration/improve-codex-client branch from 5141d85 to 2164c9f on August 28, 2023
@AuHau (Member) previously approved these changes Aug 30, 2023 and left a comment:

LGTM! Some small nitpicks that are non-blocking 👍

codex/node.nim (outdated):

```diff
@@ -254,7 +254,7 @@ proc requestStorage*(
   ## - Run the PoR setup on the erasure dataset
   ## - Call into the marketplace and purchasing contracts
   ##
-  trace "Received a request for storage!", cid, duration, nodes, tolerance, reward
+  trace "Received a request for storage!", cid, duration, nodes, tolerance, reward, proofProbability, nodes, tolerance, reward, collateral, expiry
```
@AuHau:

There is duplicated logged information here (nodes, tolerance).

```nim
import std/json
import std/strutils
import pkg/stew/byteutils
# import std/json
```
@AuHau:
This can be removed.

```diff
 # For debugging you can enable logging output with debugX = true
 # You can also pass a string in same format like for the `--log-level` parameter
 # to enable custom logging levels for specific topics like: debug2 = "INFO; TRACE: marketplace"

-twonodessuite "Integration tests", debug1 = false, debug2 = false:
+twonodessuite "Integration tests", debug1 = "INFO;TRACE:node,restapi,erasure", debug2 = false:
```
@AuHau:

I guess we can turn off the debug1 logging before merging?

@emizzle merged commit 37b3d99 into master on Sep 1, 2023
8 checks passed
@emizzle deleted the feat/integration/improve-codex-client branch on September 1, 2023
emizzle added a commit that referenced this pull request Sep 29, 2023
## Problem
When Availabilities are created, the bytes in the Availability are reserved in the repo, so those bytes on disk cannot otherwise be written to. When a node receives a request for storage and a previously created Availability matches, an attempt is made to fill a slot in the request (more accurately, the request's slots are added to the SlotQueue, and those slots are eventually processed). During download, bytes that were reserved for the Availability were released as they were written to disk. To prevent more bytes from being released than were reserved in the Availability, the Availability was marked as used during the download, so that no other requests would match it and no new downloads (and byte releases) would begin.

The unfortunate downside is that the number of Availabilities a node has determines its download concurrency capacity. If, for example, a node creates a single Availability that covers all the disk space the operator is willing to use, only one download could occur at a time, meaning the node could miss out on storage opportunities.

## Solution
To alleviate the concurrency issue, each time a slot is processed, a Reservation is created, which takes size (aka reserved bytes) away from the Availability and stores it in the Reservation object. This can be done as many times as needed, as long as enough bytes remain in the Availability. Concurrent downloads are therefore no longer limited by the number of Availabilities; instead, they are more likely limited by the SlotQueue's `maxWorkers`.
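
Schematically, the accounting described above looks something like this (an editor's sketch with illustrative names, not the actual sales module API):

```nim
import pkg/questionable/results

type
  Availability = ref object
    size: uint64  # bytes still available for new Reservations
  Reservation = ref object
    size: uint64  # bytes set aside for a single slot

proc createReservation(availability: Availability,
                       slotSize: uint64): ?!Reservation =
  if slotSize > availability.size:
    return failure("not enough bytes remaining in Availability")
  # move bytes from the Availability into the new Reservation
  availability.size -= slotSize
  success(Reservation(size: slotSize))

proc deleteReservation(availability: Availability,
                       reservation: Reservation) =
  # bytes that were never written to disk are returned to the Availability
  availability.size += reservation.size
```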

From a database design perspective, an Availability has zero or more Reservations.

Reservations are persisted in the RepoStore's metadata, along with Availabilities. The metadata store key path for Reservations is `meta / sales / reservations / <availabilityId> / <reservationId>`, while Availabilities are stored one level up, e.g. `meta / sales / reservations / <availabilityId>`, allowing all Reservations for an Availability to be queried (this is not currently needed, but may be useful when the work to restore Availability size is implemented; more on this later).

### Lifecycle
When a reservation is created, its size is deducted from the Availability, and when a reservation is deleted, any remaining size (bytes not written to disk) is returned to the Availability. If the request finishes, is cancelled (expired), or an error occurs, the Reservation is deleted (and any undownloaded bytes returned to the Availability). In addition, when the Sales module starts, any Reservations that are not actively being used in a filled slot are deleted.

Having a Reservation persist until after a storage request is completed will allow the originally set Availability size to be reclaimed once a request contract has completed. That feature is yet to be implemented, but the work in this PR is a step toward enabling it.

### Unknowns
Reservation size is determined by the `StorageAsk.slotSize`. If, during download, more bytes than `slotSize` are attempted to be downloaded, the Reservation update will fail, and the state machine will move to a `SaleErrored` state, deleting the Reservation. This will likely prevent the slot from being filled.

### Notes
Based on #514