Improve integration testing client (CodexClient) and json serialization #514
Conversation
) =
  expectJsonKind(expectedType, {expectedKind}, json)

proc fromJson*(
Why not use nim-json-serialization?
It was my first choice, so that we could keep consistency across the code base, but I opted out of it for a few reasons. nim-json-serialization is so hard to debug that I consistently spend far too much time trying to implement anything. The errors almost always get swallowed, and the application crashes in inconspicuous and hard-to-debug ways. It also doesn't help that most of its guts are written in macros, so even attempting to quickly output the serialization error just to see what it was takes ages, because it's hard to figure out where that is even possible (and even then it's just not that easy). Having a serialization utility inside of codex (it could even be its own lib) that is not overtly magical and not written entirely in macros gives us the ability to debug, to not swallow errors, and to do things like opt-in field serialization (versus opt-out, like nim-json-serialization). That was generally the reasoning behind not using nim-json-serialization. More discussion around these topics can be found here: https://discord.com/channels/613988663034118151/616299964242460682/789378950484852756.
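For illustration, opt-in field serialization can be built around a marker pragma. This is a minimal sketch only; the names `serialize` and `toJson` here are hypothetical and not necessarily the actual codex/utils/json API:

```nim
import std/[json, macros]

template serialize*() {.pragma.} # marker pragma: only annotated fields are serialized

proc toJson*[T: object](obj: T): JsonNode =
  ## Serialize only the fields explicitly opted in with {.serialize.}.
  result = newJObject()
  for name, value in obj.fieldPairs:
    when value.hasCustomPragma(serialize):
      result[name] = %value

type Person = object
  name {.serialize.}: string # included in the output
  secret: string             # skipped: not annotated

echo Person(name: "Alice", secret: "hunter2").toJson() # {"name":"Alice"}
```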
Additionally, for the life of me, I could not override serialization of UInt256 (so that it was not hex-encoded) using nim-json-serialization inside of codex/utils/json or codex/rest/json. I could override deserialization (readValue) but not serialization (writeValue), which I really think may boil down to a dependency issue. I'd like to maybe set up some time with you to see if we can work through that, as it was really not obvious and honestly quite frustrating.
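For reference, the kind of override being attempted would typically look roughly like the sketch below. This is a hedged reconstruction: the exact signature expected by the pinned nim-json-serialization version may differ, which could itself be the dependency issue mentioned above:

```nim
import pkg/json_serialization
import pkg/stint

# Sketch: render UInt256 as a decimal string rather than hex.
# If the writeValue that actually gets picked up comes from another
# package (e.g. a transitive nim-eth import), this override may never
# be seen by the serializer.
proc writeValue*(writer: var JsonWriter, value: UInt256) =
  writer.writeValue($value) # stint's `$` renders decimal
```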
Thanks! That's really helpful context.
Ah, yah, I can see that. I ran into some of the difficulty in debugging nim-json-serialization and the TOML one in the conf.nim module, and that was a very basic change. There seemed to be a lot of complicated magic, and it took me a while to debug what was even causing the issue, which was a pain.
> Additionally, for the life of me, I could not override serialization of UInt256 (so that it was not hex-encoded) using nim-json-serialization inside of codex/utils/json or codex/rest/json. I could override deserialization (readValue) but not serialization (writeValue), which I really think may boil down to a dependency issue.
That's a pain. 🤔 I thought I saw some UInt256 serialization stuff in stint or somewhere. That could explain some of those issues.
> I'd like to maybe set up some time with you to see if we can work through that, as it was really not obvious and honestly quite frustrating.
Sure! I'd be up for that sometime, and I like figuring out puzzles. It'd be nice to chat with you too; I don't think we've had the chance to. :)
I'll take a closer look at your setup as well to see if I can spot anything. It looked pretty straightforward.
P.S. My favorite serde library in Nim is msgpack4nim. It can serialize directly into or from an object. This avoids allocations, making it nearly the fastest one I've benchmarked. It rivaled flatbuffers, which is famed for its no-copy speed!
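As a quick taste, here's a minimal sketch based on msgpack4nim's documented pack/unpack API (the `Point` type is made up for the example):

```nim
import msgpack4nim

type Point = object
  x, y: int

let p = Point(x: 1, y: 2)
let bytes = pack(p) # serialize straight from the object into a string buffer
var q: Point
unpack(bytes, q)    # deserialize straight into an existing object
assert q == p
```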
> I thought I saw some UInt256 serialization stuff in stint or somewhere. That could explain some of those issues.

It could have been https://github.com/status-im/nim-eth/blob/master/eth/common/eth_types_json_serialization.nim#L26-L30. Importing these was definitely something I tried, but I could not get it working.
> Sure! I'd be up for that sometime, and I like figuring out puzzles. It'd be nice to chat with you too; I don't think we've had the chance to. :)

Let's do it 💪 DM me on Discord and we can set up a date/time.
> P.S. My favorite serde library in Nim is msgpack4nim. It can serialize directly into or from an object. This avoids allocations, making it nearly the fastest one I've benchmarked. It rivaled flatbuffers, which is famed for its no-copy speed!

msgpack4nim looks pretty cool! After looking at all the places where we use serialization in codex, I do think a universal solution should be implemented at some point. The issue I see right now is that nim-chronicles uses nim-json-serialization due to our use of the json sink. I don't think it's possible to use a different serializer for a json sink, unfortunately. We could possibly remove the json sink (using formatIt instead), but I imagine we'll probably need to use the json log output for consumption in e.g. ElasticSearch or Loggly at some point.
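For context, chronicles' formatIt hook customizes how a single type is rendered in log output. A minimal sketch, assuming the standard chronicles API:

```nim
import pkg/chronicles
import pkg/stint

# Render UInt256 as a decimal string in log output, instead of
# whatever the underlying json serializer would produce.
formatIt(UInt256):
  $it
```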
LGTM! Some small nitpicks that are non-blocking 👍
codex/node.nim (Outdated)

@@ -254,7 +254,7 @@ proc requestStorage*(
   ## - Run the PoR setup on the erasure dataset
   ## - Call into the marketplace and purchasing contracts
   ##
-  trace "Received a request for storage!", cid, duration, nodes, tolerance, reward
+  trace "Received a request for storage!", cid, duration, nodes, tolerance, reward, proofProbability, nodes, tolerance, reward, collateral, expiry
There is duplicated logged information here (nodes, tolerance).
codex/rest/json.nim (Outdated)

  import std/json
  import std/strutils
  import pkg/stew/byteutils
+ # import std/json
This can be removed.
  # For debugging you can enable logging output with debugX = true
  # You can also pass a string in same format like for the `--log-level` parameter
  # to enable custom logging levels for specific topics like: debug2 = "INFO; TRACE: marketplace"

- twonodessuite "Integration tests", debug1 = false, debug2 = false:
+ twonodessuite "Integration tests", debug1 = "INFO;TRACE:node,restapi,erasure", debug2 = false:
I guess we can turn off the debug1 logging before merging?
## Problem

When Availabilities are created, the amount of bytes in the Availability are reserved in the repo, so those bytes on disk cannot be written to otherwise. When a request for storage is received by a node, if a previously created Availability is matched, an attempt will be made to fill a slot in the request (more accurately, the request's slots are added to the SlotQueue, and eventually those slots will be processed). During download, bytes that were reserved for the Availability were released (as they were written to disk). To prevent more bytes from being released than were reserved in the Availability, the Availability was marked as used during the download, so that no other requests would match the Availability, and therefore no new downloads (and byte releases) would begin.

The unfortunate downside to this is that the number of Availabilities a node has determines its download concurrency capacity. If, for example, a node creates a single Availability that covers all the available disk space the operator is willing to use, that single Availability would mean that only one download could occur at a time, meaning the node could potentially miss out on storage opportunities.

## Solution

To alleviate the concurrency issue, each time a slot is processed, a Reservation is created, which takes size (aka reserved bytes) away from the Availability and stores them in the Reservation object. This can be done as many times as needed, as long as there are enough bytes remaining in the Availability. Therefore, concurrent downloads are no longer limited by the number of Availabilities. Instead, they would more likely be limited by the SlotQueue's `maxWorkers`.

From a database design perspective, an Availability has zero or more Reservations. Reservations are persisted in the RepoStore's metadata, along with Availabilities. The metadata store key path for Reservations is `meta / sales / reservations / <availabilityId> / <reservationId>`, while Availabilities are stored one level up, e.g. `meta / sales / reservations / <availabilityId>`, allowing all Reservations for an Availability to be queried (this is not currently needed, but may be useful when work to restore Availability size is implemented; more on this later).

### Lifecycle

When a Reservation is created, its size is deducted from the Availability, and when a Reservation is deleted, any remaining size (bytes not written to disk) is returned to the Availability. If the request finishes, is cancelled (expired), or an error occurs, the Reservation is deleted (and any undownloaded bytes are returned to the Availability). In addition, when the Sales module starts, any Reservations that are not actively being used in a filled slot are deleted.

Having a Reservation persisted until after a storage request is completed will allow the originally set Availability size to be reclaimed once a request contract has been completed. This is a feature that is yet to be implemented; however, the work in this PR is a step in that direction.

### Unknowns

Reservation size is determined by the `StorageAsk.slotSize`. If, during download, more bytes than `slotSize` are attempted to be downloaded, then the Reservation update will fail, and the state machine will move to a `SaleErrored` state, deleting the Reservation. This will likely prevent the slot from being filled.

### Notes

Based on #514
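The size accounting described above might look roughly like the following sketch. The types and proc names here are illustrative only, not the actual codex sales module API:

```nim
type
  Availability = object
    id: string
    size: uint64 # bytes still unreserved in the repo

  Reservation = object
    availabilityId: string
    size: uint64 # bytes reserved but not yet written to disk

proc createReservation(avail: var Availability, size: uint64): Reservation =
  ## Move `size` bytes out of the Availability and into a new Reservation.
  ## This can be repeated while the Availability has bytes remaining, so
  ## concurrent downloads are no longer limited to one per Availability.
  doAssert size <= avail.size, "not enough bytes remaining in the Availability"
  avail.size -= size
  Reservation(availabilityId: avail.id, size: size)

proc deleteReservation(avail: var Availability, reservation: Reservation) =
  ## Return any undownloaded bytes to the Availability, e.g. when the
  ## request finishes, is cancelled (expired), or errors.
  avail.size += reservation.size
```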
The current client used for integration testing against the REST endpoints for Codex accepts and passes primitive types. This caused a hard-to-diagnose bug where a uint was not being deserialized correctly. In addition, the json de/serializing done between the CodexClient and REST client was not easy to read and was not tested.

These changes bring non-primitive types to most of the CodexClient functions, allowing us to lean on the compiler to ensure we're providing correct typings when doing integration testing. More importantly, a json de/serialization util was created as a drop-in replacement for the std/json lib, with the main two differences being that field serialization is opt-in (instead of opt-out, as in the case of json_serialization) and serialization errors are captured and logged, making debugging serialization issues much easier.
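To illustrate the error-capturing idea, here is a hedged sketch built on std/json; `fromJsonSafe` is a made-up name, not the util's actual API:

```nim
import std/json

proc fromJsonSafe[T](node: JsonNode, _: typedesc[T]): T =
  ## Surface deserialization failures instead of swallowing them.
  try:
    node.to(T)
  except CatchableError as e:
    # A real implementation would log via chronicles rather than echo.
    echo "deserialization of ", $T, " failed: ", e.msg
    raise

# Usage:
type Wrapper = object
  value: uint

let n = parseJson("""{"value": 42}""")
echo n.fromJsonSafe(Wrapper)
```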
Note: This PR is based on #513