-
Notifications
You must be signed in to change notification settings - Fork 543
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wide merkle query #16093
Wide merkle query #16093
Conversation
| Subtree of 'addr | ||
(** What are the 2^k nodes at depth k from the given prefix | ||
address **) | ||
(* TODO: Properly handle versioning *) | ||
(* TODO: Consider additional query to verify subtree suport, for | ||
softfork compatibility *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, we aren't able to modify this type here, we'll have to add a converter. Some nodes won't have upgraded, so we need to be able to have a fallback.
For example, it might be natural to change e.g.
| What_child_hashes of 'addr
into e.g.
| What_child_hashes of {address: 'addr; depth: int}
and then the response can be tweaked from
| Child_hashes_are of 'hash * 'hash
to
| Child_hashes_are of 'hash array
I think this gives a natural downgrade path
function
| What_child_hashes {address; depth = _} -> V1.What_child_hashes address
| ...
and upgrade for the response of
function
| Child_hashes_are (x, y) -> V2.Child_hashes_are [x; y]
| ...
As currently written, a node running this code won't be able to sync to the network :'(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be sure I was thinking the same, for now I added it in V1 just to make it compile until I understand better how versioning works, but I agree it should fail right now.
But if I add a V2 I can add another variant or make arbitrary changes to the type, right? or are there additional limitations when making a new version? and for that reason you suggest modifying Child_hashes_are
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I understand the example, but I see a possible issue:
Send V2 request, receiver converts it into V1 and sends back V1 response, that is converted to V2 response after arrival, although a trivial one.
But in the opposite case sending a V1 and receiving a V2 response shouldn't work, as the V1 node can convert V2 back to V1.
Or does the example assume both types have the same representation and a V2 can be deserialized as V1 without knowing anything about V2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The versioning system should ensure that a V1
request never receives a V2
response. The RPC protocol does version negotiation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reference, this is where the version negotiation happens.
let to_latest = function | ||
| What_child_hashes addr -> | ||
What_child_hashes addr | ||
| What_contents addr -> | ||
What_contents addr | ||
| Num_accounts -> | ||
Num_accounts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not just Fn.id
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it may not be necessary, as I was ultimately decided to not change the query, and if it worked without having to_latest
it should still work without it, I'll remove it.
| Subtree of 'hash list | ||
(** The subtree rooted on the requested address has these leaves *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why add this instead of modifying Child_hashes_are
? As written, now we have two responses with overlapping behaviour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: array
is usually better, especially where we have a fixed data size that we won't be changing (like here).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is no reason anymore, I'll unify on Child_hashes_are
.
| Subtree of 'addr | ||
(** What are the 2^k nodes at depth k from the given prefix | ||
address **) | ||
(* TODO: Properly handle versioning *) | ||
(* TODO: Consider additional query to verify subtree suport, for | ||
softfork compatibility *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reference, this is where the version negotiation happens.
!ci-build-me |
2 similar comments
!ci-build-me |
!ci-build-me |
(* TODO: parameterize *) | ||
let subtree_depth : index = 4 | ||
|
||
(* Provides addresses at an specific depth from this address *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Provides addresses at a specific depth from this address
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per this comment, nit:
(* Provides addresses at an specific depth from this address *) | |
(* Provides addresses at a specific depth from this address *) |
(** The requested address's children have these hashes. | ||
May be any power of 2 number of children, and not necessarily | ||
immediate children *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addresses'
expect_children t addr exp_hash ; | ||
Linear_pipe.write_without_pushback_if_open t.queries | ||
(desired_root_exn t, What_child_hashes addr) ) | ||
else expect_children t addr exp_hash ; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should be calling expect subtree here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe this should just be the old version as discussed
let expect_subtree : 'a t -> Addr.t -> Hash.t -> unit = | ||
fun t parent_addr expected -> | ||
[%log' trace t.logger] | ||
~metadata: | ||
[ ("subtree prefix address", Addr.to_yojson parent_addr) | ||
; ("hash", Hash.to_yojson expected) | ||
] | ||
"Expecting subtree at address $parent_address, expected: $hash" ; | ||
Addr.Table.add_exn t.waiting_parents ~key:parent_addr ~data:expected |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can remove this, in lieu of expect children?
| `Non_power ] = | ||
fun t addr nodes -> | ||
(* let prefix_depth = Addr.depth addr in *) | ||
if Int.is_pow2 (List.length nodes) then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this work for 2^0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. You could always send back the same hash and leave the node stuck in a loop here. This needs to reject (List.length nodes) < 2
.
|
||
let merge_many : Hash.t list -> index -> Hash.t = | ||
fun nodes depth -> | ||
let final_depth = depth + subtree_depth in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe check if the final depth is greater than the original ledger depth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe you should check if the final depth is < ledger depth - children tree depth, which in this case is 6. I am referring to the What Contents are query returning trees with 2^6 children.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also check if responses are 2^4 for the what children are query at some point
Option.value_exn ~message:"Forgot to wait for a node" | ||
(Addr.Table.find t.waiting_parents addr) | ||
in | ||
let merged = merge_many nodes (ledger_depth - Addr.depth addr) in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is really clever bravo! the idea being you only get back leaves and can recompute the intermediary hashes by computing parent siblings. This way you encode more information over the network and require less total network bandwidth.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't compile, and isn't integrated with the networking layer. Please fix it.
| `Non_power ] = | ||
fun t addr nodes -> | ||
(* let prefix_depth = Addr.depth addr in *) | ||
if Int.is_pow2 (List.length nodes) then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. You could always send back the same hash and leave the node stuck in a loop here. This needs to reject (List.length nodes) < 2
.
let ledger_depth = MT.depth mt in | ||
let addresses = intermediate_range ledger_depth a subtree_depth in | ||
let get_hash a = MT.get_inner_hash_at_addr_exn mt a in | ||
let hashes = List.map addresses ~f:get_hash in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should use Or_error.try_with
to catch and handle any errors, like the previous version of the code did.
@@ -221,6 +235,28 @@ end = struct | |||
|
|||
type query = Addr.t Query.t | |||
|
|||
(* TODO: parameterize *) | |||
let subtree_depth : index = 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with the comment, this should be parameterised.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About that, the initial idea was for the request to specify this depth, but as I ended up changing only the response that wouldn't be possible.
It could be set on the responder node but that would potentially allow the responder to send very big responses to make you waste time.
What did you have in mind?
@@ -221,6 +235,28 @@ end = struct | |||
|
|||
type query = Addr.t Query.t | |||
|
|||
(* TODO: parameterize *) | |||
let subtree_depth : index = 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type here is int
, not the local index
.
| `Non_power ] = | ||
fun t addr nodes -> | ||
(* let prefix_depth = Addr.depth addr in *) | ||
if Int.is_pow2 (List.length nodes) then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This condition makes the protocol 'sender decides' instead of 'requester requests'. You should consider the original design I described: the requester sends the depth below the address that they would like, then the sender responds with a depth less than or equal to that, and the requester only accepts the message if the length is less than or equal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue with changing the request is that then V2 nodes wouldn't be able to connect to V1 nodes to sync, and no synced V2 node would come into existence to allow other V2s to sync, which wouldn't be softfork compatible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is mediated at the network layer, as I had already mentioned here.
| What_child_hashes a -> | ||
let ledger_depth = MT.depth mt in | ||
let addresses = intermediate_range ledger_depth a subtree_depth in | ||
let get_hash a = MT.get_inner_hash_at_addr_exn mt a in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When this hits the ledger DB, looping over these one at a time will be slow. You should use one of the batch functions instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which function are you thinking about?
I considered that but there doesn't seem to be functions for arbitrary batch reads.
There is get_all_accounts_rooted_at_exn
which provides subtrees but not with variable depth, and internally seems to be looping over a list and reading each entry individually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that the underlying Ledger.get_hash_batch
isn't exposed. I suppose this can wait for later then.
| `Non_power ] = | ||
fun t addr nodes -> | ||
(* let prefix_depth = Addr.depth addr in *) | ||
if Int.is_pow2 (List.length nodes) then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
List.length
is significantly more expensive than Array.length
. Please use array
above, as requested before
( "hashes sent for subtree on address $address merge \ | ||
to $actualmerge but we expected $expectedmerge" | ||
, [ ("actualmerge", Hash.to_yojson actual) | ||
; ("expectedmerge", Hash.to_yojson expected) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer snake-case: $actual_merge
, $expected_merge
.
(* TODO: give some error as this shouldn't really happen *) | ||
[] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This definitely can happen if you receive 1 hash in response. Please handle this properly.
!ci-build-me |
1 similar comment
!ci-build-me |
This PR should be against |
!ci-build-me |
2 similar comments
!ci-build-me |
!ci-build-me |
!ci-nightly-me |
…mon; use block window duration from runtime config and not compile config; remove compile_config from impacted contexts
@@ -223,6 +223,8 @@ module T = struct | |||
; max_action_elements : int | |||
; zkapp_cmd_limit_hardcap : int | |||
; minimum_user_command_fee : Currency.Fee.Stable.Latest.t | |||
; sync_ledger_default_subtree_depth : int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These really shouldn't be a part of Genesis_constants
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure where it should be. It seems like we are working towards removing compile config. Any suggestion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe create something new like Daemon_constants
that should include a lot of what's in Genesis_constants
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the daemon section of the runtime config makes sense, or as a flag. I'd be fine if it followed the same logic as snark_worker_fee.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two constants (three if we extracted account_subtree_height
later) and they need to be threaded through a few different modules- bootstrap_controller, block_producer and consensus. However, they are used only in syncable_ledger.ml and so exposing it along the way doesn’t seem nice in terms of readability. Grouping and passing them in contexts seemed better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved it back to compile time. I'll propose this in a different PR
0d0f6dc
to
c14616d
Compare
!ci-build-me |
!ci-nightly-me |
Co-authored-by: Sai Vegasena <[email protected]>
!ci-build-me |
!ci-build-me |
src/lib/sync_handler/sync_handler.ml
Outdated
fun ~frontier hash query ~context ~trust_system -> | ||
let (module Context) = context in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: ~context:(module Context)
module Context_subtree_depth32 = Make_context (struct | ||
let sync_ledger_max_subtree_depth = 3 | ||
|
||
let sync_ledger_default_subtree_depth = 2 | ||
end) | ||
|
||
module Context_subtree_depth81 = Make_context (struct | ||
let sync_ledger_max_subtree_depth = 8 | ||
|
||
let sync_ledger_default_subtree_depth = 1 | ||
end) | ||
|
||
module Context_subtree_depth82 = Make_context (struct | ||
let sync_ledger_max_subtree_depth = 8 | ||
|
||
let sync_ledger_default_subtree_depth = 2 | ||
end) | ||
|
||
module Context_subtree_depth86 = Make_context (struct | ||
let sync_ledger_max_subtree_depth = 8 | ||
|
||
let sync_ledger_default_subtree_depth = 6 | ||
end) | ||
|
||
module Context_subtree_depth88 = Make_context (struct | ||
let sync_ledger_max_subtree_depth = 8 | ||
|
||
let sync_ledger_default_subtree_depth = 8 | ||
end) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like there should also be a test for sync_ledger_max_subtree_depth < sync_ledger_default_subtree_depth
.
@@ -44,15 +84,33 @@ module Answer = struct | |||
|
|||
let to_latest acct_to_latest = function | |||
| Child_hashes_are (h1, h2) -> | |||
Child_hashes_are (h1, h2) | |||
V2.Child_hashes_are (List.to_array [ h1; h2 ]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
V2.Child_hashes_are (List.to_array [ h1; h2 ]) | |
V2.Child_hashes_are [| h1; h2 |] |
let from_v2 : ('a, 'b) V2.t -> ('a, 'b) t = function | ||
| Child_hashes_are h -> | ||
if Array.length h = 2 then Child_hashes_are (h.(0), h.(1)) | ||
else failwith "can't downgrade wide query" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This really should return a Result
IMO.
in | ||
let left = intermediate_range ledger_depth left (i - 1) in | ||
let right = intermediate_range ledger_depth right (i - 1) in | ||
left @ right |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing this list concatenation results in n^2
allocations. We also immediately convert it to an array immediately later. I'd recommend using an Array.init
and appending the index, something like this.
match subtree_depth with | ||
| n | ||
when n >= 1 | ||
&& n <= Context.compile_config.sync_ledger_max_subtree_depth |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't a protocol violation. We should just take the min
of the configured max and n
.
let less_than_max = | ||
len <= Int.pow 2 Context.compile_config.sync_ledger_max_subtree_depth |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't care if it's less than our max. The only relevant maximum is the sending node's maximum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking the receiving node wouldn't want to process some arbitrarily long list of hashes. Otherwise what is the need for specifying depth in the query if receiver accepts any depth?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other test above checks that it's less than what we requested. The max is just what we're locally happy to respond with, unrelated to the responses we receive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait, I added the requested depth check and this comment is for max. Sorry, you're right, we don't need the max
let (module Context) = t.context in | ||
let open Context in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NB: you can do let open (val t.context)
.
!ci-build-me |
!ci-nightly-me |
return (Or_error.error_string "No Frontier") | ||
in | ||
let result = | ||
Result.of_option answer | ||
~error: | ||
(Error.createf | ||
!"Refusing to answer sync ledger query for ledger_hash: \ | ||
%{sexp:Ledger_hash.t}" | ||
ledger_hash ) | ||
Result.map_error answer ~f:(fun e -> | ||
Error.createf | ||
!"Refusing to answer sync ledger query for ledger_hash: \ | ||
%{sexp:Ledger_hash.t}. Error: %s" | ||
ledger_hash (Error.to_string_hum e) ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is confusing to me: why are we providing an error message No frontier
in answer
to just immediately replace it with map_error
?
Relatedly, we then match on result
below, so we will record Trust_system.Actions.Requested_unknown_item
even if it's our fault for not having a frontier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This behaviour is unchanged from before, so I won't block on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm only propagating the actual error to test error cases.
we will record Trust_system.Actions.Requested_unknown_item even if it's our fault for not having a frontier.
This seems to be the behaviour for other messages too. Opened a ticket #16383
let sync_ledger_max_subtree_depth = 8 | ||
|
||
let sync_ledger_default_subtree_depth = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is a strange one: shouldn't we disallow default_subtree_depth = 0
? If we request depth 0
, I would understand that as meaning that we expect to get back [|hash|]
when we request the depth-0 children of hash
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is disallowed. It was an easy way to test the case. I could add a top level assertion for this in node_config and runtime config so that nodes aren't configured with 0 sub tree depth
match check_answer query answ_or_error with | ||
| `Answer answ -> | ||
let%bind () = | ||
if match query with What_contents _ -> true | _ -> false then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: could use the match here directly.
(* TODO: parameterize *) | ||
let subtree_depth : index = 4 | ||
|
||
(* Provides addresses at an specific depth from this address *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per this comment, nit:
(* Provides addresses at an specific depth from this address *) | |
(* Provides addresses at a specific depth from this address *) |
let open (val t.context) in | ||
let len = Array.length nodes in | ||
let is_power = Int.is_pow2 len in | ||
let is_more_than_two = len >= 2 in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: is_more_than_two
is misleading, 2
is not more than 2, but gives is_more_than_two = true
.
let is_power = Int.is_pow2 len in | ||
let is_more_than_two = len >= 2 in | ||
let subtree_depth = Int.ceil_log2 len in | ||
let less_than_requested = subtree_depth <= requested_depth in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: similarly less_than_requested
is not quite accurate.
Explain your changes:
Adding support for wide merkle queries that request more than 2 nodes during sync.
Explain how you tested your changes:
Ran V2 node, for this V1 -> V2 case I started the V2, got it to connect to devnet, and then started the V1 with the running V2 as only peer.
Checklist: