Erlay meta-issue: gradual deployment #11
Comments
These results should be confirmed by running a real node, too: run an Erlay node with 7 legacy peers and 1 Erlay peer, with the given delays (and a 4/4 split too), then compare it to a legacy node and compare the bandwidth (split between INV and TX). The relevant Bitcoin Core fields are. The results won't directly reflect the simulation results above, because we won't see any reduction on a legacy node (that happens only at scale). However, it would be cool to confirm that Erlay does well while some of its peers are legacy.
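A minimal sketch of how the INV/TX bandwidth split could be measured on a running node, assuming `bitcoin-cli` RPC access; it only totals bandwidth by message type across all peers (the per-peer fields for distinguishing Erlay vs legacy peers are left unspecified above):

```python
import json
import subprocess

# Sum INV vs TX bandwidth (sent + received) across all peers of a local node.
# Assumes `bitcoin-cli` is on PATH and the node's RPC is reachable.
# `bytessent_per_msg` / `bytesrecv_per_msg` are per-message-type byte counters
# reported by `getpeerinfo`.
peers = json.loads(subprocess.check_output(["bitcoin-cli", "getpeerinfo"]))

totals = {"inv": 0, "tx": 0}
for peer in peers:
    for direction in ("bytessent_per_msg", "bytesrecv_per_msg"):
        for msg, nbytes in peer.get(direction, {}).items():
            if msg in totals:
                totals[msg] += nbytes

print(f"INV bytes: {totals['inv']}, TX bytes: {totals['tx']}")
```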
Thanks for putting this together. I was trying to reason about what percentage of the network we could reasonably expect to upgrade quickly. It looks like >50% of the network had upgraded 6 months after v0.21 was released, but taproot probably had something to do with it...
Just to clarify: would this mean that, as we increase the number of default outbound connections on a node, we would only allow reconciling peers for those extra connections?
Yeah, possibly. I would expect 25% in a year, and 50% in 2 years. And that's after we make it enabled by default (which is probably not right away).
I didn't mean that, but I think a useful general strategy would be: of all outbounds, no more than 8 should be legacy. Something like that. The rest is implementation details.
Another clarification: does this mean that, for every tx, the ratio of INV messages sent by legacy nodes vs Erlay nodes is 1.2 to 0.8, i.e. 3 to 2? So legacy nodes are sending 50% more messages than Erlay nodes?
No, this is specifically about tx relaying work. Legacy nodes send 1.5x the TX bandwidth (they just take the work off of Erlay nodes). That can be minimized by reducing
Erlay gradual deployment
We cannot expect everyone to enable Erlay at once. Furthermore, it will probably take a couple of years before even half of the network enables it, just based on how fast users update their nodes.
That’s why we need to understand the impact of it at different scales of deployment, and potentially tune parameters for the best outcomes.
I define the following configurations.
C10. ~10% deployment
C25. ~25% deployment
C50. ~50% deployment
C75. ~75% deployment
C100. ~100% deployment
I think these configurations can be roughly followed on new releases. Say, if we see a change from 10% to 25% in 2023, then we update the config for new nodes according to the suggestions below.
To modify this value, update `init.2.reconcile_percent` in the config. The configurations will affect only the in/out average delay before flooding a tx to a peer, or before adding it to the reconciliation set. These delays are used to obfuscate the transaction origin from timing analysis.
The relevant simulator fields are `in_relay_delay_recon_peer`/`out_relay_delay_recon_peer`. These fields apply only to reconciliation-enabled nodes, and set the delay they choose for both the reconciling and the legacy peers they have. For legacy nodes, the delays are always 5/2, as we're not planning to change that.
The full config (in which only these few fields will be modified based on the configuration) can be found here.
To avoid imbalances (bandwidth, relay speed, etc.), the configs (mainly just the relay delays) should not differ much across phases. (We could also have a node switch to different config values once it locally reaches a certain % of reconciling connections, to make things even more balanced.)
Note: for these experiments/settings I use 5 for both in_flood_peers_percent/out_flood_peers_percent. I think this is what we would do in the real network too: as long as at least 25% of nodes are legacy, they would sustain low latency (and this would reduce overall bandwidth slightly).
C10
For anything <= 10%, there will be very few Erlay connections in the network (every Erlay-enabled node statistically will have at most 1), so it’s hard to expect any real savings.
It’s possible to still get real gains if a node manually restricts itself from connecting to non-erlay nodes (e.g., via -connect CLI).
I suggest flood delays of 5s/2s for reconciling nodes.
This gives a small gain on Erlay nodes (7.65 INV per tx), and not much effect beyond that.
Even though there is no real effect here, these nodes will be ready to participate in future reconciliations, which is valuable.
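As a rough illustration of what a C10-style phase could look like, here is a sketch using the simulator field names mentioned above; the exact config format and surrounding structure are assumptions:

```python
# Hypothetical sketch of a C10-style phase (not the real config syntax).
c10 = {
    "reconcile_percent": 10,          # init.2.reconcile_percent: ~10% of nodes run Erlay
    "in_relay_delay_recon_peer": 5,   # 5s/2s flood delays for reconciling nodes,
    "out_relay_delay_recon_peer": 2,  # assuming the in/out ordering used above
    "in_flood_peers_percent": 5,      # kept at 5 for all experiments in this post
    "out_flood_peers_percent": 5,
}
# Legacy nodes keep the fixed 5/2 delays regardless of phase.
```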
C25
For ~25%, it's possible to start getting a bit more gain on both kinds of nodes.
The most balanced configuration (7.89 INV on legacy and 7.49 INV on erlay) happens with the following config:
C50
Even more gains come here: 7.39 INV on legacy and 5.98 INV on erlay nodes, with the following config:
An interesting alternative: we can get 6.68 INV and 6.51 INV with the following configurations:
C75
Gains: 6.29 INV on legacy and 4.29 on erlay nodes with:
Latency
In 100% Erlay, we decided to cap flooding at ~10%, since that would provide a tolerable latency increase (from 3.5 to 6s), given other parameters.
All these configurations provide a lower latency than 100% erlay.
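For quick reference, here are the INV-per-tx figures reported above for each phase, collected in one place (a plain summary, not a config):

```python
# INV messages per tx reported above, as (legacy, erlay); None = not reported.
# "C50_alt" is the more balanced alternative C50 configuration; the ordering of
# the two C50_alt figures is assumed to match the other phases (legacy, erlay).
inv_per_tx = {
    "C10":     (None, 7.65),
    "C25":     (7.89, 7.49),
    "C50":     (7.39, 5.98),
    "C50_alt": (6.68, 6.51),
    "C75":     (6.29, 4.29),
}
```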
Connectivity increase
One of the benefits of Erlay is to allow increased connectivity at almost no bandwidth cost.
Say, those 50% erlay nodes with the given config were to use 12 outbound connections instead of 8.
To play with this, update `out_peers_recon`, and potentially `reconciliation_interval` too. My preliminary experiments show that the bandwidth goes up from 5.98 INV to 7.5, while also slightly increasing the bandwidth of legacy nodes from 7.39 INV to 7.51.
It seems like this is what’s happening: since 50% of nodes are legacy, 50% of those extra conns would be legacy. Given that connectivity is increased on half of the nodes, on average this should result in 2 extra connections per node in the network. In legacy flooding, this would mean 2 extra INV per tx.
In our half-erlay setting, however, this is just (1.52 + 0.12) / 2 = 0.82 extra INV per tx. Or, for Erlay nodes, it’s 1.52 extra instead of, potentially, 4.
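A quick arithmetic check of those figures:

```python
# Back-of-the-envelope check of the connectivity-increase numbers above.
erlay_extra  = 7.50 - 5.98            # extra INV/tx on Erlay nodes (8 -> 12 outbounds)
legacy_extra = 7.51 - 7.39            # extra INV/tx on legacy nodes
network_avg  = (erlay_extra + legacy_extra) / 2
print(round(network_avg, 2))          # 0.82, vs ~2 extra INV/tx with pure legacy flooding
```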
A smarter thing would probably be to make those new connections Erlay-only. However, I’m unsure whether it’s good for the topology to group nodes like that.
TX work
Another goal while picking the Erlay configuration was to avoid imbalance of the workload across nodes.
Across the reachable/private split, the workload distribution remains the same.
Across the Erlay/legacy split, legacy nodes take slightly more workload: e.g., in the 50/50 case, they take 1.2 against 0.8 tx messages per tx, across the entire network.
I don’t think this is a big deal: 1) legacy nodes will get (both in/out) INV traffic reduction just from the existence of erlay nodes; 2) the distribution of workload probably already varies a lot, based on the node connectivity, etc.
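To tie this back to the earlier comment thread, the 50/50 workload split works out to:

```python
# 50/50 case: tx messages sent per tx, network-wide, by node type (figures above).
legacy_work = 1.2
erlay_work  = 0.8
print(round(legacy_work / erlay_work, 2))  # 1.5: legacy nodes do ~50% more tx-relay work
```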