Erlay meta-issue: mainnet testing #8

naumenkogs · 2021-08-19T17:53:24Z

Based on the experimental results from #7, I suggest using the following patch for testing. I currently run 12 Erlay-supporting nodes with that patch.

To test Erlay, you can connect to my nodes with the following bitcoind CLI options: -maxconnections=0 -addnode=143.198.185.21:8201 -addnode=143.198.185.21:8202 -addnode=143.198.185.21:8203 -addnode=143.198.185.21:8204 -addnode=143.198.185.21:8205 -addnode=143.198.185.21:8206 -addnode=143.198.185.21:8207 -addnode=143.198.185.21:8208. I might restart those nodes from time to time, so expect occasional disconnects.

My nodes are currently pruned, so please sync from the network first, and restart with the command above only when you're at the tip.

Then you can use a command similar to mine to see your own node's bandwidth: bitcoin-0.21.1/bin/bitcoin-cli -rpcport=8109 getpeerinfo | grep 'inv\|sketch\|reqrecon\|reqsketchext\|reconcildiff' | awk '/[0-9]+/ {gsub(/[^0-9]/, "", $0); sum+=$0} END {print sum}'.
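For readers who prefer to work from the JSON rather than grep, the same sum can be sketched in Python. This is a minimal sketch, not the author's tooling: it assumes the `bytessent_per_msg` and `bytesrecv_per_msg` objects of `getpeerinfo`, matches message types by exact name (the grep above matches substrings), and the sample data is made up for illustration.

```python
import json

# Message types targeted by the grep pattern in the command above.
# Matching by exact name is an assumption of this sketch.
RELAY_MSG_TYPES = ("inv", "sketch", "reqrecon", "reqsketchext", "reconcildiff")


def sum_relay_bytes(peers, msg_types=RELAY_MSG_TYPES):
    """Sum sent and received bytes for the given message types over all peers."""
    total = 0
    for peer in peers:
        for field in ("bytessent_per_msg", "bytesrecv_per_msg"):
            per_msg = peer.get(field, {})
            total += sum(per_msg.get(m, 0) for m in msg_types)
    return total


# Made-up sample in the shape of `getpeerinfo` output (illustrative values only).
SAMPLE = json.loads("""
[
  {"bytessent_per_msg": {"inv": 1000, "reqrecon": 200, "tx": 5000},
   "bytesrecv_per_msg": {"inv": 3000, "sketch": 400}},
  {"bytessent_per_msg": {"reconcildiff": 50},
   "bytesrecv_per_msg": {"sketch": 600, "reqsketchext": 10}}
]
""")

print(sum_relay_bytes(SAMPLE))  # 1000+200+3000+400+50+600+10 = 5260
```

To run it on real data, save the output of `bitcoin-cli getpeerinfo` to a file and pass the result of `json.load` to `sum_relay_bytes`.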

You can compare this number to that of a regular pre-Erlay Bitcoin node of the latest release, running in parallel to the Erlay node.
The expected result is around 30-55% saving, or 15-30% of bandwidth saved overall.


0xB10C · 2021-08-19T21:41:31Z

I think you meant to link to the 2021-03-erlay branch, right?

0xB10C · 2021-08-22T11:42:18Z

I've been observing and comparing an erlay-node (naumenkogs/bitcoin@8e7033d) and a master-node (bitcoin/bitcoin@38975ec) for about 48h now.

Both nodes have 8 manual outgoing connections to 143.198.185.21:8201-8208 via addnode and don't connect to other nodes (connect=0) and do not accept incoming connections. To monitor p2p traffic, I'm hooking into the net tracepoints inbound_message and outbound_message with my bitcoind-observer project.

The erlay-node had slightly less (~3%) inbound traffic (as in received message size, not connection direction) compared to the master-node. The erlay-node had 61.1% less outbound traffic (again sent message size, not connection direction) compared to the master-node. Considering both inbound and outbound traffic, there is a ~20% traffic reduction.
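The combined figure is a volume-weighted average of the two directions, so the three percentages together imply how the master-node's traffic splits between inbound and outbound. The sketch below works only from the rounded percentages quoted above; the resulting share is an inference, not a measured value, and the function names are made up for illustration.

```python
def combined_reduction(inbound_red, outbound_red, outbound_share):
    """Overall reduction when `outbound_share` of the baseline bytes are outbound."""
    return inbound_red * (1 - outbound_share) + outbound_red * outbound_share


def implied_outbound_share(inbound_red, outbound_red, total_red):
    """Solve combined_reduction(...) == total_red for outbound_share."""
    return (total_red - inbound_red) / (outbound_red - inbound_red)


# Rounded figures from the comment: ~3% inbound, 61.1% outbound, ~20% overall.
share = implied_outbound_share(0.03, 0.611, 0.20)
print(round(share, 2))  # 0.29
```

Under these rounded inputs, outbound would be roughly 29% of the master-node's traffic, which is why a 61.1% outbound saving turns into about 20% overall.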

The erlay-node received (+38%) and sent (+46%) more messages than the master-node. Considering both inbound and outbound messages, about 42% more messages.

~~Both in- and outbound INV messages are more frequent with erlay compared to master.~~ Inbound INV messages are more frequent with erlay compared to master, but outbound INV messages are /less/ frequent. (thanks @Rspigler)

However, erlay INV messages are on average smaller than master INV messages.

The erlay node receives about 60 sketch messages per minute or one message per second. With 8 erlay peers I'd have expected 8 sketch messages per second or 480 per minute? (I could totally be missing something, still reading up on Erlay).
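One way to read the observed rate: later in this thread the reconciliation frequency is given as 8s for each peer, so 8 peers would yield about one reconciliation per second in total rather than eight. The sketch below is a back-of-the-envelope estimate, assuming one sketch message per reconciliation round and ignoring sketch extensions; the function name is made up for illustration.

```python
def reconciliations_per_minute(num_peers, interval_per_peer_s):
    """Expected reconciliation rounds per minute across all peers."""
    return 60 * num_peers / interval_per_peer_s


print(reconciliations_per_minute(8, 8))    # 60.0
print(reconciliations_per_minute(12, 8))   # 90.0
print(reconciliations_per_minute(12, 12))  # 60.0
```

With 8 peers at 8s each this gives 60 per minute, which matches the observed sketch, reqrecon and reconcildiff rates.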

The erlay node sends 60 reconcildiff and reqrecon messages per minute. reqsketchext messages are infrequent (The 0 msg/min as shown below is only for the past minute at the time of taking the screenshot).

The dashboard can be found here: https://bitcoind.observer/d/T7FkHfnnk/erlay-node-vs-master-node

I'd be happy to add more stats and will probably dive deeper in the future. Having a metric for INVs per TX would be good from what I understand.

Rspigler · 2021-08-23T00:41:13Z

> Both in- and outbound INV messages are more frequent with erlay compared to master.

According to your posted graph, inbound INV messages are more frequent, but outbound INV messages are /less/ frequent.

michaelfolkson · 2021-08-24T08:05:27Z

@0xB10C: Wow, nice work and cool use of tracepoints. So there appears to be quite some variability on bandwidth consumption reduction. I wonder why that is. Is it due to the trade-off tweaking? The original paper quoted a 40 percent figure. One next step would be to test out increasing the number of connections and see if bandwidth stays approximately constant which was the other objective from the paper.

naumenkogs · 2021-08-26T09:49:39Z

@michaelfolkson yeah I think that #7 perfectly describes why we don't see 40% here: we had to make protocol changes since the paper :) The original idea was a bit too optimistic.

W.r.t extra connections, good suggestion to test. To test with 12 connections, one has to make 2 changes:

1. Recompile Core with the following change in net.h: static const int MAX_OUTBOUND_FULL_RELAY_CONNECTIONS = 12; (I don't see an easier way to make 12 outbound manual connections).
2. Extend the bitcoind start command with -addnode=143.198.185.21:8210 -addnode=143.198.185.21:8211 -addnode=143.198.185.21:8212 -addnode=143.198.185.21:8213.

0xB10C · 2021-08-31T18:09:03Z

Some more stats for the last 7 days (we reset the stats on 2021-08-24 21:30 UTC).

This time without much commentary as the observations are similar to #8 (comment)

I've reset the stats again and now run both nodes with 12 full-relay outbound connections to @naumenkogs's erlay nodes. I'm using ports 8201 through 8213, excluding 8209, and this patch:

--- a/src/net.h
+++ b/src/net.h
@@ -59,9 +59,9 @@ static const unsigned int MAX_PROTOCOL_MESSAGE_LENGTH = 4 * 1000 * 1000;
 /** Maximum length of the user agent string in `version` message */
 static const unsigned int MAX_SUBVERSION_LENGTH = 256;
 /** Maximum number of automatic outgoing nodes over which we'll relay everything (blocks, tx, addrs, etc) */
-static const int MAX_OUTBOUND_FULL_RELAY_CONNECTIONS = 8;
+static const int MAX_OUTBOUND_FULL_RELAY_CONNECTIONS = 12;
 /** Maximum number of addnode outgoing nodes */
-static const int MAX_ADDNODE_CONNECTIONS = 8;
+static const int MAX_ADDNODE_CONNECTIONS = 12;
 /** Maximum number of block-relay-only outgoing connections */
 static const int MAX_BLOCK_RELAY_ONLY_CONNECTIONS = 2;
 /** Maximum number of feeler connections */

naumenkogs · 2021-09-01T15:38:01Z

Yeah so it seems like the savings went from 20% for 8-conns to 30% for 12-conns. This is good, but Erlay actually does better than that.

The thing is that making 12 conns while keeping the in/out % of flooding, and reconciliation frequency (8s for each peer) the same doesn't make sense I think. It reduces latency, but we never asked for this.

To keep the same latency as 8-conn erlay and get better performance, we need to reduce the flooding % and increase the reconciliation interval (8s to 12s in the patch below).
This is also reflected in #7 for 12-conns.
So the patch should be expanded:

root@ubuntu-s-4vcpu-8gb-nyc1-01:~/bitcoin# git diff
diff --git a/src/txreconciliation.cpp b/src/txreconciliation.cpp
index 85c22c0ee..c02852b78 100644
--- a/src/txreconciliation.cpp
+++ b/src/txreconciliation.cpp
@@ -16,8 +16,8 @@ constexpr uint32_t RECON_VERSION = 1;
 /** Static component of the salt used to compute short txids for inclusion in sketches. */
 const std::string RECON_STATIC_SALT = "Tx Relay Salting";
 /** Announce transactions via full wtxid to a limited number of inbound and outbound peers. */
-constexpr double INBOUND_FANOUT_DESTINATIONS_PERCENT = 0.1;
-constexpr double OUTBOUND_FANOUT_DESTINATIONS_PERCENT = 0.25;
+constexpr double INBOUND_FANOUT_DESTINATIONS_PERCENT = 0.02;
+constexpr double OUTBOUND_FANOUT_DESTINATIONS_PERCENT = 0.05;
 /** The size of the field, used to compute sketches to reconcile transactions (see BIP-330). */
 constexpr unsigned int RECON_FIELD_SIZE = 32;
 /**
@@ -53,7 +53,7 @@ constexpr uint16_t Q_PRECISION{(2 << 14) - 1};
  * due to reconciliation metadata (sketch sizes etc.), which would nullify the efficiency.
  * Less frequent reconciliations would introduce high transaction relay latency.
  */
-constexpr std::chrono::microseconds RECON_REQUEST_INTERVAL{8s};
+constexpr std::chrono::microseconds RECON_REQUEST_INTERVAL{12s};
 /**
  * We should keep an interval between responding to reconciliation requests from the same peer,
  * to reduce potential DoS surface.

This patch should be applied on all involved nodes. I'm doing so on my 12 nodes, @0xB10C could you do the same on your side and re-start measuring bandwidth?

Note that doing so from my side makes the whole setting suited for 12-conns. 8-conn comparison experiments become less fair.

0xB10C · 2021-09-01T16:52:05Z

To archive this: Bandwidth and messages after 24h without the patch mentioned in #8 (comment)

Updated my erlay node to use the following patch. Master node still uses this patch #8 (comment).

diff --git a/src/net.h b/src/net.h
index 12d282b85..ae66a4426 100644
--- a/src/net.h
+++ b/src/net.h
@@ -59,9 +59,9 @@ static const unsigned int MAX_PROTOCOL_MESSAGE_LENGTH = 4 * 1000 * 1000;
 /** Maximum length of the user agent string in `version` message */
 static const unsigned int MAX_SUBVERSION_LENGTH = 256;
 /** Maximum number of automatic outgoing nodes over which we'll relay everything (blocks, tx, addrs, etc) */
-static const int MAX_OUTBOUND_FULL_RELAY_CONNECTIONS = 8;
+static const int MAX_OUTBOUND_FULL_RELAY_CONNECTIONS = 12;
 /** Maximum number of addnode outgoing nodes */
-static const int MAX_ADDNODE_CONNECTIONS = 8;
+static const int MAX_ADDNODE_CONNECTIONS = 12;
 /** Maximum number of block-relay-only outgoing connections */
 static const int MAX_BLOCK_RELAY_ONLY_CONNECTIONS = 2;
 /** Maximum number of feeler connections */
diff --git a/src/txreconciliation.cpp b/src/txreconciliation.cpp
index 00e220ecf..0937e2bc4 100644
--- a/src/txreconciliation.cpp
+++ b/src/txreconciliation.cpp
@@ -16,8 +16,8 @@ constexpr uint32_t RECON_VERSION = 1;
 /** Static component of the salt used to compute short txids for inclusion in sketches. */
 const std::string RECON_STATIC_SALT = "Tx Relay Salting";
 /** Announce transactions via full wtxid to a limited number of inbound and outbound peers. */
-constexpr double INBOUND_FANOUT_DESTINATIONS_FRACTION = 0.1;
-constexpr double OUTBOUND_FANOUT_DESTINATIONS_FRACTION = 0.1;
+constexpr double INBOUND_FANOUT_DESTINATIONS_FRACTION = 0.02;
+constexpr double OUTBOUND_FANOUT_DESTINATIONS_FRACTION = 0.05;
 /** The size of the field, used to compute sketches to reconcile transactions (see BIP-330). */
 constexpr unsigned int RECON_FIELD_SIZE = 32;
 /**
@@ -53,7 +53,7 @@ constexpr uint16_t Q_PRECISION{(2 << 14) - 1};
  * due to reconciliation metadata (sketch sizes etc.), which would nullify the efficiency.
  * Less frequent reconciliations would introduce high transaction relay latency.
  */
-constexpr std::chrono::microseconds RECON_REQUEST_INTERVAL{8s};
+constexpr std::chrono::microseconds RECON_REQUEST_INTERVAL{12s};
 /**
  * We should keep an interval between responding to reconciliation requests from the same peer,
  * to reduce potential DoS surface.

naumenkogs · 2021-09-16T10:19:21Z

According to my estimates in #7, we have Z=5/12=0.41, which yields ~67% overall bandwidth savings for the entire tx relay. This is in line with the observations from @0xB10C.

0xB10C · 2021-09-16T12:20:05Z

> According to my estimates in #7, we have Z=5/12=0.41, which yields ~67% overall bandwidth savings for the entire tx relay. This is in line with the observations from @0xB10C.

Measurements between 2021-09-01 and 2021-09-13 as screenshots:

naumenkogs · 2022-01-13T08:34:38Z

My nodes are currently pruned, so please sync from the network first, and restart with the command above only when you're at the tip.

kcalvinalvin · 2022-01-19T04:40:04Z

I've set up two separate machines at home for testing. One is running the 2021-03-erlay branch and the other is running Bitcoin Core v22.0.0. The bitcoin.conf settings for both are as below, with RPC settings added on. I don't think it matters, but one has prune=550 and the other doesn't.

maxconnections=0
addnode=143.198.185.21:8201
addnode=143.198.185.21:8202
addnode=143.198.185.21:8203
addnode=143.198.185.21:8204
addnode=143.198.185.21:8205
addnode=143.198.185.21:8206
addnode=143.198.185.21:8207
addnode=143.198.185.21:8208

After ~40 hours, I ran the command bitcoin-0.21.1/bin/bitcoin-cli -rpcport=8109 getpeerinfo | grep 'inv\|sketch\|reqrecon\|reqsketchext\|reconcildiff' | awk '/[0-9]+/ {gsub(/[^0-9]/, "", $0); sum+=$0} END {print sum}' on both nodes.

The Erlay node is returning 92006977 and Bitcoin Core v22.0.0 is returning 153035821. It seems the Bitcoin Core node is using ~66.3% more bandwidth.
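The same two byte counts can be expressed either as "Core uses X% more" or as "Erlay saves Y%", and the two percentages differ because they use different baselines. A small sketch of both conversions, using the numbers reported above (function names are made up for illustration):

```python
def percent_more(larger, smaller):
    """How much more `larger` is than `smaller`, in percent of `smaller`."""
    return (larger / smaller - 1) * 100


def percent_saved(baseline, reduced):
    """How much `reduced` saves relative to `baseline`, in percent of `baseline`."""
    return (1 - reduced / baseline) * 100


erlay_bytes = 92006977   # Erlay node, as reported above
core_bytes = 153035821   # Bitcoin Core v22.0.0 node, as reported above

print(round(percent_more(core_bytes, erlay_bytes), 1))   # 66.3
print(round(percent_saved(core_bytes, erlay_bytes), 1))  # 39.9
```

So "~66.3% more bandwidth for Core" corresponds to roughly a 39.9% saving for the Erlay node on the message types counted by the command.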

hebasto · 2022-01-19T10:43:15Z

Tested bitcoin/bitcoin#21515 (i.e. 2021-03-erlay) on commit 5728fac4d3d29f64ea811f5978e80dabdc083d87 in comparison with the master branch on bitcoin/bitcoin@d94dc69.

The same patchset which collects and presents data has been applied to both branches. The actually tested branches are: BASE and TEST.

Two nodes, BASE and TEST, were running simultaneously for 89 hours with 8 addnoded connections to Erlay peers only.

"Tx" statistics relate to TX network messages.
"Erlay" statistics relate to the newly introduced Erlay protocol network messages.

Results look stable: the relative figures do not change significantly over time.

| Data | BASE | TEST | Diff (TEST as % of BASE) |
|---|---|---|---|
| Total Received | 746 MB | 734 MB | 98.4% |
| Total Sent | 258 MB | 131 MB | 50.8% |
| Total R+S | 1004 MB | 865 MB | 86.2% |

~~UPDATE: Reasoning about numbers makes me think that we can increase MAX_OUTBOUND_FULL_RELAY_CONNECTIONS by one only with the current (tested) Erlay settings.~~ -- See the following comment :)

hebasto · 2022-01-21T20:25:33Z

Another test: BASE (8 addnoded Erlay peers) vs TEST (12 addnoded Erlay peers), ~48 hours.

| Data | BASE | TEST | Diff (TEST as % of BASE) |
|---|---|---|---|
| Total Received | 459 MB | 463 MB | 100.9% |
| Total Sent | 153 MB | 96 MB | 62.7% |
| Total R+S | 612 MB | 559 MB | 91.3% |
| Peers | 8 | 12 | 150.0% |

This was referenced Aug 19, 2021

Erlay: bandwidth-efficient transaction relay protocol bitcoin/bitcoin#21515

Closed

Erlay meta-issue: understanding protocol performance #7

Open

naumenkogs mentioned this issue Oct 13, 2023

Erlay Project Tracking bitcoin/bitcoin#28646

Closed

17 tasks

sr-gi mentioned this issue Jun 7, 2024

Erlay Project Tracking bitcoin/bitcoin#30249

Open

17 tasks
