Onion messages: add some initial rate limiting #1604
Conversation
Force-pushed from 4036693 to d5fbc04.
Codecov Report
@@ Coverage Diff @@
## main #1604 +/- ##
==========================================
- Coverage 90.90% 90.80% -0.11%
==========================================
Files 85 85
Lines 45888 45813 -75
==========================================
- Hits 41715 41601 -114
- Misses 4173 4212 +39
lightning/src/ln/peer_handler.rs
Outdated
@@ -1637,6 +1690,24 @@ impl<Descriptor: SocketDescriptor, CM: Deref, RM: Deref, L: Deref, CMH: Deref> P
for (descriptor, peer_mutex) in peers.iter() {
	self.do_attempt_write_data(&mut (*descriptor).clone(), &mut *peer_mutex.lock().unwrap());

// Only see if we have room for onion messages after we've written all channel messages, to
// ensure they take priority.
Of course, I agree that channel messages have a higher priority than onion messages, but I'm not sure the same holds if we consider the subset of gossip messages. My reasoning: onion message paths are likely to be correlated with payment paths, so a routing hop has an incentive to speed up that onion traffic, since it may lead to future routing fees. Gossip, by contrast, mostly consumes local host resources, with only loose incentives for the hop (e.g. the gossip announces a channel update in some other part of the graph). Just a high-level idea; I'm not sure it's relevant to introduce that level of fine-grained message-priority classes now.
Reply paths are still not added here, correct?
Correct, I might actually add them in a separate PR, since we tend to be friendlier to merging a lot of small PRs.
Force-pushed from 7fe75d4 to 92a818f.
Gonna go ahead and mark this as ready for review and address #1503 (comment) in a dependent follow-up.
PeerManager
Largely, this adds the boilerplate needed for PeerManager and OnionMessenger to work together on sending and receiving and replaces the stopgaps added in lightningdevkit#1604.
Force-pushed from 92a818f to e9b3293.
Force-pushed from e9b3293 to 0b4056f.
Force-pushed from 5a0d6d0 to 5cd92f2.
LGTM, basically.
lightning/src/ln/peer_handler.rs
Outdated
@@ -303,6 +319,10 @@ const OUTBOUND_BUFFER_LIMIT_READ_PAUSE: usize = 10;
/// the peer.
const OUTBOUND_BUFFER_LIMIT_DROP_GOSSIP: usize = OUTBOUND_BUFFER_LIMIT_READ_PAUSE * FORWARD_INIT_SYNC_BUFFER_LIMIT_RATIO;

/// When the outbound buffer has this many messages, we won't poll for new onion messages for this
/// peer.
const OUTBOUND_BUFFER_LIMIT_PAUSE_OMS: usize = 16;
Ooookayyyy, let's see, now that we aren't competing with gossip backfill, let's drop this to 8 - 512KiB per peer was always a bit high. That also means we won't be constantly pausing reading just because we're stuck forwarding some big onion messages. IMO we should also bump the read pause limit to 12 or even 16 to make sure we can still get channel messages out after enqueuing some big gossip messages.
Updated.
IMO we should also bump the read pause limit to 12 or even 16 to make sure we can still get channel messages out after enqueuing some big gossip messages.
Would you like me to add this in this PR? I'll go with 12
Yea let's just do it here. 12 sounds good.
Done in 8261522
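For reference, a rough sketch of where these limits might land after this thread (8 for the onion-message pause limit, 12 for the read-pause limit, as agreed above); the FORWARD_INIT_SYNC_BUFFER_LIMIT_RATIO value and doc comments here are illustrative, not the exact definitions in peer_handler.rs:

// Illustrative value only; the real ratio is defined elsewhere in peer_handler.rs.
const FORWARD_INIT_SYNC_BUFFER_LIMIT_RATIO: usize = 2;

/// When the outbound buffer has this many messages, we stop reading from the peer
/// (bumped from 10 so channel messages can still get out behind big gossip messages).
const OUTBOUND_BUFFER_LIMIT_READ_PAUSE: usize = 12;

/// When the outbound buffer has this many messages, we drop new gossip broadcasts for
/// the peer rather than buffering them.
const OUTBOUND_BUFFER_LIMIT_DROP_GOSSIP: usize =
	OUTBOUND_BUFFER_LIMIT_READ_PAUSE * FORWARD_INIT_SYNC_BUFFER_LIMIT_RATIO;

/// When the outbound buffer has this many messages, we won't poll for new onion
/// messages for this peer.
const OUTBOUND_BUFFER_LIMIT_PAUSE_OMS: usize = 8;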
Force-pushed from 0e2ee89 to 652d54a.
lightning/src/util/events.rs
Outdated
/// A trait indicating an object may generate onion message send events
pub trait OnionMessageProvider {
	/// Gets up to `max_messages` pending onion messages for the peer with the given node id.
	fn next_onion_messages_for_peer(&self, peer_node_id: PublicKey, max_messages: usize) -> Vec<msgs::OnionMessage>;
Maybe this could be called get_next_onion_messages_for_peer(), like we have get_and_clear_pending_msg_events()
I think we'd like to move away from this pattern, per rust conventions: https://github.com/rust-lang/rfcs/blob/master/text/0344-conventions-galore.md#gettersetter-apis
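Whatever the final name, a minimal sketch of an implementer of the trait as quoted above; the NoOpMessenger type is purely hypothetical, and the import paths assume the trait lands in lightning::util::events as shown in the diff:

use bitcoin::secp256k1::PublicKey;
use lightning::ln::msgs;
use lightning::util::events::OnionMessageProvider;

/// A hypothetical provider that never has onion messages to send, e.g. for a
/// node that does not relay or originate onion messages.
struct NoOpMessenger;

impl OnionMessageProvider for NoOpMessenger {
	fn next_onion_messages_for_peer(&self, _peer_node_id: PublicKey, _max_messages: usize) -> Vec<msgs::OnionMessage> {
		Vec::new()
	}
}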
lightning/src/ln/peer_handler.rs
Outdated
/// Returns the number of onion messages we can fit in this peer's buffer.
fn onion_message_buffer_slots_available(&self) -> usize {
	cmp::min(
Folks, have you thought about making those constants configurable via some PerfConfig, to adjust to the node operator's resource-consumption policy? Even if you have 1000 channels, 500 MiB of memory can be too much. Rather than adopting an approach where we assume Y MiB of per-outbound buffer for X channels, we could devolve the choice to the operator, e.g. a global budget of Z onion-message outbound buffer to allocate evenly or not (like first-come, first-served).
Further, I think we should also care about ensuring the channel is well-confirmed or comes from a trusted peer; otherwise I could send you a thousand OpenChannel messages and never broadcast the funding, and if we allocate onion-message outbound bandwidth on the fly, that could become a DoS vector. Same with drying up bandwidth for closed channels.
These could all be follow-ups.
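The helper quoted a few lines up is truncated; purely as a sketch of its plausible shape (MAX_ONION_MESSAGES_PER_FLUSH and the simplified Peer struct are hypothetical stand-ins, not the real implementation), it might compute something like:

use std::cmp;

const OUTBOUND_BUFFER_LIMIT_PAUSE_OMS: usize = 8;
// Hypothetical cap on how many onion messages we pull from the provider per flush.
const MAX_ONION_MESSAGES_PER_FLUSH: usize = 8;

struct Peer {
	// Simplified stand-in for the real peer's pending outbound buffer.
	pending_outbound_buffer: Vec<Vec<u8>>,
}

impl Peer {
	/// How many onion messages still fit before this peer's buffer hits the
	/// pause threshold, capped per flush.
	fn onion_message_buffer_slots_available(&self) -> usize {
		cmp::min(
			OUTBOUND_BUFFER_LIMIT_PAUSE_OMS.saturating_sub(self.pending_outbound_buffer.len()),
			MAX_ONION_MESSAGES_PER_FLUSH,
		)
	}
}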
Looks good, just minor comments, though see #1604 (comment)
Force-pushed from 746cfc9 to 44edece.
FYI, I had to rebase to fix CI.
lightning/src/ln/peer_handler.rs
Outdated
/// Determines if we should push additional messages onto a peer's outbound buffer for backfilling
/// onion messages and gossip data to the peer. This is checked every time the peer's buffer may
/// have been drained.
fn should_buffer_message_backfill(&self) -> bool { |
Hmmmmm right, now I remember why we were queueing onion messages rather than just draining as required - as currently written we let non-backfill gossip starve onion messages because that is buffered.
I think what we should do is (a) separate the should-gossip-backfill check from the should-om-send check, then (b) track how much of the queue is onion messages, and (c) always allow N onion messages in the queue (let's say 4?), reducing buffer_full_drop_gossip by the same N.
I think that's basically what we want anyway; we'd reserve N/buffer_full_drop_gossip of our outbound bandwidth for onion messages vs gossip (by message count; by actual bandwidth, probably more).
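A self-contained sketch of that reservation scheme, with illustrative names and limits rather than the actual peer_handler.rs code:

// The "N" suggested above: slots always kept free for onion messages.
const RESERVED_OM_SLOTS: usize = 4;
// Illustrative gossip-drop limit.
const OUTBOUND_BUFFER_LIMIT_DROP_GOSSIP: usize = 20;

struct OutboundBuffer {
	pending: Vec<Vec<u8>>,         // encoded messages awaiting write
	pending_onion_messages: usize, // how many of `pending` are onion messages
}

impl OutboundBuffer {
	/// Broadcast gossip is dropped a few slots early, so onion messages always
	/// have some room reserved for them.
	fn should_drop_gossip_broadcast(&self) -> bool {
		self.pending.len() >= OUTBOUND_BUFFER_LIMIT_DROP_GOSSIP - RESERVED_OM_SLOTS
	}

	/// Onion messages may still be enqueued while fewer than N of them are
	/// waiting, even if other (gossip) messages fill the rest of the buffer.
	fn may_enqueue_onion_message(&self) -> bool {
		self.pending_onion_messages < RESERVED_OM_SLOTS
	}
}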
Hmmmmm right, now I remember why we were queueing onion messages rather than just draining as required - as currently written we let non-backfill gossip starve onion messages because that is buffered.
I'm confused, currently we treat OMs ~the same as gossip backfill was treated prior to this change. Does this mean gossip backfilling does not currently work as intended in main?
No, it means gossip backfill will be starved by new gossip broadcasts, which makes sense - if we drop new gossip broadcasts we won't re-send them, but if we don't send backfill, we'll just send it later. For OMs, we don't want new gossip broadcasts to starve OMs, though we probably want some ratio between the two cause I'm not sure we want OMs to starve new gossip broadcasts either.
It looks like we only produce new gossip broadcasts when channels open+close, and on timer_tick. So FWIW new gossip broadcasts seem naturally somewhat rate limited, unless the user is specifically spamming the broadcast method.
In any case, am I correct that this means we're moving OM sending back to event processing? Because you'd want to only queue->write 1 OM at a time in do_attempt_write_data, iiuc.
No, we also produce new gossip broadcasts when we receive new gossip messages from peers - we want to forward those on to our peers.
In any case, am I correct that this means we're moving OM sending back to event processing? Because you'd want to only queue->write 1 OM at a time in do_attempt_write_data iiuc
No, I don't think we need to. The above suggestion allows us to keep it in the message-flushing logic; it just means we may enqueue an OM even if the buffer is not empty (which gets us a bit back to buffering in the PeerManager, but it doesn't really matter where we enqueue).
Of all the methods currently in Peer, this method should_buffer_message_backfill is the most deceptive. IIUC, it means something more like can_send_init_sync_gossip, should_send_init_sync_gossip, or should_buffer_init_sync_gossip:
"Determines if we should push additional gossip messages onto a peer's outbound buffer. This is checked every time the peer's buffer may have been drained. Note: this is only checked for gossip messages resulting from the initial sync. New gossip broadcasts are either sent immediately on receive or dropped (according to should_drop_forwarding_gossip)."
Please correct this if the understanding is wrong, or rename it if there is a better name. I also wanted to remove the references to backfill, as that was slightly confusing.
Consider renaming buffer_full_drop_gossip => should_drop_forwarding_gossip; that the buffer is full is an implementation detail.
/// messages). Note that some other lightning implementations time-out connections after some
/// time if no channel is built with the peer.
/// Constructs a new `PeerManager` with the given `RoutingMessageHandler`. No channel message
/// handler or onion message handler is used and onion and channel messages will be ignored (or |
Why would it "(or generate error messages)" if it's an IgnoringMessageHandler? From my brief glance at the code, it shouldn't ever throw errors.
The chan_handler is an ErroringMessageHandler below.
lightning/src/ln/peer_handler.rs
Outdated
	self.enqueue_message(peer, &msg);
	peer.sync_status = InitSyncTracker::NodesSyncing(msg.contents.node_id);
} else {
	peer.sync_status = InitSyncTracker::NoSyncRequested;
Shouldn't this case be unreachable?
I think since this, #1604 (comment) and #1604 (comment) are pre-existing, I'd rather not address them in this PR to keep it small in scope. Maybe we should open an issue for them?
Yes, definitely. If you provide your thoughts on those things, then I can go ahead and make those minor changes. :)
Sadly GitHub won't let me see what line this originally referred to. However, I think all the changes you suggested seem reasonable if you want to PR them, and we can discuss further on the PR :)
lightning/src/ln/peer_handler.rs
Outdated
	self.enqueue_message(peer, &msg);
	peer.sync_status = InitSyncTracker::NodesSyncing(msg.contents.node_id);
} else {
	peer.sync_status = InitSyncTracker::NoSyncRequested;
Consider renaming NoSyncRequested => NoSyncRequired / NoAdditionalSyncRequired.
We are currently using it for two purposes:
- No sync was requested
- The initial sync is complete, so no more sync is required
Yes, that seems reasonable to me
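For reference, a sketch of how the renamed enum might read; the variant payload types are inferred from the snippets quoted in this review rather than copied from peer_handler.rs:

use bitcoin::secp256k1::PublicKey;

/// Sketch of the suggested rename applied to the existing sync tracker.
enum InitSyncTracker {
	/// Either no initial sync was requested, or the requested sync has completed.
	NoSyncRequired,
	/// Syncing channel announcements, tracking the last short channel id sent.
	ChannelsSyncing(u64),
	/// Syncing node announcements, tracking the last node id sent.
	NodesSyncing(PublicKey),
}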
lightning/src/ln/peer_handler.rs
Outdated
} else {
	match peer.sync_status {
		InitSyncTracker::NoSyncRequested => {},
		InitSyncTracker::ChannelsSyncing(c) if c < 0xffff_ffff_ffff_ffff => {
Consider renaming c => scid_sync_progress / channel_sync_progress / scid_sync_tracker, or something better.
lightning/src/ln/peer_handler.rs
Outdated
let next_onion_message_opt = if let Some(peer_node_id) = peer.their_node_id {
	self.message_handler.onion_message_handler.next_onion_message_for_peer(peer_node_id)
} else { None };

// For now, let onion messages starve gossip.
if let Some(next_onion_message) = next_onion_message_opt {
	self.enqueue_message(peer, &next_onion_message);
} else {
	match peer.sync_status {
		InitSyncTracker::NoSyncRequested => {},
		InitSyncTracker::ChannelsSyncing(c) if c < 0xffff_ffff_ffff_ffff => {
Is there some way to abstract this into send_om and then send_init_sync_gossip (if no OM was sent)?
Let me know what you think of the latest version, sorry for all the churn
I think it looks much cleaner now :)
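For illustration, a self-contained sketch of the suggested split (send an onion message first, fall back to init-sync gossip only if none was sent), using stand-in types and names rather than the real peer_handler.rs ones:

// Stand-in types; the real ones live in peer_handler.rs and the message handlers.
struct OnionMessage;

trait OnionMessageHandler {
	fn next_onion_message_for_peer(&self, peer_node_id: [u8; 33]) -> Option<OnionMessage>;
}

struct Peer {
	their_node_id: Option<[u8; 33]>,
	outbound_buffer: Vec<Vec<u8>>, // stand-in for the encoded-message queue
}

/// Mirrors the flow quoted above: try an onion message first, then gossip backfill.
fn maybe_buffer_backfill<OMH: OnionMessageHandler>(handler: &OMH, peer: &mut Peer) {
	if !maybe_send_onion_message(handler, peer) {
		maybe_send_init_sync_gossip(peer);
	}
}

/// Returns true if an onion message was enqueued for this peer.
fn maybe_send_onion_message<OMH: OnionMessageHandler>(handler: &OMH, peer: &mut Peer) -> bool {
	let next_om = peer.their_node_id
		.and_then(|node_id| handler.next_onion_message_for_peer(node_id));
	if next_om.is_some() {
		// In the real code this would be `self.enqueue_message(peer, &next_onion_message)`.
		peer.outbound_buffer.push(Vec::new());
		true
	} else {
		false
	}
}

fn maybe_send_init_sync_gossip(peer: &mut Peer) {
	// Continue the initial gossip sync here (the `match peer.sync_status { ... }` above).
	let _ = peer;
}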
Force-pushed from 44edece to 6d08249.
Now based on #1683.
Force-pushed from 6d08249 to ba8dfad.
Thanks for tackling the surprising prereqs here!
LGTM modulo a few minor comments
Adds the boilerplate needed for PeerManager and OnionMessenger to work together, with some corresponding docs and misc updates mostly due to the PeerManager public API changing.
...to make sure we can still get channel messages out after enqueuing some big gossip messages.
In this commit, we check if a peer's outbound buffer has room for onion messages and, if so, pull them from an implementer of a new trait, OnionMessageProvider. Makes sure channel messages are prioritized over OMs, and OMs are prioritized over gossip. The onion_message module remains private until further rate limiting is added.
Force-pushed from ba8dfad to f4834de.
Based on #1503. In this PR, we add business logic to check whether a peer's outbound buffer has room for onion messages and, if so, pull a number of them from an implementer of a new trait, OnionMessageProvider.
This may take some work to land, so we separate out its changes from the rest of the steps needed before the onion message module can be made public.
See commit message for more details.
Blocked on #1660. Based on #1683.