Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

discovery: ensure we prioritize sending out our own local announcements #7239

Merged
merged 6 commits into from
Dec 16, 2022

Conversation

Roasbeef
Copy link
Member

@Roasbeef Roasbeef commented Dec 7, 2022

In this commit, we modify our gossip broadcast logic to ensure that we
always will send out our own gossip messages regardless of the
filtering/feature policies of the peer.

Before this commit, it was possible that when we went to broadcast an
announcement, none of our peers actually had us as a syncer peer (lnd
terminology). In this case, the FilterGossipMsg function wouldn't do
anything, as they don't have an active timestamp filter set. When we go
to them merge the syncer map, we'd add all these peers we didn't send
to, meaning we would skip them when it came to broadcast time.

In this commit, we now split things into two phases: we'll broadcast
our own announcements to all our peers, but then do the normal
filtering and chunking for the announcements we got from a remote peer.

Fixes #6531
Fixes #7223
Fixes #7073
Fixes #6128
Fixes #6924

@Roasbeef
Copy link
Member Author

Roasbeef commented Dec 7, 2022

This PR is a logical follow up of this older PR that introduced the new logic in this area. However it overlooked that we want to ensure we always send out our own updates, as otherwise emergent peer configuration might tend to us never really sending out our own updates.

Also re the first commit, I think we might want to make this package into a module so we can house all the generic stuff there: https://github.com/lightninglabs/taro/tree/main/chanutils. For now I just have a new small package w/ the func I would've reached for in that codebase.

@Roasbeef Roasbeef force-pushed the gossip-scoping-issues branch 2 times, most recently from 41fae24 to 909d5d1 Compare December 7, 2022 02:53
@Roasbeef
Copy link
Member Author

Roasbeef commented Dec 7, 2022

Some other useful crumb-trails:

Copy link
Member

@yyforyongyu yyforyongyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't know you'd work on this😂 I came up with a different solution yesterday, will make a PR to illustrate. Basically I think we can just merge sync peers when the peers have sent messages successfully.

discovery/gossiper.go Outdated Show resolved Hide resolved
discovery/gossiper.go Show resolved Hide resolved
Copy link
Collaborator

@Crypt-iQ Crypt-iQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I think this patch is a little more complex than #7240

@Roasbeef Roasbeef removed the request for review from ellemouton December 7, 2022 20:11
Copy link
Member

@yyforyongyu yyforyongyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this one can go first, then I'll rebase on it. Linter needs to be fixed, along with the unit tests and missing a release note, otherwise LGTM👍

discovery/gossiper.go Outdated Show resolved Hide resolved
discovery/gossiper.go Show resolved Hide resolved
@Roasbeef Roasbeef force-pushed the gossip-scoping-issues branch from 909d5d1 to 95754f8 Compare December 9, 2022 01:54
@Roasbeef
Copy link
Member Author

Roasbeef commented Dec 9, 2022

Should be ready for another round now, fixed that test failure as well. The new logic is correct, but realized that in some areas we don't properly set the isRemote bool when we're re-wrapping a processed announcement.

Copy link
Member

@yyforyongyu yyforyongyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. Now missing a release note, and see this itest failure,

harness_assertion.go:1637: 
        	Error Trace:	/home/runner/work/lnd/lnd/lntest/itest/harness_assertion.go:1637
        	            				/home/runner/work/lnd/lnd/lntest/itest/lnd_network_test.go:140
        	            				/home/runner/work/lnd/lnd/lntest/itest/wait.go:51
        	            				/home/runner/work/lnd/lnd/lntest/itest/wait.go:27
        	            				/home/runner/work/lnd/lnd/lntest/itest/asm_amd64.s:1594
        	Error:      	Received unexpected error:
        	            	timeout waiting for num node updates, want 1, got 0
        	Test:       	TestLightningNetworkDaemonTemp/tranche02/38-of-67/bitcoind/reconnect_after_ip_change
        	Messages:   	failed to assert num of channel updates

@yyforyongyu
Copy link
Member

After checking the logs, it seems that when processing node ann, the validation barrier is not working sometimes. I think we can merge #7186 to get the validation barrier fixed, then rebase on that so we can use a more detailed debug logs.

@Roasbeef Roasbeef force-pushed the gossip-scoping-issues branch from 95754f8 to 404e987 Compare December 14, 2022 03:50
@Roasbeef
Copy link
Member Author

Rebased!

@Roasbeef
Copy link
Member Author

I think something is up w/ the linter, getting a ton of failures for things unrelated to the PR: https://github.com/lightningnetwork/lnd/actions/runs/3691582853/jobs/6249703088

@viaj3ro
Copy link

viaj3ro commented Dec 15, 2022

does it also fix this issue:
#6128

@yyforyongyu
Copy link
Member

@viaj3ro we've made several gossip-related fixes lately so yeah that should be fixed.

@yyforyongyu
Copy link
Member

linter is fixed! I cherry-pick the commits here and build it #6825, will see if the builds pass.

We start with a simple Map function that can be useful for transforming
objects on the fly.

We may want to eventually make this Taro pakage into a module so we can
just have everything in one place:
https://github.com/lightninglabs/taro/tree/main/chanutils.
This lets us keep track of which messages we created ourselves vs the
messages that originated remotely from a peer.
This struct will be used to later on do filtering when we go to
broadcast, based on if a message was locally sourced or granted from a
remote peer.
In this commit, we modify our gossip broadcast logic to ensure that we
always will send out our own gossip messages regardless of the
filtering/feature policies of the peer.

Before this commit, it was possible that when we went to broadcast an
announcement, none of our peers actually had us as a syncer peer (lnd
terminology). In this case, the FilterGossipMsg function wouldn't do
anything, as they don't have an active timestamp filter set. When we go
to them merge the syncer map, we'd add all these peers we didn't send
to, meaning we would skip them when it came to broadcast time.

In this commit, we now split things into two phases: we'll broadcast
_our_ own announcements to all our peers, but then do the normal
filtering and chunking for the announcements we got from a remote peer.

Fixes lightningnetwork#6531
Fixes lightningnetwork#7223
Fixes lightningnetwork#7073
@Roasbeef Roasbeef force-pushed the gossip-scoping-issues branch from 404e987 to 050db16 Compare December 15, 2022 19:57
@Roasbeef
Copy link
Member Author

Ok rebased on top of master, watching the build...

@Roasbeef
Copy link
Member Author

@yyforyongyu PTAL

Copy link
Member

@yyforyongyu yyforyongyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM🎉 Thanks for the fix!

@Roasbeef Roasbeef merged commit de3e0d7 into lightningnetwork:master Dec 16, 2022
yyforyongyu added a commit to yyforyongyu/lnd that referenced this pull request Feb 17, 2023
This commit changes the sending of anns from using separate goroutines
to always sending both local and remote announcements in the same
goroutine. In addition, the local announcements are always sent first.
This change is to fix the following case:

1. Alice and Bob have a channel
2. Alice receives Bob's NodeAnnouncement
3. Alice goes to broadcast the channel
4. The broadcast is split into a local and remote broadcast due to PR
   lightningnetwork#7239. Bob's NodeAnnouncement is in the remote batch. Everything else
   (ChannelAnnouncement, ChannelUpdate x2, and Alice's NodeAnnouncement)
   is in the local batch.
5. The remote batch (containing Bob's NodeAnnouncement) runs before the
   local batch since they are spawned in separate goroutines. This means
   that Alice sends Carol the NodeAnnouncement before Carol knows of the
   channel.

In step 2), Bob's NodeAnnouncement (isRemote = true) replaces Bob's
NodeAnnouncement that Alice was going to relay (isRemote = false) after
processing the AnnouncementSignatures.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants