-
Notifications
You must be signed in to change notification settings - Fork 2
Reconstruct OutboundPayments from ChannelMonitors
#1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This serialization was in place for compatibility with LDK 0.0.101, which should no longer be necessary.
This makes it easier to see what exactly is failing upon test failure.
As part of debugging this test while implementing rebuilding
ChannelManager::outbound_payments from Channel{Monitor}s, I added some extra
checks in this test. They seemed useful for improving readability since this
test follows 3 interleaved payments, so thought it was worth committing the
changes.
We have an overarching goal of getting rid of ChannelManager persistence and rebuilding the whole ChannelManager from existing ChannelMonitors, due to issues when the two structs are out-of-sync on restart. The main issue that can arise is channel force closure. Here we start this process by rebuilding ChannelManager::outbound_payments from the ChannelMonitors.
|
@joostjager not sure how to request you as a reviewer on my fork so tagging you manually for now |
|
|
||
| write_tlv_fields!(writer, { | ||
| (1, pending_outbound_payments_no_retry, required), | ||
| // TLV 1 used to be used for pending_outbound_payments_no_retry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commit message:
which should no longer be necessary.
Explain 'should'?
|
|
||
| #[test] | ||
| fn test_dup_htlc_onchain_doesnt_fail_on_reload() { | ||
| fn test_dup_htlc_onchain_doesnt_fail_on_reload_0() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunate that rust doesn't have sub-tests. The numbering isn't ideal, but trying to come up with something descriptive probably isn't going to work either...
| ) where | ||
| F: Fn(usize, &ChannelMonitorUpdate) -> bool, | ||
| { | ||
| if let Some(chain_monitor) = node.chain_monitor() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let chain_monitor = node.chain_monitor().expect("Missing chain monitor"); ?
| &nodes[0], | ||
| chan_id, | ||
| 2, | ||
| |upd_idx, upd: &ChannelMonitorUpdate| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it simpler to just return the updates, and do this matching afterwards instead of using the closure?
| (3, pending_outbound_payments, option), | ||
| // We now rebuild `OutboundPayments` from the `ChannelMonitor`s as part of getting rid of | ||
| // `ChannelManager` persistence. | ||
| (3, _pending_outbound_payments, option), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could it be useful to still read this and do a debug comparison with the reconstructed data? To maybe catch things that aren't covered in tests.
| ); | ||
| } | ||
| } | ||
| for (channel_id, monitor) in args.channel_monitors.iter() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Context question: the two loops through args.channel_monitors can't be combined because in the first pass the pending_outbounds need to be reconstructed completely?
| // `pending_intercepted_htlcs`, we were apparently not persisted after | ||
| // the monitor was when forwarding the payment. | ||
| decode_update_add_htlcs.retain(|src_outb_alias, update_add_htlcs| { | ||
| for (htlc_source, (htlc, preimage_opt)) in monitor.get_all_current_outbound_htlcs() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without the is_channel_closed check, I think also more HTLCSource::PreviousHopData htlcs are processed, but this isn't necessary for payment reconstruction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps add a few lines of comment describing the purposes of the two loops.
| bolt12_invoice, | ||
| session_priv, | ||
| path, | ||
| is_channel_closed, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First the claim was only executed for closed channels, but now for all htlcs. Is that the intention?
| fee_paid_msat, | ||
| bolt12_invoice: bolt12_invoice, | ||
| }, ev_completion_action.take())); | ||
| let duplicate_sent_event = pending_events.iter().find(|(ev, _)| match ev { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add comment how duplicate sent event can happen?
| } | ||
| // Because we reload outbound payments from the existing `ChannelMonitor`s, and no HTLC exists | ||
| // in those monitors after restart, we will currently forget about the payment and fail to | ||
| // generate a `PaymentFailed` event. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What did we say about this? Wasn't this undesirable?
|
Is it really not possible to aim for merging this PR to main? |
Feel free to respond in the discord thread with Matt! |
|
Summarizing that thread here:
With a cfg flag, we wouldn't need to worry about functional regression, and could still avoid merging to a dev branch. For this PR, it is a question of how easy the new and old code can be switched using a cfg flag without too much copy paste? I have the same question for lightningdevkit#4151. |
We have an overarching goal of getting rid of
ChannelManagerpersistence andrebuilding the whole
ChannelManagerfrom existingChannelMonitors, due toissues when the two structs are out-of-sync on restart. The main issue that can
arise is channel force closure.
Here we start this process by rebuilding
ChannelManager::outbound_paymentsfromthe
ChannelMonitors.