
Lightning Specification Meeting 2025/01/27 #1221

Closed
13 of 23 tasks
t-bast opened this issue Jan 22, 2025 · 19 comments

Comments

@t-bast
Collaborator

t-bast commented Jan 22, 2025

The meeting will take place on Monday 2025/01/27 at 7pm UTC (5:30am Adelaide time) on Libera Chat IRC #lightning-dev. It is open to the public.

A video link is available for higher bandwidth communication: https://meet.jit.si/Lightning-Spec-Meeting

Recently Updated Proposals / Seeking Review

This section contains changes that have been opened or updated recently and need feedback from the meeting participants.

Stale Proposals

This section contains pending changes that may not need feedback from the meeting participants, unless someone explicitly asks for it during the meeting. These changes are usually waiting for implementation work to happen to drive more feedback.

Waiting for interop

This section contains changes that have been conceptually ACKed and are waiting for at least two implementations to fully interoperate.
They most likely don't need to be covered during the meeting, unless someone asks for updates.

Long Term Updates

This section contains long-term changes that need review, but require a substantial implementation effort.

@t-bast t-bast pinned this issue Jan 22, 2025
@Roasbeef
Collaborator

Roasbeef commented Jan 27, 2025

rbf coop close:

  • interop between lnd and eclair
    • mismatched logic on the OP_RETURN usage, lnd fixing their version
    • should be turned around for interop by the EOY

announcement conf follow up:

  • proposes a more conservative interpretation:
    • spec says must not send it before 6 blocks
    • when you recv, maybe you're late a few blocks
  • do impls handle re-org for the network graph?
    • lnd does, will handle block connect+disconnect, to remove re-org'd channels (only on start up)
    • eclair, ldk, cln rely on 6 block aspect to work around in practice
  • main change:
    • nodes should ignore a chan ann if the funding txn doesn't have 6 confs

splice locked reconnection:

  • would like splice reconnection to be atomic
    • some information that's in the chan_reest, some that comes after the chan_reest
    • hard to ignore the issue, taproot requires upfront handling
  • spec doesn't have anything in chan reest, to know your current state and peer's current state
  • proposals:
    • include funding txid of the latest splice
    • include entire message in the TLV

splicing+taproot:

  • extend chan reest:
    • prefix the funding txid it applies to, can then make that a list of (funding_txid, nonce) pairs (see the sketch after this section)
    • can potentially have this be a different TLV to make the switch-over sooner
  • eclair wanting to make sure all messages defined properly, for announcement, etc:
    • can do this in the taproot gossip PR
  • is there a limit on the number of active splices?
    • no?
    • eclair has a limit on the number of RBF attempts, needs to increase the fee rate all the time
      • serves to limit the number of pending splices
      • has a config on the number of RBF attempts
      • has a clean reject: one side sends init_rbf, and the other sends tx_abort if there is already one RBF attempt too many
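
As a purely illustrative sketch of the "(funding_txid, nonce) list" idea above: the record name, field layout, and the omission of the actual TLV framing are assumptions for readability, not spec text.

```python
from dataclasses import dataclass

@dataclass
class PendingSpliceEntry:
    funding_txid: bytes        # 32 bytes: which pending splice candidate this refers to
    verification_nonce: bytes  # 66 bytes: MuSig2 public nonce for that funding tx

def encode_entries(entries: list[PendingSpliceEntry]) -> bytes:
    """Concatenate the per-splice records; the surrounding channel_reestablish
    TLV framing (BigSize type and length) is deliberately omitted here."""
    assert all(len(e.funding_txid) == 32 and len(e.verification_nonce) == 66
               for e in entries)
    return b"".join(e.funding_txid + e.verification_nonce for e in entries)
```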

txn v3 stuff:

  • v3 on HTLC transactions
    • don't want them to be pinnable
    • while we're at it, may as well do zero fee, so basically keep it as is
    • past discussion re not having v3 on HTLC transactions, more context needed
  • to ramp up in full in a few months
  • waiting for bitcoind 29, few months away
    • how do we gauge when it's ready to relay?

bolt 12 contacts:

  • LDK nodes now support the DNSSEC look up feature

storage back up:

  • CLN may start to aggressively send updates
  • 1k limit?
  • eclair has version on master branch, need a conf file to enable

@t-bast
Collaborator Author

t-bast commented Jan 29, 2025

@morehouse I re-read https://delvingbitcoin.org/t/lightning-transactions-with-v3-and-ephemeral-anchors/ to refresh my memory and figure out what to do with HTLC txs for v3 commitments, and my proposal is to simply make pre-signed HTLC txs v3 without changing anything else (ie don't introduce pre-signed transactions for spending HTLC outputs from the remote commit, as this creates non-trivial changes to exchange more signatures, for which I think we should wait for PTLCs).

By doing only that, we should have the following properties (please double-check to make sure I'm not missing something):

  • when the remote commit is published, you can immediately spend your HTLC outputs
    • if the commit is unconfirmed, since it is v3, you can only use v3 txs to spend it
    • you can now use your available HTLC outputs or your main output to CPFP the commit tx by using v3 (no need for external wallet inputs) which is really nice
    • if the commit is confirmed, then you can use either v2 or v3
    • note that using v2 here cannot be used as a pinning vector because:
      • the commit tx is confirmed: the only thing you could pin is the HTLC output spend
      • if you have the preimage, you must spend that before the timeout branch is available: why would you pin instead of claiming it?
      • if you don't have the preimage, your peer should have force-closed early enough to spend this output before you can spend it from the timeout path, so you shouldn't have the opportunity to pin
  • when the local commit is published, you can only spend HTLC outputs through pre-signed v3 txs, which means you cannot pin anything
    • as we've seen in the previous section, your peer cannot pin you either
    • note that this includes publishing revoked commitments: you won't be able to pin an HTLC output from a revoked commitment because you'll need to use pre-signed v3 HTLC txs, allowing your peer to claim the HTLC outputs through the penalty path using v3 and evicting your HTLC txs (or they can let your HTLC tx confirm and spend its output through the penalty path, which is an even better punishment because you paid fees for the HTLC tx without gaining anything)

Does that sound like a good plan?
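
A minimal sketch of the decision table implied by the bullets above, assuming pre-signed HTLC txs become v3 and nothing else changes; the function and labels are illustrative, not spec terminology.

```python
def htlc_spend_options(commit_is_local: bool, commit_confirmed: bool) -> set[str]:
    """Which transactions can spend HTLC outputs under the 'pre-signed HTLC
    txs become v3' proposal. Labels are illustrative, not spec terminology."""
    if commit_is_local:
        # Your own commitment: HTLC outputs are only spendable via the
        # pre-signed second-stage HTLC txs, which are v3, so you cannot pin.
        return {"pre-signed v3 HTLC tx"}
    if not commit_confirmed:
        # Remote commitment still in the mempool: TRUC version inheritance
        # forces every child to be v3, and those children double as CPFP.
        return {"v3 HTLC claim", "v3 CPFP via main output"}
    # Remote commitment confirmed: version inheritance no longer applies.
    return {"v2 HTLC claim", "v3 HTLC claim"}
```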

@morehouse
Contributor

@t-bast: If all implementations are going to prioritize TRUC channels over the next couple years, it would really be a shame to go through all the effort and not fix pinning, especially when we're this close to it. If we have to wait for PTLCs, the roadmap becomes:

  1. Pinnable TRUC channels
  2. Pinnable taproot TRUC channels
  3. Unpinnable taproot TRUC channels with PTLCs

Do we really want to leave the network vulnerable to theft via pinning for that long?

* when the _remote_ commit is published, you can immediately spend your HTLC outputs
  
  * if the commit is unconfirmed, since it is v3, you can only use v3 txs to spend it
  * you can now use your available HTLC outputs or your main output to CPFP the commit tx by using v3 (no need for external wallet inputs) which is really nice
  * if the commit is confirmed, then you can use either v2 or v3

This all sounds right.

  * note that using v2 here cannot be used as a pinning vector because:
    
    * the commit tx is confirmed: the only thing you could pin is the HTLC output spend

Pinning "only" HTLC outputs is sufficient to steal funds!

    * if you have the preimage, you must spend that before the timeout branch is available: why would you pin instead of claiming it?

Because by pinning you can delay confirmation long enough that the upstream HTLC times out. You can then successfully steal the HTLC by claiming the refund upstream and the preimage downstream.

    * if you don't have the preimage, your peer should have force-closed early enough to spend this output before you can spend it from the timeout path, so you shouldn't have the opportunity to pin

Right. In general HTLC pinning is only effective using the preimage.

* when the _local_ commit is published, you can only spend HTLC outputs through pre-signed v3 txs, which means you cannot pin anything

Right, under your proposal the broadcaster of the commitment transaction cannot pin, but the non-broadcaster can.

  * as we've seen in the previous section, your peer cannot pin you either

They can pin HTLC outputs, as explained above.

  * note that this includes publishing revoked commitments: you won't be able to pin an HTLC output from a revoked commitment because you'll need to use pre-signed v3 HTLC txs, allowing your peer to claim the HTLC outputs through the penalty path using v3 and evicting your HTLC txs (or they can let your HTLC tx confirm and spend its output through the penalty path, which is an even better punishment because you paid fees for the HTLC tx without gaining anything)

Yep, again here the broadcaster of the revoked commitment cannot pin. And the non-broadcaster has no reason to pin because they can claim all funds via the revocation key.

@t-bast
Collaborator Author

t-bast commented Jan 29, 2025

Pinning "only" HTLC outputs is sufficient to steal funds!
Because by pinning you can delay confirmation long enough that the upstream HTLC times out. You can then successfully steal the HTLC by claiming the refund upstream and the preimage downstream.

I don't understand this attack. Can you provide a detailed attack scenario where that's exploited? I don't think this works, which is why I don't think there is any pinning risk related to HTLCs.

@morehouse
Contributor

Pinning "only" HTLC outputs is sufficient to steal funds!
Because by pinning you can delay confirmation long enough that the upstream HTLC times out. You can then successfully steal the HTLC by claiming the refund upstream and the preimage downstream.

I don't understand this attack. Can you provide a detailed attack scenario where that's exploited? I don't think this works, which is why I don't think there is any pinning risk related to HTLCs.

  1. Mallory sends a payment to herself, with the penultimate hop being Bob. M -> ... -> A -> B -> M
  2. Mallory holds the HTLC until expiry.
  3. Bob force closes the B-M channel.
  4. After Bob's commitment confirms, he broadcasts a v3 HTLC-Timeout to claim the HTLC refund. At the same time, Mallory widely broadcasts and pins a low-feerate v2 preimage claim of the same HTLC. Thus Bob's mempool contains his HTLC-Timeout while all other mempools contain Mallory's pinned v2 preimage claim. Bob's package does not replace Mallory's because it doesn't pay a higher absolute fee (RBF rule 3).
  5. The upstream HTLC expires.
  6. Alice force closes the A-B channel and claims the HTLC via HTLC-Timeout.
  7. After the HTLC-Timeout confirms, Alice fails back her upstream HTLC, thereby refunding the HTLC back to Mallory.
  8. Mallory's pinned v2 preimage claim confirms.

Mallory has succeeded in stealing the value of the original payment from Bob.
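
To make the rule-3 failure in step 4 concrete, here is a back-of-the-envelope calculation with made-up sizes and fee rates (a pin is typically made expensive to replace by being large or carrying low-feerate descendants):

```python
# Illustrative numbers only. BIP 125 rule 3: a replacement must pay at least
# the absolute fees of everything it evicts; rule 4 adds an incremental fee
# for the replacement's own size.

pin_vsize = 40_000            # Mallory's bloated, low-feerate v2 preimage claim (vB)
pin_feerate = 2.0             # sat/vB, just enough to relay
pin_abs_fee = pin_vsize * pin_feerate              # 80,000 sats that must be beaten

timeout_pkg_vsize = 700       # Bob's HTLC-Timeout plus fee-bumping child (vB, rough)
incremental_relay = 1.0       # sat/vB, assumed incremental relay feerate
required_fee = pin_abs_fee + timeout_pkg_vsize * incremental_relay
print(f"Bob must pay >= {required_fee:,.0f} sats "
      f"(~{required_fee / timeout_pkg_vsize:,.0f} sat/vB) to evict a "
      f"{pin_feerate:.0f} sat/vB pin")
```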

@t-bast
Collaborator Author

t-bast commented Jan 30, 2025

Ok, so this is the standard attack that assumes eclipsing the victim in step 4, I was afraid you had found something else. As far as I know, nobody has been able to demonstrate that they can successfully perform this kind of network partitioning. If Bob sees Mallory's preimage claim in his mempool, or if his HTLC-timeout transaction reaches at least some miners, the attack fails. On top of that, the amount that can be stolen is bounded by the max_htlc_value_in_flight_msat parameter, which means that the attacker must ensure a high success probability for the attack to be economically viable.

I honestly don't think this very-hard-to-pull-off attack is worth the extra complexity required for adding half a round-trip to exchange more HTLC signatures during commitment updates, but we'll discuss it during a spec meeting to see what others think. We had already discussed this attack a long time ago, and decided that if we thought it was a credible attack, we could trivially mitigate it by having nodes monitor all preimage claims in their mempool (even those that aren't for their own channels) and share them with their direct peers in ping / pong. But nobody thought it was credible enough to even write the spec for this! But I'd definitely reconsider it if someone demonstrated that they can pull off this kind of attack in a realistic setting (where they initially don't know the IP address of the bitcoin node used by the lightning node they're targeting).

Note that we had indeed already discussed this in the delving post (https://delvingbitcoin.org/t/lightning-transactions-with-v3-and-ephemeral-anchors/418/17).
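
A very rough sketch of the mitigation mentioned above (watching the local mempool for preimage reveals and forwarding them to Lightning peers); the function names and data shapes are assumptions, not an existing API.

```python
import hashlib

def preimage_candidates(witness_stacks):
    """Every 32-byte witness item of a mempool transaction is a possible HTLC
    preimage; collect them all, even for channels that aren't ours."""
    return [item for stack in witness_stacks for item in stack if len(item) == 32]

def matches_my_htlcs(candidates, my_payment_hashes):
    """On the receiving side: check forwarded candidates against the payment
    hashes of our own pending HTLCs (payment_hash = SHA256(preimage))."""
    return [c for c in candidates if hashlib.sha256(c).digest() in my_payment_hashes]

# Sketch of the flow: each node runs preimage_candidates() on transactions
# entering its mempool and forwards the results to its Lightning peers (for
# example piggybacked on ping/pong); a peer that was eclipsed from the
# transaction itself can still learn the preimage in time to claim the
# corresponding upstream HTLC.
```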

@TheBlueMatt
Collaborator

TheBlueMatt commented Jan 30, 2025

This kind of network partitioning should be really trivial. Last I saw anyone run numbers, > 50% of the lightning nodes with IP addresses have a Bitcoin full node running on the same IP. If you have that info the attack only requires that you send a transaction directly to that node while connecting to as many other public nodes as possible and sending them other transactions.

Even if you don't have that info, figuring it out by doing previous passes which partition the network as they close channels is likely also very powerful.

On top of that, the amount that can be stolen is bounded by the max_htlc_value_in_flight_msat parameter,

Sadly most nodes default to the full channel value here still. I don't think this is a strong argument :(

@t-bast
Collaborator Author

t-bast commented Jan 30, 2025

If you have that info the attack only requires that you send a transaction directly to that node while connecting to as many other public nodes as possible and sending them other transactions.

That doesn't seem as simple to me. Even if you know the target bitcoin core node, you must:

  • successfully connect to it (which means they accept inbound connections): admittedly, this part isn't too hard in most cases
  • broadcast your preimage claim to every peer of the target node
  • but only do that after the target node has added their own HTLC-timeout tx to their mempool (otherwise they would obtain the preimage from one of their peers)
  • but before they've sent their HTLC-timeout tx to any of their peers (otherwise the HTLC-timeout tx is likely to reach some miners)

How do you achieve that? Is there some bitcoin core behavior that I don't know of that makes this easy?

@TheBlueMatt
Collaborator

TheBlueMatt commented Jan 30, 2025

broadcast your preimage claim to every peer of the target node

It doesn't have to be to every peer of the target node, just enough nodes on the network that miners see Mallory's transaction first. I don't see why this isn't trivial.

but only do that after the target node has added their own HTLC-timeout tx to their mempool (otherwise they would obtain the preimage by one of their peers)

Sure, but you're connected to them, you can see when they do this cause they'll tell you.

I don't mean to imply that this attack is 100% successful, but Mallory isn't risking any money in trying it (as long as she learns that Bob's timeout tx confirmed in a timely manner, which she can do by looking at the chain), and there's no real reason to think the success rate wouldn't be fairly good.

@t-bast
Collaborator Author

t-bast commented Jan 30, 2025

Sure, but you're connected to them, you can see when they do this cause they'll tell you.

Yes but by the time they tell you, they're also telling all of their other bitcoin peers, which means you're already late: the honest node has a head-start to broadcast their HTLC-timeout widely before you can start broadcasting your preimage claim?

It doesn't have to be to every peer of the target node, just enough nodes on the network that miners see Mallory's transaction first.

I honestly don't see how you can achieve that with a good enough success percentage, given the timing requirements described above...that's why I'd love for someone to try it out on mainnet and report back on whether it's working or not!

I don't mean to imply that this attack is 100% successful, but Mallory isn't risking any money in trying it

Mallory needs to create channels for this attack, which has a cost and creates delays in setting it up again, right?


So far I'm still not convinced that this attack is realistic. And even if it is, I think that implementing trivial preimage sharing with your lightning peers fixes it? I'd rather implement this preimage sharing than re-work the full commitment update protocol to allow using pre-signed transactions on remote commit txs, because this is a much more complex change which means that it will take a much longer time to ship 0-fee commitments...

@TheBlueMatt
Collaborator

TheBlueMatt commented Jan 30, 2025

Yes but by the time they tell you, they're also telling all of their other bitcoin peers, which means you're already late: the honest node has a head-start to broadcast their HTLC-timeout widely before you can start broadcasting your preimage claim?

There's a nontrivial timelag (on purpose, not just like Bitcoin Core is slow) between when a node receives a transaction and when it broadcasts it, even if the node sends it out to 20 peers, you still have plenty of time to send it to literally every single other (listening) full node on the network before anyone but those 20 nodes see the conflict (and you can probably still beat the transaction to some of the 20, as broadcasts are staggered from the victim).

There's no race here, the honest victim has an additional delay in their relay that you don't have to bother with (and even if that weren't there, the internet is not flat, you can pretty easily win a latency race with a bit of effort).
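
A toy simulation of the race being described; the delay parameters are assumptions loosely modelled on Bitcoin Core's randomized per-peer announcement delays (seconds-scale), not measured values.

```python
import random

def simulate_once(hops_to_miner=3, hop_delay_mean=2.5, attacker_latency=0.2,
                  victim_to_mallory_mean=2.5):
    """Toy model: Bob's HTLC-Timeout must traverse a few trickle-delayed relay
    hops before a miner sees it, while Mallory, once Bob's node announces the
    transaction to her, delivers her conflicting preimage claim to the miner's
    node directly. All parameters are assumptions, not measured values."""
    bob_at_miner = sum(random.expovariate(1 / hop_delay_mean)
                       for _ in range(hops_to_miner))
    mallory_at_miner = random.expovariate(1 / victim_to_mallory_mean) + attacker_latency
    return mallory_at_miner < bob_at_miner

runs = 10_000
wins = sum(simulate_once() for _ in range(runs))
print(f"Mallory's claim reaches the miner first in ~{100 * wins / runs:.0f}% of runs")
```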

I honestly don't see how you can achieve that with a good enough success percentage, given the timing requirements described above...that's why I'd love for someone to try it out on mainnet and report back on whether it's working or not!

Beyond the fundamental latency issues that totally screw Bob, the vast, vast majority of hashrate is identifiable. If you can identify, let's say, 50% of the nodes of the top 3 pools (you definitely can!) then this attack will work about 31% of the time for you. That's a really huge success rate with relatively minimal effort.
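
One way to arrive at a number like that (the hashrate share below is an assumption for illustration, not a figure from this thread):

```python
# If the top 3 pools together mine roughly 62% of blocks and Mallory can get
# her transaction into the mempools of about half of each pool's nodes, then
# roughly 0.5 * 0.62 ~= 31% of blocks are mined by a node holding her version.
top3_hashrate_share = 0.62             # assumed combined share, illustration only
fraction_of_pool_nodes_reached = 0.5
success_rate = top3_hashrate_share * fraction_of_pool_nodes_reached
print(f"~{success_rate:.0%} of blocks would include Mallory's pinned claim first")
```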

Mallory needs to create channels for this attack, which has a cost and creates delays in setting it up again, right?

Sure, but paying an on-chain fee to set up a 10M sat channel in order to steal 10M sats with some non-zero probability is still pretty damn high positive EV :).

@t-bast
Collaborator Author

t-bast commented Jan 30, 2025

There's a nontrivial timelag (on purpose, not just like Bitcoin Core is slow) between when a node receives a transaction and when it broadcasts it, even if the node sends it out to 20 peers, you still have plenty of time to send it to literally every single other (listening) full node on the network before anyone but those 20 nodes see the conflict (and you can probably still beat the transaction to some of the 20, as broadcasts are staggered from the victim).

Got it, that's the Bitcoin Core behavior I was missing that makes the attack easier than I thought, thanks!

If you can identify, lets say, 50% of the nodes of the top 3 pools (you definitely can!) then this attack will work about 31% of the time for you. That's a really huge success rate with relatively minimal effort.

That means the preimage sharing via ping/pong I was referring to wouldn't work, right? Because a clever attacker would send his preimage claim only to miners, so that most of the network sees the HTLC-timeout and only a few miners (but enough to guarantee a non negligible attack success rate) see the preimage claim, so most likely no lightning node will see the preimage claim?

@Crypt-iQ
Contributor

Crypt-iQ commented Jan 30, 2025

Funnily enough, I tried to do this the other weekend...

I ran a series of mainnet mempool partitioning tests as a weekend project. The motivation here was to assess, in a general way, how easy it would be for a pretty unsophisticated attacker to pull off mempool partitioning. All tests were run on my weak 8 GB RAM home laptop and the amount of bitcoin involved was $30 total (one $10 deposit and a separate $20 one).

Methodology:

  • I had a file with a list of reachable nodes that accepted inbound connections and participated in addr-relay. This is distinct from an addrman file from bitcoind or btcd. I made this one myself by crawling the network and recording the crawled address in question if it met certain criteria. There are ~5,700 nodes on this list. Note that the vast majority of the bitcoin network is "unreachable" but is still very much crucial for the bitcoin p2p backbone.
  • I had another file with a list of these nodes and how likely they were to accept an incoming connection from me. From this list, I randomly chose a few and connected to them, assessing their nServices and whether they chose me for tx-relay (i.e. started sending INV to me; this is pretty important since if a peer doesn't send INV to you, you can't GETDATA, as GETDATA requires at least 1 INV to have been sent in modern Core).
  • I selected three nodes that I'd use as target nodes. Two out of the three had what I call high "connection degree" and seemed pretty centrally positioned in the network.
    • I define "connection degree" as how many inbound connections a node will take before it starts to evict connections. A large percentage of the network has 0 or 1 connection slot available and the other large cohort has > 125 connection slots available. These numbers are similar to the numbers measured in 2022 when addr spam let us measure the degree of p2p nodes (https://ieeexplore.ieee.org/document/9805511). The methodology here can improve and the data can probably be cleaned a bit, but I think the distribution is accurate.
  • After creating two conflicting transactions, I sent txA to the rest of the network (the nodes in the ~5,700-count list) and txB to the target node. These conflicting transactions were not able to RBF each other and had a ~25 sat fee difference. I tried to overpay the prevailing fee-rate by a bit.
  • I then sent a GETDATA to the target to assert whether it had actually received txB. If it had txB, things were going well.
  • I then monitored mempool.space for both txns and noted which one it saw. The conflicting transaction that reached mempool.space first was the one that got mined. For some reason, it only ever had one of the transactions (I would expect mempool.space to run multiple mempool nodes).
  • I then checked which transaction confirmed and noted it down. If txA was mined and txB was originally found in the target's mempool, then the partitioning attack had worked.

Results:

  • The attack was 100% successful against node 1 (6/6 attempts). This node was not very centrally positioned; mempool.space always had txA and this node always had txB.
  • The attack was 0% successful against a very high "connection degree" node (accepted lots of inbound connections), failing 3/3 times. In my methodology, I tried to feed the target node and the network their transactions at the same time. However, my hacky + inefficient go program isn't so good at simultaneous network-wide tx-injection and takes 15 seconds rather than only several (it would be fast if I just ran the script on 4 nodes in the cloud simultaneously). So sometimes the target would already know of txA before I had given it txB. Also, when it did have txB, so did mempool.space, and then that transaction confirmed. This was despite me spamming the network with txA. Further analysis probably would have revealed close centrality to mining nodes. IIRC this node accepted > 600 incoming connections at a time.
  • The attack was 66% successful against node 3 (4/6 attempts). This node was also centrally positioned. However, I believe the failures here were due to bugs in the test harness (testing in real-world conditions has downsides) and not checking whether our connection was evicted (Core evicts connections when ~114 inbound connections are made) when trying to push through a transaction.

I think further analysis is needed for mempool partitioning. My test harness is buggy and has some connectivity issues with some of the other nodes since I wrote a hacky client for the p2p network. The data suggests that unsophisticated partitioning (i.e. not identifying mining nodes, no simultaneous network-wide tx injection, etc) is node-dependent / centrality-dependent with variable success. By identifying the miners with the methodology in CoinScope Section 5 (https://www.cs.umd.edu/projects/coinscope/coinscope.pdf), I think the numbers can get much better. Potentially just running a node that collects inbound connections could give an attacker an edge in tx-relay.
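
Condensing the procedure above into pseudocode: the `p2p` object and its methods are hypothetical stand-ins for the hand-rolled client mentioned here, not a real library.

```python
# Hypothetical pseudocode for one round of the partitioning test described above.

def run_partition_round(target_addr, network_addrs, tx_a, tx_b, p2p):
    # 1. Connect to the target and check it selected us for tx-relay
    #    (it must send us INVs before we can usefully GETDATA).
    target = p2p.connect(target_addr)
    if not target.relays_txs_to_us():
        return None

    # 2. Feed the conflicting transactions: tx_a to the broad network of
    #    ~5,700 reachable nodes, tx_b only to the target.
    for addr in network_addrs:
        p2p.connect(addr).send_tx(tx_a)
    target.send_tx(tx_b)

    # 3. GETDATA tx_b back from the target to confirm it actually accepted it.
    partitioned = target.getdata(tx_b.txid) is not None

    # 4. Out of band: watch mempool.space and the chain; if tx_a confirms while
    #    the target was holding tx_b, the partition held.
    return partitioned
```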

Also, I went through some Core issues and stumbled upon bitcoin/bitcoin#30572 after the fact, which might(?) fix the network-spamming method used here to partition.

Additional data:

@Roasbeef
Collaborator

Roasbeef commented Jan 30, 2025

After Bob's commitment confirms, he broadcasts a v3 HTLC-Timeout to claim the HTLC refund. At the same time, Mallory widely broadcasts and pins a low-feerate v2 preimage claim of the same HTLC. Thus Bob's mempool contains his HTLC-Timeout while all other mempools contain Mallory's pinned v2 preimage claim. Bob's package does not replace Mallory's because it doesn't pay a higher absolute fee (RBF rule 3).

I thought that as part of the v3 rules, anything that spends from a v3 transaction also needs to be v3?

@morehouse
Contributor

I thought that as part of the v3 rules, anything that spends from a v3 transaction also needs to be v3?

That's the case only while the parent is unconfirmed. In this attack, Mallory's preimage claim is spending the confirmed commitment transaction, so it is not required to be v3.
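
A minimal sketch of that policy point (simplified; the real TRUC rules in Bitcoin Core also include size and topology limits):

```python
def version_restriction_ok(child_version: int, parent_version: int,
                           parent_confirmed: bool) -> bool:
    """Simplified TRUC check: v3/non-v3 mixing is only rejected between a
    transaction and its *unconfirmed* ancestors."""
    if parent_confirmed:
        return True           # confirmed parents impose no version restriction
    if parent_version == 3 or child_version == 3:
        return parent_version == 3 and child_version == 3
    return True

# Mallory's case: the v3 commitment is already confirmed, so her v2 preimage
# claim is perfectly acceptable to mempools.
assert version_restriction_ok(child_version=2, parent_version=3,
                              parent_confirmed=True)
```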

@Roasbeef
Collaborator

Mallory's preimage claim is spending the confirmed commitment transaction, so it is not required to be v3.

Ahhh, that makes sense. Enforcing that would require an actual consensus change.

@t-bast
Collaborator Author

t-bast commented Jan 31, 2025

This is quite annoying, because adding pre-signed transactions for claiming HTLCs from the remote commit requires:

  • changing the channel state machine to revert the signing flow and add half a round-trip, which is significant work
    • this work will be required for PTLCs, but until we actually implement PTLCs it's going to be hard to know if the changes we do now will match what we eventually use for PTLCs (the design space for PTLCs is still quite large and hasn't been explored deeply enough yet to provide a final spec)
    • we wanted to implement option_simple_close before changing anything in the channel state machine
    • reverting the signing flow most likely impacts a lot of unexpected parts of the protocol (e.g. channel_reestablish may need to change in non-trivial ways)
  • this means 0-fee commitments will be using very different code from current commitments, and may also end up being very different from commitments that support PTLCs, creating more technical debt and implementation risks

I would very much prefer a 0-fee commitment type that has minimal changes compared to existing anchor output channels and "simply" switches to v3 with an ephemeral anchor. But on the other hand, it isn't very satisfying that HTLCs can still be pinned until we migrate to PTLCs...


@instagibbs
Contributor

fwiw I was unable to come up with anything nicer myself when I was mulling this over a couple of years ago. This is where ANYPREVOUT (or similar) would make things trivial.

@t-bast t-bast unpinned this issue Feb 4, 2025
@t-bast t-bast closed this as completed Feb 4, 2025