Router Optimizations #1799
Oh, would also be cool to add a fuzzer that demonstrates equivalence between the custom map and a `BTreeMap`.
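A minimal sketch of what such an equivalence check could look like, not the fuzzer that was eventually added to the PR. The `SketchMap` type, the `maps_agree` function, and the three-byte opcode encoding are all hypothetical names chosen for illustration; the idea is only that every operation is applied to both a `BTreeMap` and the candidate map, and the results and sorted contents are compared after each step.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap};

/// Hypothetical stand-in for the custom map under test: values live in a
/// `HashMap`, while a `BTreeSet` tracks the keys in sorted order.
#[derive(Default)]
struct SketchMap {
    map: HashMap<u8, u8>,
    keys: BTreeSet<u8>,
}

impl SketchMap {
    fn insert(&mut self, k: u8, v: u8) -> Option<u8> {
        self.keys.insert(k);
        self.map.insert(k, v)
    }

    fn remove(&mut self, k: &u8) -> Option<u8> {
        self.keys.remove(k);
        self.map.remove(k)
    }

    fn get(&self, k: &u8) -> Option<&u8> {
        self.map.get(k)
    }

    /// Entries in key order, for comparison against a `BTreeMap`.
    fn sorted(&self) -> Vec<(u8, u8)> {
        self.keys.iter().map(|k| (*k, self.map[k])).collect()
    }
}

/// Interprets `data` as a stream of (opcode, key, value) triples, applies each
/// operation to both maps, and returns `false` at the first divergence.
fn maps_agree(data: &[u8]) -> bool {
    let mut reference: BTreeMap<u8, u8> = BTreeMap::new();
    let mut candidate = SketchMap::default();
    for op in data.chunks_exact(3) {
        let (k, v) = (op[1], op[2]);
        // The opcode encoding (0 = insert, 1 = remove, 2 = get) is an assumption.
        let same = match op[0] % 3 {
            0 => reference.insert(k, v) == candidate.insert(k, v),
            1 => reference.remove(&k) == candidate.remove(&k),
            _ => reference.get(&k) == candidate.get(&k),
        };
        let expected: Vec<(u8, u8)> = reference.iter().map(|(k, v)| (*k, *v)).collect();
        if !same || expected != candidate.sorted() {
            return false;
        }
    }
    true
}
```

A fuzzing harness would feed arbitrary byte strings into `maps_agree` and treat a `false` return as a failure.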
The previous copy was more than one and a half years old, and the lightning network has changed a lot since! As of this commit, performance on my Xeon W-10885M with a SK hynix Gold P31 storing a BTRFS volume is as follows:
```
test ln::channelmanager::bench::bench_sends ... bench: 5,896,492 ns/iter (+/- 512,421)
test routing::gossip::benches::read_network_graph ... bench: 1,645,740,604 ns/iter (+/- 47,611,514)
test routing::gossip::benches::write_network_graph ... bench: 234,870,775 ns/iter (+/- 8,301,775)
test routing::router::benches::generate_mpp_routes_with_probabilistic_scorer ... bench: 166,155,032 ns/iter (+/- 30,206,162)
test routing::router::benches::generate_mpp_routes_with_zero_penalty_scorer ... bench: 136,843,661 ns/iter (+/- 67,111,218)
test routing::router::benches::generate_routes_with_probabilistic_scorer ... bench: 52,954,598 ns/iter (+/- 11,360,547)
test routing::router::benches::generate_routes_with_zero_penalty_scorer ... bench: 37,598,126 ns/iter (+/- 17,262,519)
test bench::bench_sends ... bench: 37,760,922 ns/iter (+/- 5,179,123)
test bench::bench_reading_full_graph_from_file ... bench: 25,615 ns/iter (+/- 1,149)
```
Historically we've had various bugs in keeping the `lowest_inbound_channel_fees` field in `NodeInfo` up-to-date as we go. This leaves the A* routing less efficient as it can't prune hops as aggressively.

In order to get accurate benchmarks, this commit updates the minimum-inbound-fees field on load. This is not the most efficient way of doing so, but suffices for fetching benchmarks and will be removed in the coming commits.

Note that this is *slower* than the non-updating version in the previous commit. While I haven't dug into this incredibly deeply, the graph snapshot in use has min-fee info for only 9,618 of 20,818 nodes. Thus, it is my guess that with the graph snapshot as-is the branch predictor is able to largely remove the A* heuristic lookups, but with this change it is forced to wait for A* heuristic map lookups to complete, causing a performance regression.
```
test routing::router::benches::generate_mpp_routes_with_probabilistic_scorer ... bench: 182,980,059 ns/iter (+/- 32,662,047)
test routing::router::benches::generate_mpp_routes_with_zero_penalty_scorer ... bench: 151,170,457 ns/iter (+/- 75,351,011)
test routing::router::benches::generate_routes_with_probabilistic_scorer ... bench: 58,187,277 ns/iter (+/- 11,606,440)
test routing::router::benches::generate_routes_with_zero_penalty_scorer ... bench: 41,210,193 ns/iter (+/- 18,103,320)
```
87bc732
to
e572ae7
Compare
Codecov Report
Base: 90.71% // Head: 90.77% // Increases project coverage by +0.06%
Additional details and impacted files
@@ Coverage Diff @@
## main #1799 +/- ##
==========================================
+ Coverage 90.71% 90.77% +0.06%
==========================================
Files 97 99 +2
Lines 50677 51701 +1024
Branches 50677 51701 +1024
==========================================
+ Hits 45971 46933 +962
- Misses 4706 4768 +62
8682115
to
20bedad
Compare
Cleaned up the commit messages, added benchmarks that demonstrate the performance advantages, and added a fuzzer that should catch any issues in the new map implementation.
c1efa29
to
3f91255
Compare
3f91255
to
f77ad1b
Compare
}

/// Returns an iterator which iterates over the `key`/`value` pairs in a random order.
pub fn unordered_iter(&self) -> impl Iterator<Item = (&K, &V)> {
what's the benefit of a random order? I thought the entire point of this data structure was that the iteration order be deterministic? Should the doc comment be updated to reflect that this actually returns a deterministically sorted list?
If you don't need the ordering it's much more efficient. Most uses don't care about the order.
But won't it technically always return an ordered version? Considering that, at least in this commit, the underlying structure is a BTreeMap?
Not in the next commit :)
And that's… a good thing?
I'm not sure I get your question? The first commit changes callsites to make explicit whether they're relying on getting things in-order or not, the second commit actually changes the backing data structure. Just because something happens to be in-order doesn't mean a caller can rely on it if the API contract clearly indicates they can't.
@@ -18,15 +20,18 @@ use core::ops::RangeBounds;
/// actually backed by a `HashMap`, with some additional tracking to ensure we can iterate over
/// keys in the order defined by [`Ord`].
#[derive(Clone, PartialEq, Eq)]
pub struct IndexedMap<K: Ord, V> {
map: BTreeMap<K, V>,
pub struct IndexedMap<K: Hash + Ord, V> {
should this be a separate commit?
I think it might make more sense to introduce IndexedMap as the desired type from the beginning.
It breaks it up a bit to be a tiny bit easier to review? Makes the commit that adds the new data structure implementation a freestanding commit.
use crate::utils::test_logger;

// Note that while we take the trees by &mut here
fn check_eq(btree: &BTreeMap<u8, u8>, indexed: &IndexedMap<u8, u8>) {
this looks super useful. You may wanna add that to an IndexedMap test_util perhaps?
There is no `IndexedMap` test_util? Are you suggesting/requesting additional tests?
Nice! Saw similar improvements on my hardware.
f77ad1b
to
158a3f1
Compare
LGTM, feel free to squash
As evidenced by the previous commit, it appears our A* router does worse than a more naive approach. This isn't super surprising, as the A* heuristic calculation requires a map lookup, which is relatively expensive.
```
test routing::router::benches::generate_mpp_routes_with_probabilistic_scorer ... bench: 169,991,943 ns/iter (+/- 30,838,048)
test routing::router::benches::generate_mpp_routes_with_zero_penalty_scorer ... bench: 122,144,987 ns/iter (+/- 61,708,911)
test routing::router::benches::generate_routes_with_probabilistic_scorer ... bench: 48,546,068 ns/iter (+/- 10,379,642)
test routing::router::benches::generate_routes_with_zero_penalty_scorer ... bench: 32,898,557 ns/iter (+/- 14,157,641)
```
Basically LGTM.
Some questions/suggestions, feel free to squash if you decide not to tackle them.
592d3ee
to
78ac11e
Compare
Squashed, updated the docs trivially, and added a commit to clean up a few more things in the router:
2173280
to
af8510f
Compare
Rewrote the last commit to do more like what @tnull suggested, taking advantage of the CMOVs that
LGTM from my side.
a1e465f
to
dca4b77
Compare
Our network graph has to be iterable in a deterministic order and with the ability to iterate over a specific range. Thus, historically, we've used a `BTreeMap` to do the iteration. This is fine, except our map needs to also provide high performance lookups in order to make route-finding fast. Sadly, `BTreeMap`s are quite slow due to the branching penalty. Here we replace the `BTreeMap`s in the scorer with a dummy wrapper. In the next commit the internals thereof will be replaced with a `HashMap`-based implementation.
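As a rough illustration of the "dummy wrapper" step, the sketch below forwards every call to a `BTreeMap` while already exposing the two distinct iteration contracts discussed in review. Only the names `IndexedMap` and `unordered_iter` come from the diff shown in this PR; the other methods and their signatures are assumptions made for the example and may differ from the real code.

```rust
use std::collections::BTreeMap;
use std::ops::RangeBounds;

/// Thin wrapper which, at this stage, simply forwards to a `BTreeMap`.
pub struct IndexedMap<K: Ord, V> {
    map: BTreeMap<K, V>,
}

impl<K: Ord, V> IndexedMap<K, V> {
    pub fn new() -> Self {
        Self { map: BTreeMap::new() }
    }

    pub fn insert(&mut self, key: K, value: V) -> Option<V> {
        self.map.insert(key, value)
    }

    pub fn get(&self, key: &K) -> Option<&V> {
        self.map.get(key)
    }

    /// Contract: no ordering is promised, even though this backing store
    /// happens to yield entries in sorted order.
    pub fn unordered_iter(&self) -> impl Iterator<Item = (&K, &V)> {
        self.map.iter()
    }

    /// Contract: the entries whose keys fall in `range`, in `Ord` order.
    /// (This signature is an assumption for the sketch.)
    pub fn range<R: RangeBounds<K>>(&self, range: R) -> impl Iterator<Item = (&K, &V)> {
        self.map.range::<K, R>(range)
    }
}
```

Callers that only need to visit every entry use `unordered_iter`; callers that depend on ordering say so by calling `range`, which is what lets the backing store change later without touching callsites.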
Our network graph has to be iterable in a deterministic order and with the ability to iterate over a specific range. Thus, historically, we've used a `BTreeMap` to do the iteration. This is fine, except our map needs to also provide high performance lookups in order to make route-finding fast. Sadly, `BTreeMap`s are quite slow due to the branching penalty.

Here we replace the implementation of our `IndexedMap` with a `HashMap` to store the elements itself and a `BTreeSet` to store the keys set in sorted order for iteration.

As of this commit on the same hardware as the above few commits, the benchmark results are:
```
test routing::router::benches::generate_mpp_routes_with_probabilistic_scorer ... bench: 109,544,993 ns/iter (+/- 27,553,574)
test routing::router::benches::generate_mpp_routes_with_zero_penalty_scorer ... bench: 81,164,590 ns/iter (+/- 55,422,930)
test routing::router::benches::generate_routes_with_probabilistic_scorer ... bench: 34,726,569 ns/iter (+/- 9,646,345)
test routing::router::benches::generate_routes_with_zero_penalty_scorer ... bench: 22,772,355 ns/iter (+/- 9,574,418)
```
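The layout described in this commit message can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code from the PR: the `K: Clone` bound (each key is stored twice), the method set, and the `range` signature are choices made for the example.

```rust
use std::collections::{BTreeSet, HashMap};
use std::hash::Hash;
use std::ops::RangeBounds;

/// Values live in a `HashMap` for fast lookups; a `BTreeSet` of the same keys
/// provides deterministic, range-capable iteration.
pub struct IndexedMap<K: Hash + Ord, V> {
    map: HashMap<K, V>,
    keys: BTreeSet<K>,
}

impl<K: Clone + Hash + Ord, V> IndexedMap<K, V> {
    pub fn new() -> Self {
        Self { map: HashMap::new(), keys: BTreeSet::new() }
    }

    pub fn insert(&mut self, key: K, value: V) -> Option<V> {
        let old = self.map.insert(key.clone(), value);
        if old.is_none() {
            // Only touch the sorted index when the key is actually new.
            self.keys.insert(key);
        }
        old
    }

    pub fn remove(&mut self, key: &K) -> Option<V> {
        let old = self.map.remove(key);
        if old.is_some() {
            self.keys.remove(key);
        }
        old
    }

    /// Lookups never consult the sorted index.
    pub fn get(&self, key: &K) -> Option<&V> {
        self.map.get(key)
    }

    /// Iterates in whatever order the `HashMap` yields.
    pub fn unordered_iter(&self) -> impl Iterator<Item = (&K, &V)> {
        self.map.iter()
    }

    /// Walks the sorted keys in `range` and fetches each value from the map.
    pub fn range<R: RangeBounds<K>>(&self, range: R) -> impl Iterator<Item = (&K, &V)> {
        let map = &self.map;
        self.keys
            .range::<K, R>(range)
            .map(move |k| (k, map.get(k).expect("keys and map are kept in sync")))
    }
}
```

The trade-off in this sketch is that inserts and removals update two structures, while `get` pays only the hash lookup.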
Often when we call `compute_fees` we really just want it to saturate and we deal with `u64::max_value` later. In that case, we're much better off doing the saturating in the `compute_fees` as it can use CMOVs rather than branching at each step and then `unwrap_or`ing at the callsite.
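To make the contrast concrete, here is a small sketch of a checked versus a saturating fee calculation. The `Fees` struct and both function names are hypothetical stand-ins, and the fee formula (base fee plus a proportional part in millionths) is an assumption for the example. Whether the compiler actually emits CMOVs for the saturating form depends on the target and optimization level.

```rust
/// Hypothetical stand-in for a channel's fee parameters.
#[derive(Clone, Copy)]
struct Fees {
    base_msat: u32,
    proportional_millionths: u32,
}

/// Checked variant: returns `None` on overflow, leaving the caller to
/// `unwrap_or` a sentinel value at each callsite.
fn compute_fees_checked(amount_msat: u64, fees: Fees) -> Option<u64> {
    amount_msat
        .checked_mul(fees.proportional_millionths as u64)
        .and_then(|part| (fees.base_msat as u64).checked_add(part / 1_000_000))
}

/// Saturating variant: an overflowing multiplication collapses to `u64::MAX`
/// inside the function, and the addition saturates rather than failing.
fn compute_fees_saturating(amount_msat: u64, fees: Fees) -> u64 {
    amount_msat
        .checked_mul(fees.proportional_millionths as u64)
        .map(|part| part / 1_000_000)
        .unwrap_or(u64::MAX)
        .saturating_add(fees.base_msat as u64)
}
```

For in-range inputs the two agree; on overflow the saturating form returns `u64::MAX` directly, which is the value such callers would otherwise substitute themselves.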
dca4b77
to
bde841e
Compare
Fixed the doc comment in an intermediary commit without changing the full diff.
0.0.114 - Mar 3, 2023 - "Faster Async BOLT12 Retries"

API Updates
===========
* `InvoicePayer` has been removed and its features moved directly into `ChannelManager`. As such it now requires a simplified `Router` and supports `send_payment_with_retry` (and friends). `ChannelManager::retry_payment` was removed in favor of the automated retries. Invoice payment utilities in `lightning-invoice` now call the new code (lightningdevkit#1812, lightningdevkit#1916, lightningdevkit#1929, lightningdevkit#2007, etc).
* `Sign`/`BaseSign` has been renamed `ChannelSigner`, with `EcdsaChannelSigner` split out in anticipation of future schnorr/taproot support (lightningdevkit#1967).
* The catch-all `KeysInterface` was split into `EntropySource`, `NodeSigner`, and `SignerProvider`. `KeysManager` implements all three (lightningdevkit#1910, lightningdevkit#1930).
* `KeysInterface::get_node_secret` is now `KeysManager::get_node_secret_key` and is no longer required for external signers (lightningdevkit#1951, lightningdevkit#2070).
* A `lightning-transaction-sync` crate has been added which implements keeping LDK in sync with the chain via an esplora server (lightningdevkit#1870). Note that it can only be used on nodes that *never* ran a previous version of LDK.
* `Score` is updated in `BackgroundProcessor` instead of via `Router` (lightningdevkit#1996).
* `ChainAccess::get_utxo` (now `UtxoAccess`) can now be resolved async (lightningdevkit#1980).
* BOLT12 `Offer`, `InvoiceRequest`, `Invoice` and `Refund` structs as well as associated builders have been added. Such invoices cannot yet be paid due to missing support for blinded path payments (lightningdevkit#1927, lightningdevkit#1908, lightningdevkit#1926).
* A `lightning-custom-message` crate has been added to make combining multiple custom messages into one enum/handler easier (lightningdevkit#1832).
* `Event::PaymentPathFailure` is now generated for failure to send an HTLC over the first hop on our local channel (lightningdevkit#2014, lightningdevkit#2043).
* `lightning-net-tokio` no longer requires an `Arc` on `PeerManager` (lightningdevkit#1968).
* `ChannelManager::list_recent_payments` was added (lightningdevkit#1873).
* `lightning-background-processor` `std` is now optional in async mode (lightningdevkit#1962).
* `create_phantom_invoice` can now be used in `no-std` (lightningdevkit#1985).
* The required final CLTV delta on inbound payments is now configurable (lightningdevkit#1878)
* bitcoind RPC error code and message are now surfaced in `block-sync` (lightningdevkit#2057).
* Get `historical_estimated_channel_liquidity_probabilities` was added (lightningdevkit#1961).
* `ChannelManager::fail_htlc_backwards_with_reason` was added (lightningdevkit#1948).
* Macros which implement serialization using TLVs or straight writing of struct fields are now public (lightningdevkit#1823, lightningdevkit#1976, lightningdevkit#1977).

Backwards Compatibility
=======================
* Any inbound payments with a custom final CLTV delta will be rejected by LDK if you downgrade prior to receipt (lightningdevkit#1878).
* `Event::PaymentPathFailed::network_update` will always be `None` if an 0.0.114-generated event is read by a prior version of LDK (lightningdevkit#2043).
* `Event::PaymentPathFailed::all_paths_removed` will always be false if an 0.0.114-generated event is read by a prior version of LDK. Users who rely on it to determine payment retries should migrate to `Event::PaymentFailed`, in a separate release prior to upgrading to LDK 0.0.114 if downgrading is supported (lightningdevkit#2043).

Performance Improvements
========================
* Channel data is now stored per-peer and channel updates across multiple peers can be operated on simultaneously (lightningdevkit#1507).
* Routefinding is roughly 1.5x faster (lightningdevkit#1799).
* Deserializing a `NetworkGraph` is roughly 6x faster (lightningdevkit#2016).
* Memory usage for a `NetworkGraph` has been reduced substantially (lightningdevkit#2040).
* `KeysInterface::get_secure_random_bytes` is roughly 200x faster (lightningdevkit#1974).

Bug Fixes
=========
* Fixed a bug where a delay in processing a `PaymentSent` event longer than the time taken to persist a `ChannelMonitor` update, when occurring immediately prior to a crash, may result in the `PaymentSent` event being lost (lightningdevkit#2048).
* Fixed spurious rejections of rapid gossip sync data when the graph has been updated by other means between gossip syncs (lightningdevkit#2046).
* Fixed a panic in `KeysManager` when the high bit of `starting_time_nanos` is set (lightningdevkit#1935).
* Resolved an issue where the `ChannelManager::get_persistable_update_future` future would fail to wake until a second notification occurs (lightningdevkit#2064).
* Resolved a memory leak when using `ChannelManager::send_probe` (lightningdevkit#2037).
* Fixed a deadlock on some platforms at least when using async `ChannelMonitor` updating (lightningdevkit#2006).
* Removed debug-only assertions which were reachable in threaded code (lightningdevkit#1964).
* In some cases when payment sending fails on our local channel retries no longer take the same path and thus never succeed (lightningdevkit#2014).
* Retries for spontaneous payments have been fixed (lightningdevkit#2002).
* Return an `Err` if `lightning-persister` fails to read the directory listing rather than panicing (lightningdevkit#1943).
* `peer_disconnected` will now never be called without `peer_connected` (lightningdevkit#2035)

Security
========
0.0.114 fixes several denial-of-service vulnerabilities which are reachable from untrusted input from channel counterparties or in deployments accepting inbound connections or channels. It also fixes a denial-of-service vulnerability in rare cases in the route finding logic.
* The number of pending un-funded channels as well as peers without funded channels is now limited to avoid denial of service (lightningdevkit#1988).
* A second `channel_ready` message received immediately after the first could lead to a spurious panic (lightningdevkit#2071). This issue was introduced with 0conf support in LDK 0.0.107.
* A division-by-zero issue was fixed in the `ProbabilisticScorer` if the amount being sent (including previous-hop fees) is equal to a channel's capacity while walking the graph (lightningdevkit#2072). The division-by-zero was introduced with historical data tracking in LDK 0.0.112.

In total, this release features 130 files changed, 21457 insertions, 10113 deletions in 343 commits from 18 authors, in alphabetical order:
* Alec Chen
* Allan Douglas R. de Oliveira
* Andrei
* Arik Sosman
* Daniel Granhão
* Duncan Dean
* Elias Rohrer
* Jeffrey Czyz
* John Cantrell
* Kurtsley
* Matt Corallo
* Max Fang
* Omer Yacine
* Valentine Wallace
* Viktor Tigerström
* Wilmer Paulino
* benthecarman
* jurvis
After discussion in #1722 we realized the A* stuff these days is entirely useless (thanks ZFR!), so it's best to remove it. While we're at it, this also swaps our stupid BTree lookups for a HashMap, but keeps a sorted keys list for outbound gossip sync.
This gets us halfway to #1473, with a TODO to investigate swapping the BTreeSet for a sorted vec, which I have a strong feeling will be faster (and way more space-efficient!).
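For reference, a sorted-vec key index along the lines of that TODO might look like the sketch below. Everything here (`SortedKeys`, `keys_from`, the method set) is hypothetical and untested against the real graph; it only shows the mechanics, with binary search for placement and a slice for range iteration. No performance claim is implied: insertion into the middle of a `Vec` shifts elements and is O(n), which may or may not matter for this workload.

```rust
/// Hypothetical sorted-key index backed by a `Vec` rather than a `BTreeSet`.
struct SortedKeys<K: Ord> {
    keys: Vec<K>,
}

impl<K: Ord> SortedKeys<K> {
    fn new() -> Self {
        Self { keys: Vec::new() }
    }

    /// Inserts `key` at its sorted position; returns `false` if already present.
    fn insert(&mut self, key: K) -> bool {
        match self.keys.binary_search(&key) {
            Ok(_) => false,
            Err(pos) => {
                self.keys.insert(pos, key);
                true
            }
        }
    }

    /// Removes `key`; returns `false` if it was not present.
    fn remove(&mut self, key: &K) -> bool {
        match self.keys.binary_search(key) {
            Ok(pos) => {
                self.keys.remove(pos);
                true
            }
            Err(_) => false,
        }
    }

    /// All keys `>= start`, in sorted order, as a contiguous slice.
    fn keys_from(&self, start: &K) -> &[K] {
        let idx = self.keys.partition_point(|k| k < start);
        &self.keys[idx..]
    }
}
```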
This is totally up-for-grabs - it needs documentation, real commit messages, benchmarks, etc. If no one else does it I'll pick it up eventually but this should be a nice improvement. Supersedes #1722.