Reduce historical bucket size from 16 bits to 13 bits
Using the results of probes sent once a minute to a random node in
the network, for a random amount (within a reasonable range), we
were able to analyze the accuracy of our resulting success
probability estimation with various PDFs across the historical and
live-bounds models.

For each candidate PDF (as well as other parameters, including the
histogram bucket weight), we used the
`min_zero_implies_no_successes` fudge factor in
`success_probability` as well as a total probability multiple fudge
factor to get both the historical success model and the a priori
model to be neither too optimistic nor too pessimistic (as measured
by the relative log-loss between succeeding and failing hops in our
sample data).
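
For illustration, a minimal sketch of the log-loss measurement described above (a hypothetical helper, not code from this commit), assuming per-hop records of our estimated success probability and the observed outcome:

    /// Average binary log-loss over (estimated success probability, hop
    /// succeeded) observations. An over-optimistic model is punished on
    /// failing hops and an over-pessimistic one on succeeding hops, so
    /// comparing the loss on the two subsets measures the balance.
    fn avg_log_loss(observations: &[(f64, bool)]) -> f64 {
        let total: f64 = observations.iter().map(|&(prob, succeeded)| {
            // Clamp to avoid taking ln(0) on extreme probability estimates.
            let p = prob.clamp(1e-9, 1.0 - 1e-9);
            if succeeded { -p.ln() } else { -(1.0 - p).ln() }
        }).sum();
        total / observations.len() as f64
    }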

We then compared the resulting log-loss for the historical success
model and selected the candidate PDF with the lowest log-loss,
skipping a few candidates with similar resulting log-loss but with
more extreme constants (such as a power of 11 with a higher
`min_zero_implies_no_successes` penalty).

Somewhat surprisingly (to me at least), the (fairly strongly)
preferred model was one where the bucket weights in the historical
histograms are exponentiated. In the current design, the weights
are effectively squared, as we multiply the minimum- and
maximum-histogram buckets together before summing the
weight*probability terms.
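
That current weighting looks roughly like the following (a simplified sketch with hypothetical names, not the actual scorer code in `mod bucketed_history`):

    // Min-liquidity buckets count from the bottom of the channel and
    // max-liquidity buckets from the top, so only pairs with
    // min_idx + max_idx < 32 describe a consistent liquidity range.
    // Multiplying the two bucket weights together effectively squares the
    // per-bucket weight before it scales the pair's success probability.
    fn estimated_success_probability(
        min_buckets: &[u16; 32], max_buckets: &[u16; 32],
        prob_for_pair: impl Fn(usize, usize) -> f64,
    ) -> f64 {
        let (mut weighted_prob, mut total_weight) = (0.0f64, 0.0f64);
        for (min_idx, min_w) in min_buckets.iter().enumerate() {
            for (max_idx, max_w) in max_buckets.iter().enumerate().take(32 - min_idx) {
                let weight = (*min_w as f64) * (*max_w as f64);
                total_weight += weight;
                weighted_prob += weight * prob_for_pair(min_idx, max_idx);
            }
        }
        if total_weight > 0.0 { weighted_prob / total_weight } else { 0.0 }
    }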

In the next commit, we'll square the weights yet again before the
addition. However, as we do so we quickly run low on bits in our
fixed-point arithmetic: we have 16-bit buckets, which when raised
to the 4th power can fully fill a 64-bit int. Additionally, when
looking at the 0th min-bucket we occasionally add up to 32 weights
together before multiplying by the probability, requiring an
additional five bits.
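
A quick bit-budget check of those numbers (my arithmetic, not part of the commit):

    // Raising a B-bit weight to the 4th power needs 4*B bits; summing up to
    // 32 such terms for the 0th min-bucket needs 5 more. 16-bit buckets
    // would need 69 bits (overflowing a u64), while 13-bit buckets need 57.
    const fn bits_needed(bucket_bits: u64) -> u64 { 4 * bucket_bits + 5 }
    const _: () = assert!(bits_needed(16) > 64);
    const _: () = assert!(bits_needed(13) <= 64);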

Here we prepare for this by storing a bit fewer states in each
historical histogram bucket - 13 bits instead of 16, with a 4-bit
fixed-point scheme instead of a 5-bit one. This reduces the number
of payment attempts we can realistically track, but in exchange
allows us to square the bucket weights once more in the next commit
without overflowing 64-bit arithmetic.
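
Two sanity checks on the new scheme (mine, not code from the diff): the decay equilibrium named in the updated comments, and the divide-by-8 conversion the deserialization path applies to old 16-bit buckets:

    fn main() {
        // At 8,176, adding 16 (1.0) and decaying by 511/512 is a fixed point.
        let equilibrium: u32 = 8_176;
        assert_eq!((equilibrium + 16) * 511 / 512, equilibrium);
        // An old bucket holding one full datapoint (32 = 1.0 with a 5-bit
        // fractional part) becomes 4 (0.25 with a 4-bit fractional part);
        // dividing every bucket by 8 preserves their relative proportions.
        assert_eq!(32u16 / 8, 4);
    }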
TheBlueMatt committed Dec 18, 2024
1 parent ea5180c commit 9c56d3f
Showing 1 changed file with 41 additions and 22 deletions.
63 changes: 41 additions & 22 deletions lightning/src/routing/scoring.rs
@@ -937,9 +937,9 @@ impl<G: Deref<Target = NetworkGraph<L>>, L: Deref> ProbabilisticScorer<G, L> whe
/// is calculated by dividing that bucket's value with the total value of all buckets.
///
/// For example, using a lower bucket count for illustrative purposes, a value of
-/// `[0, 0, 0, ..., 0, 32]` indicates that we believe the probability of a bound being very
+/// `[0, 0, 0, ..., 0, 16]` indicates that we believe the probability of a bound being very
/// close to the channel's capacity to be 100%, and have never (recently) seen it in any other
-/// bucket. A value of `[31, 0, 0, ..., 0, 0, 32]` indicates we've seen the bound being both
+/// bucket. A value of `[15, 0, 0, ..., 0, 0, 16]` indicates we've seen the bound being both
/// in the top and bottom bucket, and roughly with similar (recent) frequency.
///
/// Because the datapoints are decayed slowly over time, values will eventually return to
@@ -1709,43 +1709,52 @@ mod bucketed_history {
buckets: [u16; 32],
}

-/// Buckets are stored in fixed point numbers with a 5 bit fractional part. Thus, the value
-/// "one" is 32, or this constant.
-pub const BUCKET_FIXED_POINT_ONE: u16 = 32;
+/// Buckets are stored in fixed point numbers with a 4 bit fractional part. Thus, the value
+/// "one" is 16, or this constant.
+pub const BUCKET_FIXED_POINT_ONE: u16 = 16;

impl HistoricalBucketRangeTracker {
pub(super) fn new() -> Self { Self { buckets: [0; 32] } }
fn track_datapoint(&mut self, liquidity_offset_msat: u64, capacity_msat: u64) {
// We have 32 leaky buckets for min and max liquidity. Each bucket tracks the amount of time
-// we spend in each bucket as a 16-bit fixed-point number with a 5 bit fractional part.
+// we spend in each bucket as a 13-bit fixed-point number with a 4 bit fractional part.
//
-// Each time we update our liquidity estimate, we add 32 (1.0 in our fixed-point system) to
+// Each time we update our liquidity estimate, we add 16 (1.0 in our fixed-point system) to
// the buckets for the current min and max liquidity offset positions.
//
-// We then decay each bucket by multiplying by 2047/2048 (avoiding dividing by a
-// non-power-of-two). This ensures we can't actually overflow the u16 - when we get to
-// 63,457 adding 32 and decaying by 2047/2048 leaves us back at 63,457.
+// We then decay each bucket by multiplying by 511/512 (avoiding dividing by a
+// non-power-of-two). This ensures we can't actually overflow the u13 - when we get to
+// 8,176 adding 16 and decaying by 511/512 leaves us back at 8,176.
//
-// In total, this allows us to track data for the last 8,000 or so payments across a given
-// channel.
+// In total, this allows us to track data for the last 1,000 or so payments across a given
+// channel per bucket.
//
-// These constants are a balance - we try to fit in 2 bytes per bucket to reduce overhead,
-// and need to balance having more bits in the decimal part (to ensure decay isn't too
-// non-linear) with having too few bits in the mantissa, causing us to not store very many
-// datapoints.
+// These constants are a balance - we try to fit in 2 bytes per bucket to reduce
+// overhead and must fit in 13 bits to allow us to square bucket weights without
+// overflowing into a 128-bit integer to track total points. We also need to balance
+// having more bits in the decimal part (to ensure decay isn't too non-linear) with
+// having too few bits in the mantissa, causing us to not store very many datapoints.
//
// The constants were picked experimentally, selecting a decay amount that restricts us
// from overflowing buckets without having to cap them manually.

let pos: u16 = amount_to_pos(liquidity_offset_msat, capacity_msat);
if pos < POSITION_TICKS {
for e in self.buckets.iter_mut() {
-*e = ((*e as u32) * 2047 / 2048) as u16;
+*e = ((*e as u32) * 511 / 512) as u16;
+debug_assert!(*e < (1 << 13));
}
let bucket = pos_to_bucket(pos);
self.buckets[bucket] = self.buckets[bucket].saturating_add(BUCKET_FIXED_POINT_ONE);
}
}

+pub(crate) fn normalize_from_sixteen_bits(mut self) -> Self {
+for e in self.buckets.iter_mut() {
+*e /= 8;
+}
+self
+}
}

impl_writeable_tlv_based!(HistoricalBucketRangeTracker, { (0, buckets, required) });
@@ -1908,7 +1917,7 @@ mod bucketed_history {
assert_eq!(total_valid_points_tracked, actual_valid_points_tracked as f64);
}

-// If the total valid points is smaller than 1.0 (i.e. 32 in our fixed-point scheme),
+// If the total valid points is smaller than 1.0 (i.e. 16 in our fixed-point scheme),
// treat it as if we were fully decayed.
const FULLY_DECAYED: f64 = BUCKET_FIXED_POINT_ONE as f64 * BUCKET_FIXED_POINT_ONE as f64 *
BUCKET_FIXED_POINT_ONE as f64 * BUCKET_FIXED_POINT_ONE as f64;
@@ -2024,9 +2033,11 @@ impl Writeable for ChannelLiquidity {
(2, self.max_liquidity_offset_msat, required),
// 3 was the max_liquidity_offset_history in octile form
(4, self.last_updated, required),
-(5, self.liquidity_history.writeable_min_offset_history(), required),
-(7, self.liquidity_history.writeable_max_offset_history(), required),
+// 5 was used for min_liquidity_history when the buckets were 16 bits rather than 13
+// 7 was used for max_liquidity_history when the buckets were 16 bits rather than 13
(9, self.offset_history_last_updated, required),
+(11, self.liquidity_history.writeable_min_offset_history(), required),
+(13, self.liquidity_history.writeable_max_offset_history(), required),
});
Ok(())
}
@@ -2039,6 +2050,8 @@ impl Readable for ChannelLiquidity {
let mut max_liquidity_offset_msat = 0;
let mut legacy_min_liq_offset_history: Option<LegacyHistoricalBucketRangeTracker> = None;
let mut legacy_max_liq_offset_history: Option<LegacyHistoricalBucketRangeTracker> = None;
+let mut sixteen_bit_min_liq_offset_history: Option<HistoricalBucketRangeTracker> = None;
+let mut sixteen_bit_max_liq_offset_history: Option<HistoricalBucketRangeTracker> = None;
let mut min_liquidity_offset_history: Option<HistoricalBucketRangeTracker> = None;
let mut max_liquidity_offset_history: Option<HistoricalBucketRangeTracker> = None;
let mut last_updated = Duration::from_secs(0);
@@ -2049,21 +2062,27 @@
(2, max_liquidity_offset_msat, required),
(3, legacy_max_liq_offset_history, option),
(4, last_updated, required),
-(5, min_liquidity_offset_history, option),
-(7, max_liquidity_offset_history, option),
+(5, sixteen_bit_min_liq_offset_history, option),
+(7, sixteen_bit_max_liq_offset_history, option),
(9, offset_history_last_updated, option),
+(11, min_liquidity_offset_history, option),
+(13, max_liquidity_offset_history, option),
});

if min_liquidity_offset_history.is_none() {
if let Some(legacy_buckets) = legacy_min_liq_offset_history {
min_liquidity_offset_history = Some(legacy_buckets.into_current());
+} else if let Some(sixteen_bit_buckets) = sixteen_bit_min_liq_offset_history {
+min_liquidity_offset_history = Some(sixteen_bit_buckets.normalize_from_sixteen_bits());
} else {
min_liquidity_offset_history = Some(HistoricalBucketRangeTracker::new());
}
}
if max_liquidity_offset_history.is_none() {
if let Some(legacy_buckets) = legacy_max_liq_offset_history {
max_liquidity_offset_history = Some(legacy_buckets.into_current());
+} else if let Some(sixteen_bit_buckets) = sixteen_bit_max_liq_offset_history {
+max_liquidity_offset_history = Some(sixteen_bit_buckets.normalize_from_sixteen_bits());
} else {
max_liquidity_offset_history = Some(HistoricalBucketRangeTracker::new());
}
