Skip to content

Conversation

@mostronator
Copy link
Collaborator

@mostronator mostronator commented Feb 11, 2026

Problem

When a dev fee Lightning payment times out (50s), the code blindly resets dev_fee_paid = false and clears dev_fee_payment_hash. However, a timeout does not mean the payment failed — it could still be in-flight on the Lightning Network and may succeed after the timeout window.

This creates a race condition:

  1. Payment times out → state reset to unpaid
  2. Original payment actually succeeds on LN (but local state already cleared)
  3. Next scheduler cycle picks up the order again → duplicate payment

Solution

On timeout, leave the order in PENDING state instead of resetting.

The stale PENDING cleanup (5-min TTL) serves as the safety net:

  • If the payment succeeds in the background → order stays correctly marked as paid ✅
  • If the payment never resolves → stale cleanup resets it after TTL, safe to retry ✅

Changes

  1. Remove state reset on timeout — leave dev_fee_paid=true with PENDING marker so the order is not picked up for duplicate payment
  2. Add timestamp to PENDING marker — format: PENDING-{uuid}-{unix_ts} for accurate staleness detection
  3. Replace taken_at-based stale detection — parse timestamp from the PENDING marker itself (fixes the wrong-timestamp issue from Stale PENDING Detection Uses Wrong Timestamp Field #570)
  4. Backward compatibility — legacy markers without timestamps are treated as stale
  5. 7 unit tests for the timestamp parser (new format, legacy, edge cases)

Verification

  • cargo fmt
  • cargo clippy --all-targets --all-features ✅ (0 warnings)
  • cargo test ✅ (119 passed)

Closes #568
Related: #570 (stale detection timestamp fix)

Summary by CodeRabbit

  • Bug Fixes

    • More reliable cleanup of stale pending payments using embedded timestamps; legacy pending entries are detected and reset.
    • Timeout handling now logs warnings and preserves pending state for later cleanup instead of immediately marking unpaid.
    • Improved logging to show age and legacy handling during cleanup.
  • Tests

    • Added unit tests to validate pending-timestamp parsing and related edge cases.

On timeout, the code was blindly resetting dev_fee_paid=false and
clearing dev_fee_payment_hash. However, a timeout does NOT mean the
Lightning payment failed — it could still be in-flight and may succeed
after the timeout window. This caused a race condition where a second
payment would be initiated on the next scheduler cycle.

Changes:
- Remove state reset on timeout: leave the order in PENDING state
  (dev_fee_paid=true) so it won't be picked up for a duplicate payment
- Add timestamp to PENDING marker (PENDING-{uuid}-{unix_ts}) for
  accurate staleness detection
- Replace taken_at-based stale detection with timestamp-based parsing
  from the PENDING marker itself
- Legacy markers without timestamps are treated as stale for backward
  compatibility
- 7 unit tests for the timestamp parser

The stale PENDING cleanup (5-min TTL) serves as the safety net: if the
payment never resolves, the order is reset after the TTL expires.

Closes #568
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 11, 2026

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: preventing duplicate dev fee payments when Lightning payments timeout, which is the core objective of the PR.
Linked Issues check ✅ Passed The PR implements a solution addressing #568 by avoiding state reset on timeout, using PENDING markers with embedded timestamps for staleness detection instead of querying payment status.
Out of Scope Changes check ✅ Passed All changes are directly related to the timeout handling and stale PENDING cleanup in the scheduler, with no unrelated modifications detected.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/568-dev-fee-timeout-no-reset

No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
src/scheduler.rs (1)

567-568: Consider making cleanup_ttl_secs configurable or a module-level constant.

This TTL is the key knob controlling how long an in-flight payment is given to resolve before the order is retried. Hardcoding it inside the loop body makes it easy to miss. At minimum, extract it to a const near the top of the file; ideally, source it from Settings so operators can tune it without recompilation.

Suggested change
+/// Maximum age (seconds) of a PENDING dev-fee marker before it is considered
+/// stale and eligible for retry.  Should comfortably exceed the payment timeout
+/// (currently 50 s) to avoid resetting in-flight payments.
+const DEV_FEE_PENDING_TTL_SECS: u64 = 300;
+
 async fn job_process_dev_fee_payment() {
     let pool = get_db_pool();
     let interval = 60u64;

Then inside the loop:

-            let cleanup_ttl_secs: u64 = 300; // 5 minutes
+            let cleanup_ttl_secs = DEV_FEE_PENDING_TTL_SECS;

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@grunch
Copy link
Member

grunch commented Feb 12, 2026

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 12, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/scheduler.rs (1)

517-555: ⚠️ Potential issue | 🟡 Minor

Broken doc comments: parse_pending_timestamp inherits stale docs from job_process_dev_fee_payment.

The new function was inserted in the middle of the existing doc comment block for job_process_dev_fee_payment. As a result:

  1. parse_pending_timestamp (Line 527) gets a misleading /// doc starting with "Processes unpaid development fees…" (Lines 517-519, leftover from the old function).
  2. job_process_dev_fee_payment (Line 555) is left with an orphaned doc fragment starting mid-list at "- status = 'settled-hold-invoice'…" (Line 546).

cargo doc will render both incorrectly.

Proposed fix: move the function and clean up both doc blocks

Move parse_pending_timestamp above the job_process_dev_fee_payment doc block, or fix the docs in place:

-/// Processes unpaid development fees for completed orders
-///
-/// Runs every 60 seconds and attempts to pay dev fees for orders that have:
 /// Parse the unix timestamp from a PENDING marker.
 ///
 /// Marker format: `PENDING-{uuid}-{unix_timestamp}`
 /// Legacy format: `PENDING-{uuid}` (no timestamp) → returns `None`
 ///
 /// Returns `Some(timestamp)` if a valid unix timestamp is found at the end,
 /// `None` otherwise.
 fn parse_pending_timestamp(marker: &str) -> Option<u64> {
     // ...
 }

+/// Processes unpaid development fees for completed orders.
+///
+/// Runs every 60 seconds and attempts to pay dev fees for orders that have:
 /// - status = 'settled-hold-invoice' OR 'success'
 /// - dev_fee > 0
 /// - dev_fee_paid = false
 ///
 /// Design decisions:
 /// ...
 async fn job_process_dev_fee_payment() {
🧹 Nitpick comments (2)
src/scheduler.rs (2)

564-571: Hardcoded TTL and inconsistent time source.

Two minor points:

  1. cleanup_ttl_secs is hardcoded to 300. Other timing values in this file (e.g., exp_seconds, interval) are pulled from Settings. Consider making this configurable alongside those.

  2. The new code uses std::time::SystemTime::now() (Lines 568, 692) while the rest of the file consistently uses chrono::Utc::now(). Not a correctness issue, but an inconsistency worth aligning.


836-838: Test module should be named tests per project convention.

The coding guidelines specify: "Co-locate tests in their modules under mod tests." The current name pending_marker_tests deviates from this convention.

♻️ Rename the module
 #[cfg(test)]
-mod pending_marker_tests {
+mod tests {
     use super::parse_pending_timestamp;

As per coding guidelines: "Co-locate tests in their modules under mod tests with descriptive names like handles_expired_hold_invoice."

Replace std::time::SystemTime::now() with Utc::now().timestamp() to
align with the rest of the file. Also fix doc comment placement and
rename test module.
@grunch grunch merged commit 8e66c28 into main Feb 12, 2026
6 checks passed
@grunch grunch deleted the fix/568-dev-fee-timeout-no-reset branch February 12, 2026 12:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Duplicate Payment Risk in Dev Fee Timeout Handling

2 participants