fix: NULL RWDB and mutex locks #728

Draft

shortthefomo wants to merge 18 commits into Xahau:dev from shortthefomo:RWDB-mutex

Conversation

@shortthefomo
Contributor

shortthefomo commented Apr 11, 2026

Back-port of the mutex fixes from XRPLF/rippled#6549 addressing locks that cause state transitions.

High Level Overview of Change

Adds a null-backend mode to the RWDB in-memory node store. When enabled (XAHAU_RWDB_NULL=1 or null_backend=1 in [node_db]), fetch() always returns notFound and store() is a no-op. State is retained entirely through Ledger → SHAMap shared_ptr chains, with a sliding window of recent ledgers kept resident in memory.

This eliminates all node-store I/O and map overhead while keeping the node able to validate, build ledgers, and serve peers.

What Changed

  • RWDBFactory: null-mode short-circuits in fetch()/store(); TOCTOU race fixed (isOpen_ check moved inside the lock; sketched after this list)
  • DatabaseRotatingImp: std::mutex replaced with reader_preferring_shared_mutex (shared_lock for reads, unique_lock for writes; sketched after this list); removed unused copyArchiveTo()
  • SHAMapStoreImp: null-mode rotation skips the state-tree walk (pure bookkeeping); null_backend settable via config; online_delete auto-defaults to ledger_history
  • SHAMapSync: shared FullBelowCache disabled in null mode to prevent subtree-skipping across SHAMaps; local isFullBelow() checks no longer gated by the cache flag
  • LedgerMaster: mRetainedLedgers sliding window pins N recent ledgers; getClosestFullyWiredLedger() for efficient delta-walk base selection
  • InboundLedger: primeInboundLedgerForUse() wires ledgers via delta walk against closest base, or trusts sync pinning when no base exists (avoids 70M+ leaf walk)
  • InboundLedgers: recentHistoryLedgers_ cache; onLedgerFetched() now passes the inbound ledger for retention
  • PeerImp: fallback to TreeNodeCache and in-memory ledger lookup when node store returns empty
  • Ledger: isFullyWired()/setFullyWired() tracking; fullWireForUse() helper
  • Config: Config::null_backend() consolidates env var check; removed early online_delete requirement
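
A minimal sketch of the first item, hedged: the member names (nullBackend_, isOpen_, mutex_) are assumptions, not the PR's exact code; only the Backend::fetch signature follows rippled's node-store interface.

Status
fetch(void const* key, std::shared_ptr<NodeObject>* pObject)
{
    if (nullBackend_)
    {
        pObject->reset();
        return notFound;  // state lives in the retained Ledger/SHAMap chains
    }
    std::lock_guard lock(mutex_);
    if (!isOpen_)  // checked under the same lock: the TOCTOU window is gone
        return notFound;
    // ... normal in-memory map lookup ...
    return ok;
}

void
store(std::shared_ptr<NodeObject> const& object)
{
    if (nullBackend_)
        return;  // no-op: nothing is persisted in null mode
    // ... normal insert under the same mutex ...
}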
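
And the DatabaseRotatingImp locking change, sketched with std::shared_mutex standing in for the PR's reader_preferring_shared_mutex (types as in rippled's node store; bodies elided):

#include <shared_mutex>

class RotatingStoreSketch
{
    mutable std::shared_mutex mutex_;  // PR: reader_preferring_shared_mutex

public:
    std::shared_ptr<NodeObject>
    fetch(uint256 const& hash)
    {
        std::shared_lock lock(mutex_);  // many concurrent readers
        // ... consult the writable, then the archive backend ...
        return nullptr;
    }

    void
    rotate(std::shared_ptr<Backend> newWritable)
    {
        std::unique_lock lock(mutex_);  // exclusive only for the swap
        // ... writable -> archive, install newWritable ...
    }
};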

API Impact

  • Public API changes
  • libxrpl change
  • Peer protocol change

Config

Standard RWDB mode

[node_db]
type=rwdb
online_delete=256
advisory_delete=0

[relational_db]
backend=rwdb

Null-backend mode

Option A — environment variable:

XAHAU_RWDB_NULL=1 ./xahaud --conf xahaud.cfg

Option B — config file:

[node_db]
type=rwdb
null_backend=1
advisory_delete=0

[relational_db]
backend=rwdb

ledger_history must be > 0. online_delete is auto-defaulted to ledger_history if not set.
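
For illustration, a hedged sketch of how the two switches could sit behind the Config::null_backend() accessor named in the change list (the body and the member name here are assumptions):

#include <cstdlib>

bool
Config::null_backend() const
{
    // Option A: environment variable (value check assumed, not mere presence).
    if (auto const* v = std::getenv("XAHAU_RWDB_NULL"); v && *v == '1')
        return true;
    // Option B: [node_db] null_backend=1, parsed into an assumed member.
    return nodeDbNullBackend_;
}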

RichardAH and others added 7 commits February 24, 2026 16:07
* Add AMM bid/create/deposit/swap/withdraw/vote invariants:
  - Deposit, Withdrawal invariants: `sqrt(asset1Balance * asset2Balance) >= LPTokens`.
  - Bid: `sqrt(asset1Balance * asset2Balance) > LPTokens` and the pool balances don't change.
  - Create: `sqrt(asset1Balance * asset2Balance) == LPTokens`.
  - Swap: `asset1BalanceAfter * asset2BalanceAfter >= asset1BalanceBefore * asset2BalanceBefore`
     and `LPTokens` don't change.
  - Vote: `LPTokens` and pool balances don't change.
  - All AMM and swap transactions: amounts and tokens are greater than zero, except on withdrawal if all tokens
    are withdrawn.
* Add AMM deposit and withdraw rounding to ensure AMM invariant:
  - On deposit, tokens out are rounded downward and deposit amount is rounded upward.
  - On withdrawal, tokens in are rounded upward and withdrawal amount is rounded downward.
* Add Order Book Offer invariant to verify consumed amounts. Consumed amounts are less than the offer.
* Fix Bid validation. `AuthAccount` can't have duplicate accounts or the submitter account.
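
For illustration only, a minimal sketch of the deposit/withdraw invariant above, with doubles standing in for the library's exact amount types:

#include <cmath>

// After a deposit or withdrawal, the pool's geometric mean must still
// cover the outstanding LP tokens: sqrt(x * y) >= LPTokens.
bool
ammInvariantHolds(double asset1Balance, double asset2Balance, double lpTokens)
{
    return std::sqrt(asset1Balance * asset2Balance) >= lpTokens;
}
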
Due to rounding, the LPTokenBalance of the last LP might not match the LP's trustline balance. This was fixed for `AMMWithdraw` in `fixAMMv1_1` by adjusting the LPTokenBalance to be the same as the trustline balance. Since `AMMClawback` is also performing a withdrawal, we need to adjust LPTokenBalance as well in `AMMClawback`.

This change includes:
1. Refactored the `verifyAndAdjustLPTokenBalance` function in `AMMUtils`, which both `AMMWithdraw` and `AMMClawback` call to adjust LPTokenBalance.
2. Added the unit test `testLastHolderLPTokenBalance` to test the scenario.
3. Modified the existing unit tests for `fixAMMClawbackRounding`.
shortthefomo changed the title from 'Rwdb mutex' to 'fix: RWDB mutex locks' on Apr 11, 2026
@sublimator
Collaborator

@shortthefomo

Thanks!

We had a PR that touched this area and I vaguely recall fixing /some/ issues
#548

I will look at this next week and also check if there's some overlap, but either way, given this is surely smaller, will try and prioritize it for you

@sublimator
Collaborator

Mutex changes aside (which seem sensible, and which I'll look at next Monday), some other thoughts:

How does RWDB perform on rippled, with a giant tree? I suppose that was where things were stressed enough to expose these inefficiencies?

I do wonder if you could somehow make the rotation more tree-aware, and simply use one chain of structurally shared trees with garbage collection. The SHAMap leaves reference map items themselves, allocated in their own arena, so keeping them in an in-memory node store too seems somewhat redundant, especially when you consider that the delta between the very latest tree once synced and even 2048 ledgers back is going to be quite small relative to the whole; ditto for the rotation!

You could potentially simply keep a long chain of Ledger/SHAMap objects in memory, pruning when the ledger range goes out of range?

iirc, some p2p handlers reference the nodestore

@shortthefomo
Contributor Author

I actually tried that, because I really hate this copy bollox... just prune the tree... but I ended up chasing my tail more than I care to share.

So I went for the next best thing: simply fix the mutex. Which still required changes to the copy, lol!

I really dislike that design with a passion. But meh, we get a stable rotation without much fuss... it's a middle ground and I think they'll be happy with it as well. But yes, I'd also prefer that one day a direct prune of the heap is done.

@sublimator
Collaborator

but meh, we get a stable rotation without much fuss

Yeah, I hear ya!

@sublimator
Collaborator

sublimator commented Apr 13, 2026

@shortthefomo

Actually, there DOES seem to be a way you can avoid the copy "bollox", and route all requests (peers etc) to the in-memory caches, meaning you can get away with a NULL backend (no rotating backend at all really needed, just the rwdb relational database)

As the inbound ledgers come in, if you walk the trees and link them up properly (inner nodes otherwise have their child pointers lazily linked on first walk/use), a Ledger -> SHAMap -> root node -> nested inners -> child node retention chain will keep the nodes you want in memory, and you simply need to keep the last online_delete|ledger_history specified number of Ledger objects resident (e.g. in LedgerMaster). There /may/ even be ways to optimize it to avoid the walks.
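
A minimal sketch of that retention window, with invented names (LedgerT stands in for the Ledger type); dropping the shared_ptr at the tail releases the whole tree, except for nodes a newer ledger structurally shares:

#include <cstddef>
#include <deque>
#include <memory>
#include <mutex>

template <class LedgerT>
class RetainedLedgers
{
    std::mutex mutex_;
    std::deque<std::shared_ptr<LedgerT const>> window_;
    std::size_t const capacity_;  // e.g. online_delete|ledger_history

public:
    explicit RetainedLedgers(std::size_t capacity) : capacity_(capacity)
    {
    }

    void
    retain(std::shared_ptr<LedgerT const> ledger)
    {
        std::lock_guard lock(mutex_);
        window_.push_back(std::move(ledger));
        while (window_.size() > capacity_)
            window_.pop_front();  // prune: releases the SHAMap chain
    }
};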

It's sitting slightly under 1400MiB for Xahau after ~23m, where I think normally, iirc, it sits around 4GiB.

It's still a mess, blah/disclaimers etc., and I don't really have time to develop/test it further, but I pushed it if you want to "compare notes" or the like:
https://github.com/Xahau/xahaud/tree/null-rdwb-experiment

edit: latest run is using more memory, might have vibe coded a regression, but I'm sure you get the general direction/possibility

@sublimator
Collaborator

@shortthefomo

Required
clang-format / check (pull_request): Failing after 38s

Collaborator

sublimator left a comment


Some checks seem to be failing

sublimator and others added 5 commits April 13, 2026 08:56
Introduces a 'NULL' node-store mode (via XAHAU_RWDB_NULL) that operates
entirely in-memory by leveraging a sliding window of retained Ledger objects.

Key changes:
- SHAMapSync: Bypass FullBelowCache in null mode to force full tree wiring.
- Ledger: Add 'fullyWired' state tracking and mandatory wiring before use.
- LedgerMaster: Implement 'mRetainedLedgers' sliding window to pin SHAMap graphs.
- PeerImp: Add fallbacks to TreeNodeCache and LedgerMaster for peer requests.
- contract: Add boost::stacktrace to LogThrow for easier debugging of misses.
- basics: Add ReaderPreferringSharedMutex to mitigate reader starvation.
- Search both LedgerMaster and InboundLedgers for the closest fully wired base.
- Implement sameChainDistance helper to accurately calculate distance between ledgers on the same chain.
- Use findBestFullyWiredBase to minimize the 'prime walk' delta.
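
Hedged sketch of the base selection those last two bullets describe (isFullyWired/seq come from the change list; the container and surrounding code are assumed): scan the candidates and keep the fully wired ledger with the smallest sequence gap to the target, since a smaller gap means a smaller delta to walk.

#include <cstdint>
#include <limits>
#include <memory>
#include <vector>

template <class LedgerT>
std::shared_ptr<LedgerT const>
findBestFullyWiredBase(
    std::vector<std::shared_ptr<LedgerT const>> const& candidates,
    std::uint32_t targetSeq)
{
    std::shared_ptr<LedgerT const> best;
    auto bestDist = std::numeric_limits<std::uint32_t>::max();
    for (auto const& l : candidates)
    {
        if (!l->isFullyWired())
            continue;  // only fully wired ledgers are safe walk bases
        auto const seq = l->seq();
        auto const dist = seq > targetSeq ? seq - targetSeq : targetSeq - seq;
        if (dist < bestDist)
        {
            bestDist = dist;
            best = l;
        }
    }
    return best;
}
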
@shortthefomo
Contributor Author

Okay, let me have a play with that branch as well, as skipping this rotation is something I would prefer too.
Will also fix clang-format and the checks here.

@sublimator
Collaborator

sublimator commented Apr 13, 2026 via email

@shortthefomo
Contributor Author

shortthefomo commented Apr 13, 2026

Nice work!

Seems to be running fine so far here by me, with some minor adjustments: https://github.com/shortthefomo/xahaud/tree/null-rdwb-experiment. Also ported this into the branch I'm testing on XRPL: https://github.com/shortthefomo/rippled/tree/null-rdwb-experiment. Both seem to be good so far.

*edited: getting state rotations on the x86 hardware, so there's still some long-running lock vs the previous code, but I will find it.

- Replace std::mutex with reader_preferring_shared_mutex in
  DatabaseRotatingImp (shared_lock for reads, unique_lock for writes)
- Skip expensive full state tree walk when no base ledger exists in
  primeInboundLedgerForUse — trust sync pinning, just wire tx map
- Allow null_backend to be set via [node_db] config section
- Remove early RWDB online_delete requirement from Config (now defaulted
  by SHAMapStoreImp)
- Fix SHAMapSync: only gate shared FullBelowCache operations behind
  useFullBelowCache(), not local per-node isFullBelow() checks
- Update Config_test for removed RWDB online_delete requirement
The null-mode rotation path was calling clearLedgerCachePrior() directly
instead of clearCaches(), which also clears the FullBelowCache. Stale
'full below' markers from a previous sync pass persisted across rotation,
causing SHAMap sync to skip subtrees that actually need re-fetching,
leading to the node oscillating between 'full' and 'syncing' states.
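
In other words (a hedged sketch; only the two method names come from the message above):

void
onNullModeRotation(LedgerIndex const validatedSeq)
{
    // Before: clearLedgerCachePrior(validatedSeq);  // FullBelowCache left stale
    clearCaches(validatedSeq);  // also clears FullBelowCache, so sync re-fetches
}
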
shortthefomo changed the title from 'fix: RWDB mutex locks' to 'fix: NULL RWDB and mutex locks' on Apr 14, 2026
@sublimator
Collaborator

Yeah, the full below cache needs some thought/work, and you might be able to do the node linking as it happens, as the tree is built, rather than after the fact. And once you go that far, you might even want to simply not create the NodeObjects for flushing in that mode, because it's pointless heap churn.

If you can get the null mode working properly, is there even any need to have it rotating? Maybe all you really need is a direct null backend? And the rwdb relational db (sqlite) impl?

@sublimator
Collaborator

Run set -o pipefail
diff --git a/Builds/levelization/results/ordering.txt b/Builds/levelization/results/ordering.txt
index b10a625..b823540 100644
--- a/Builds/levelization/results/ordering.txt
+++ b/Builds/levelization/results/ordering.txt
@@ -183,6 +183,7 @@ xrpld.overlay > xrpl.basics
 xrpld.overlay > xrpld.core
 xrpld.overlay > xrpld.peerfinder
 xrpld.overlay > xrpld.perflog
+xrpld.overlay > xrpld.shamap
 xrpld.overlay > xrpl.json
 xrpld.overlay > xrpl.protocol
 xrpld.overlay > xrpl.resource

Hrmmm, this is because PeerImp directly consults the SHAMap now in that prototype code?

@shortthefomo
Contributor Author

Yeah, I've stumbled across a few issues myself, especially in the larger DB version. Will mark this as a draft for now and leave it a bit more time. I chased my tail a bit before on these non-rotation versions.

apologies.

shortthefomo marked this pull request as draft on April 14, 2026 05:00
@sublimator
Collaborator

apologies

No need :) !

@sublimator
Collaborator

@shortthefomo

I pushed to the branch before with more lazy linking without the walk, re-enabling the FBC, and consulting the TreeNodeCache for liveness, which keeps weak references to any retention-chain resident nodes to make sure it's authoritative. I also patched it to work with the type=none backend, to support using the existing NullFactory.

Without the walk requirement it will probably perform better for large maps like rippled's.

See here for reasoning:
https://gist.github.com/sublimator/6da8c771c99e1a2446bb6b70aab571c2

Not much testing yet, but I'm not getting any of the immediate linking-related crashes that were the impetus for the ham-fisted walks.

I believe we should introduce some TreeNodeCodec application-level service that can encode into the wire format for peers directly from canonical SHAMap nodes?
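
Purely speculative shape for that codec (serializeForWire is assumed to match recent rippled's SHAMapTreeNode API; everything else is invented):

struct TreeNodeCodec
{
    static Blob
    toWire(SHAMapTreeNode const& node)
    {
        Serializer s;
        node.serializeForWire(s);  // canonical node -> peer wire encoding
        return s.getData();
    }
};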

@sublimator
Collaborator

sublimator commented Apr 14, 2026

Another observation: currently mRetainedLedgers isn't really matching mCompleteLedgers in LedgerMaster. Which leads to questioning whether rotation even really makes sense in this memory-resident mode, versus growing to some bound and pruning out-of-bound tails after each ledger, rather than bulk operations batched into some big 'rotation'?

edit: I implemented that, and depending on the current txn throughput (Xahau's seems to be quite variable), it was sitting at only around 850MiB usage, keeping only the last 16 ledgers in memory

@shortthefomo
Contributor Author

shortthefomo commented Apr 15, 2026

@sublimator any idea about this?

// xahaud (line 121)
{SizedItem::burstSize, {{ 4, 8, 16, 32, 64*1024*1024 }}}  // 64 TB for huge!

// rippled
{SizedItem::burstSize, {{ 4, 8, 16, 32, 48 }}}            // 48 MB for huge

seems to have tripped me up. (Typically I'm working off xrpld, as it's got most of the issues present on the main net vs Xahau; basically everything is larger and breaks more.) I then work to stable there and bring the changes back here, but then also trip over some changes, e.g. this one, as it's something I counted on but is now different here. Dang, this two-ledger walk is difficult.

Also, it looks like that could possibly have been meant to be 64MB?

@sublimator
Collaborator

What am I looking at?
Configurations for burstSize for various node sizes?
And you'd prefer they were aligned with rippled?
You could check the git history and see why/where it diverged.
I don't know when/why myself.

That's a radical difference though, huh!

Dang, this two-ledger walk is difficult.

Yeah, I hear ya

@shortthefomo
Contributor Author

config.cpp

inline constexpr std::array<std::pair<SizedItem, std::array<int, 5>>, 13>
sizedItems
{{
    // FIXME: We should document each of these items, explaining exactly
    //        what they control and whether there exists an explicit
    //        config option that can be used to override the default.

    //                                   tiny    small   medium    large     huge
    {SizedItem::sweepInterval,      {{     10,      30,      60,      90,     120 }}},
    {SizedItem::treeCacheSize,      {{ 262144,  524288, 2097152, 4194304, 8388608 }}},
    {SizedItem::treeCacheAge,       {{     30,      60,      90,     120,     900 }}},
    {SizedItem::ledgerSize,         {{     32,      32,      64,     256,     384 }}},
    {SizedItem::ledgerAge,          {{     30,      60,     180,     300,     600 }}},
    {SizedItem::ledgerFetch,        {{      2,       3,       4,       5,       8 }}},
    {SizedItem::hashNodeDBCache,    {{      4,      12,      24,      64,     128 }}},
    {SizedItem::txnDBCache,         {{      4,      12,      24,      64,     128 }}},
    {SizedItem::lgrDBCache,         {{      4,       8,      16,      32,     128 }}},
    {SizedItem::openFinalLimit,     {{      8,      16,      32,      64,     128 }}},
    {SizedItem::burstSize,          {{      4,       8,      16,      32,      64*1024*1024 }}},
    {SizedItem::ramSizeGB,          {{      8,      12,      16,      24,      32 }}},
    {SizedItem::accountIdCacheSize, {{  20047,   50053,   77081,  150061,  300007 }}}
}};

@shortthefomo
Contributor Author

shortthefomo commented Apr 15, 2026

Yeah, seems the comment says 64MB on the commit but it's 64TB: a15d0b2

The value is wrapped in megabytes

megabytes(app_.config().getValueFor(SizedItem::burstSize, std::nullopt))
// from ByteUtilities.h
megabytes(T value) { return value * 1024 * 1024; }
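
Spelling the arithmetic out: the table entry is itself a count of megabytes, so the wrapper multiplies by 1024*1024 a second time.

megabytes(64 * 1024 * 1024)                 // the committed huge-tier value
  = 64 * 1024 * 1024 * 1024 * 1024 bytes    // 2^46 bytes
  = 64 TiB                                  // not 64 MB
megabytes(64)
  = 64 * 1024 * 1024 bytes = 64 MiB         // presumably the intended size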

@sublimator
Collaborator

Yeah, seems a typo/brainfart bug

@shortthefomo
Contributor Author

okay so moving that back to

    {SizedItem::burstSize,          {{      4,       8,      16,      32,      64*1024 }}},

gets me back to where I expected. I'm going to look into figuring out why and where that change came from, simply because it's tripping me up; now I have to back it out to get the response I expected.

*also, I'll be out traveling for about a month, and these PRs will probably run past that. So just understand the stop (for now).

@sublimator
Collaborator

Enjoy your trip!
