docs: update pub-sub.md to reflect ShardedHashMap architecture #7239
Conversation
…ommit 95305b6
Agent-Logs-Url: https://github.com/dragonflydb/dragonfly/sessions/74ba6b9a-053c-4612-9d80-147ec708df2c
Co-authored-by: mkaruza <3676457+mkaruza@users.noreply.github.com>
@glevkovich AI-generated document changes based on the current state of the Pub/Sub implementation (with sharded hash map). Can you please take a look, and have someone as a second reviewer?
Pull request overview
Updates docs/pub-sub.md to match the post-95305b6 Pub/Sub subscription registry implementation, which replaced the old RCU ChannelStore design with a ShardedHashMap-backed, shard-locked model.
Changes:
- Rewrites the Pub/Sub registry overview from centralized RCU snapshots to per-shard locking + RCU-style `SubscribeMap*` pointer swaps.
- Updates the data-structure descriptions (adds `ShardedHashMap`, removes `ControlBlock`) and refreshes the Apply/FetchSubscribers flows.
- Updates ancillary sections (keyspace events, slot migration) to reflect the global `channel_store` and new removal paths.
> the RCU-style pointer swap used for lock-free reads on the publish path, the asynchronous
> message delivery pipeline, and the backpressure system that protects the server from
> slow-subscriber OOM.
The intro says the RCU pointer swap enables “lock-free reads on the publish path”, but FetchSubscribers takes a per-shard shared lock (ShardedHashMap::FindIf/ForEachShared acquire read_mu_). Consider rewording to reflect that reads are concurrent but not lock-free (shared read_mu_ + atomic UpdatablePointer::Get()).
Suggested change:

> − the RCU-style pointer swap used for lock-free reads on the publish path, the asynchronous
> − message delivery pipeline, and the backpressure system that protects the server from
> − slow-subscriber OOM.
> + the RCU-style pointer swap and per-shard shared locking used for concurrent reads on the
> + publish path, the asynchronous message delivery pipeline, and the backpressure system that
> + protects the server from slow-subscriber OOM.
> | `ShardedHashMap` | `src/core/sharded_hash_map.h` | Thread-safe hash map split into 16 shards, each with independent `write_mu_` (Mutex) and `read_mu_` (SharedMutex). |
> | `ChannelStore::ChannelMap` | `src/server/channel_store.h` | `ShardedHashMap<string, UpdatablePointer, 16>` — maps channel/pattern names to subscriber lists across 16 independently-locked shards. |
The ShardedHashMap type itself is generic (default NUM_SHARDS=32), so describing it as “split into 16 shards” is only true for the ChannelStore::ChannelMap instantiation. Also, ChannelMap is defined with a transparent hash/equality (StringViewHash, std::equal_to<>) to support std::string_view lookups used by FindIf; consider reflecting that in the type/description to avoid implying the simpler 3-parameter template.
Suggested change:

> − | `ShardedHashMap` | `src/core/sharded_hash_map.h` | Thread-safe hash map split into 16 shards, each with independent `write_mu_` (Mutex) and `read_mu_` (SharedMutex). |
> − | `ChannelStore::ChannelMap` | `src/server/channel_store.h` | `ShardedHashMap<string, UpdatablePointer, 16>` — maps channel/pattern names to subscriber lists across 16 independently-locked shards. |
> + | `ShardedHashMap` | `src/core/sharded_hash_map.h` | Generic thread-safe sharded hash map template with independently locked shards (`write_mu_` and `read_mu_`); shard count depends on the template instantiation. |
> + | `ChannelStore::ChannelMap` | `src/server/channel_store.h` | `ShardedHashMap<string, UpdatablePointer, 16, StringViewHash, std::equal_to<>>` — maps channel/pattern names to subscriber lists across 16 independently locked shards and supports transparent `std::string_view` lookups. |
🤖 Augment PR Summary

Summary: Updates the Pub/Sub architecture documentation to match the current shard-locked implementation.

Technical Notes: The doc now emphasizes per-shard concurrency (independent locks per shard), RCU-style pointer swaps for subscriber maps, and safe reclamation via exclusive `read_mu_` acquisition.
> subscription registry backed by a `ShardedHashMap` with fine-grained per-shard locking,
> the RCU-style pointer swap used for lock-free reads on the publish path, the asynchronous
> message delivery pipeline, and the backpressure system that protects the server from
> slow-subscriber OOM.
docs/pub-sub.md:6: This mentions “lock-free reads on the publish path”, but FetchSubscribers does take read_mu_ in shared mode via ShardedHashMap::FindIf/ForEachShared. Consider rephrasing to avoid implying the publish path is fully lock-free (it’s lock-free only for the UpdatablePointer load).
Severity: low
> | `ChannelStore::Subscriber` | `src/server/channel_store.h` | Represents a subscribed client. Wraps `facade::ConnectionRef` plus a pattern string. |
> | `ChannelStore::ControlBlock` | `src/server/channel_store.h` | Holds the `most_recent` atomic pointer and `update_mu` mutex. Prevents overlapping structural updates. |
> | `ChannelStore::ChannelMap` | `src/server/channel_store.h` | `flat_hash_map<string, UpdatablePointer>` — maps channel/pattern names to subscriber lists. |
> | `ShardedHashMap` | `src/core/sharded_hash_map.h` | Thread-safe hash map split into 16 shards, each with independent `write_mu_` (Mutex) and `read_mu_` (SharedMutex). |
docs/pub-sub.md:48: ShardedHashMap is templated on the shard count (default is 32 in sharded_hash_map.h), while ChannelStore uses 16. Consider wording this row as “N shards (ChannelStore uses 16)” to avoid implying the container is always 16-way sharded.
Severity: low
> | `ChannelStore::ControlBlock` | `src/server/channel_store.h` | Holds the `most_recent` atomic pointer and `update_mu` mutex. Prevents overlapping structural updates. |
> | `ChannelStore::ChannelMap` | `src/server/channel_store.h` | `flat_hash_map<string, UpdatablePointer>` — maps channel/pattern names to subscriber lists. |
> | `ShardedHashMap` | `src/core/sharded_hash_map.h` | Thread-safe hash map split into 16 shards, each with independent `write_mu_` (Mutex) and `read_mu_` (SharedMutex). |
> | `ChannelStore::ChannelMap` | `src/server/channel_store.h` | `ShardedHashMap<string, UpdatablePointer, 16>` — maps channel/pattern names to subscriber lists across 16 independently-locked shards. |
docs/pub-sub.md:49: The ChannelMap alias in code includes StringViewHash + std::equal_to<> for heterogeneous lookup (e.g. std::string_view in FindIf). Consider either spelling out the full alias or noting the transparent hash/equality requirement so the later FindIf(channel, ...) examples stay accurate.
Severity: low
> | `ShardedHashMap` | `src/core/sharded_hash_map.h` | Thread-safe hash map split into 16 shards, each with independent `write_mu_` (Mutex) and `read_mu_` (SharedMutex). |
> | `ChannelStore::ChannelMap` | `src/server/channel_store.h` | `ShardedHashMap<string, UpdatablePointer, 16>` — maps channel/pattern names to subscriber lists across 16 independently-locked shards. |
> | `ChannelStore::SubscribeMap` | `src/server/channel_store.h` | `flat_hash_map<ConnectionContext*, ThreadId>` — maps subscriber contexts to their owning thread. |
> | `ChannelStore::UpdatablePointer` | `src/server/channel_store.h` | Atomic wrapper around `SubscribeMap*`. Supports lock-free reads (`acquire`) and RCU-style swaps (`release`). |
docs/pub-sub.md:51: Similar to the intro, “lock-free reads” is a bit ambiguous here since readers still acquire read_mu_ shared; only the UpdatablePointer::Get() atomic load itself is lock-free. Consider tightening the wording to reflect that distinction.
Severity: low
@mkaruza I have 3 comments about this PR:

- I think this one is up to date (hasn't changed). Stale: we need to update the SVGs or remove the images until they're updated. Those 3 are misleading after the change.
- This entry: the atomic load in `Get()` is lock-free, but every call site holds `read_mu_` shared via `FindIf`/`ForEachShared`. Consider "atomic reads (acquire)" or "low-contention reads (shared `read_mu_` + atomic acquire)" to avoid implying no lock is involved.
- This entry: the table shows `ShardedHashMap<string, UpdatablePointer, 16>`, but the actual typedef is `ShardedHashMap<string, UpdatablePointer, 16, StringViewHash, std::equal_to<>>`.
Commit 95305b6 replaced the RCU-based `ChannelStore` (per-thread pointers + centralized `ControlBlock` + `AwaitBrief` broadcast) with a `ShardedHashMap`-backed design using fine-grained per-shard locking. The documentation was stale.

Changes to `docs/pub-sub.md`:

- Rewrote the registry overview around the `ShardedHashMap` per-shard `write_mu_`/`read_mu_` locking model.
- Removed `ControlBlock`, added `ShardedHashMap`, and updated the `ChannelMap` type from `flat_hash_map` to `ShardedHashMap<string, UpdatablePointer, 16>`.
- Noted that publish-path reads hold the shard's shared `read_mu_`.
- Removed the `GetTargetMap()` code block and `AwaitBrief` broadcast; replaced with the 3-phase `Mutate()`-based flow (RCU swaps → structural changes → freelist cleanup via `WithReadExclusiveLock`).
- Updated `channels_->find()` to `channels_.FindIf()` / `patterns_.ForEachShared()` with a per-shard locking note.
- `ServerState::tlocal()->channel_store()` → global `channel_store` pointer.
- `ApplyAndUnsubscribe()` → `RemoveAllSubscribers()` using `Mutate()` + exclusive `read_mu_`.
- Removed the `ServerState channel_store_` row, added a `ShardedHashMap` row.