docs: update pub-sub.md to reflect ShardedHashMap architecture #7239

Open
Copilot wants to merge 1 commit into main from copilot/update-docs-pub-sub-architecture-again

Conversation

Contributor

Copilot AI commented Apr 29, 2026

Commit 95305b6 replaced the RCU-based ChannelStore (per-thread pointers + centralized ControlBlock + AwaitBrief broadcast) with a ShardedHashMap-backed design using fine-grained per-shard locking. The documentation was stale.

Changes to docs/pub-sub.md

  • Overview: Replaced RCU/thread-local-pointer description with ShardedHashMap per-shard write_mu_/read_mu_ locking model
  • Data Structures table: Removed ControlBlock, added ShardedHashMap, updated ChannelMap type from flat_hash_map to ShardedHashMap<string, UpdatablePointer, 16>
  • Subscription Management: Renamed section from "RCU" to "Shard-Locked ChannelStore"; added per-shard lock table; replaced "Two Levels of RCU" (ChannelMap-copy vs SubscribeMap-copy) with "Two Granularities of Update" (RCU pointer swap vs structural map change under exclusive read_mu_)
  • Apply() flow: Removed GetTargetMap() code block and AwaitBrief broadcast; replaced with 3-phase Mutate()-based flow (RCU swaps → structural changes → freelist cleanup via WithReadExclusiveLock)
  • FetchSubscribers: Updated from channels_->find() to channels_.FindIf() / patterns_.ForEachShared() with per-shard locking note
  • Keyspace Events: ServerState::tlocal()->channel_store() → global channel_store pointer
  • Slot Migration: ApplyAndUnsubscribe() → RemoveAllSubscribers() using Mutate() + exclusive read_mu_
  • Key Files Reference: Removed ServerState channel_store_ row, added ShardedHashMap row
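
To make the per-shard locking model in the list above concrete, here is a minimal sketch of a sharded map with one `write_mu`/`read_mu` pair per shard. It is not the code in `src/core/sharded_hash_map.h`: the class name `ShardedMapSketch`, the `ShardIndex` helper, and the exact lock order are assumptions for illustration, and it only covers the structural-change granularity (the pointer-swap path is left out).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical sketch; names and lock order are assumptions, not Dragonfly's API.
template <typename V, std::size_t kNumShards = 16>
class ShardedMapSketch {
 public:
  // Reader path: takes a shared lock on one shard only.
  // Returns true if the key exists and pred(value) is true.
  template <typename Pred>
  bool FindIf(std::string_view key, Pred pred) const {
    const Shard& shard = shards_[ShardIndex(key)];
    std::shared_lock<std::shared_mutex> read_lock(shard.read_mu);
    // Simplification: this builds a temporary std::string for the lookup.
    auto it = shard.map.find(std::string(key));
    return it != shard.map.end() && pred(it->second);
  }

  // Structural change: serialize writers on write_mu, then hold read_mu
  // exclusively so no reader of this shard observes a half-updated table.
  template <typename Fn>
  void Mutate(std::string_view key, Fn fn) {
    Shard& shard = shards_[ShardIndex(key)];
    std::lock_guard<std::mutex> write_lock(shard.write_mu);
    std::unique_lock<std::shared_mutex> exclusive(shard.read_mu);
    fn(shard.map);
  }

  static std::size_t ShardIndex(std::string_view key) {
    return std::hash<std::string_view>{}(key) % kNumShards;
  }

 private:
  struct Shard {
    std::mutex write_mu;
    mutable std::shared_mutex read_mu;
    std::unordered_map<std::string, V> map;
  };
  std::array<Shard, kNumShards> shards_;
};

// Usage: one structural insert, then two lookups.
inline void DemoSubscribeThenFind() {
  ShardedMapSketch<int> channels;
  channels.Mutate("news", [](auto& map) { map["news"] = 3; });
  assert(channels.FindIf("news", [](int count) { return count == 3; }));
  assert(!channels.FindIf("sports", [](int) { return true; }));
}
```

Because each key hashes to exactly one shard, a writer touching one channel blocks readers of that shard only; lookups in the other shards proceed unaffected.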

Contributor

mkaruza commented Apr 29, 2026

@glevkovich These are AI-generated documentation changes based on the current state of the pubsub implementation (with the sharded hash map). Could you please take a look, so that we have a second reviewer?

mkaruza requested review from glevkovich and mkaruza April 29, 2026 09:07
mkaruza marked this pull request as ready for review April 29, 2026 09:18
Copilot AI review requested due to automatic review settings April 29, 2026 09:18
Contributor

Copilot AI left a comment

Pull request overview

Updates docs/pub-sub.md to match the post-95305b6 Pub/Sub subscription registry implementation, which replaced the old RCU ChannelStore design with a ShardedHashMap-backed, shard-locked model.

Changes:

  • Rewrites the Pub/Sub registry overview from centralized RCU snapshots to per-shard locking + RCU-style SubscribeMap* pointer swaps.
  • Updates the data-structure descriptions (adds ShardedHashMap, removes ControlBlock) and refreshes Apply/FetchSubscribers flows.
  • Updates ancillary sections (keyspace events, slot migration) to reflect the global channel_store and new removal paths.

Comment thread docs/pub-sub.md
Comment on lines +6 to +8
the RCU-style pointer swap used for lock-free reads on the publish path, the asynchronous
message delivery pipeline, and the backpressure system that protects the server from
slow-subscriber OOM.

Copilot AI Apr 29, 2026

The intro says the RCU pointer swap enables “lock-free reads on the publish path”, but FetchSubscribers takes a per-shard shared lock (ShardedHashMap::FindIf/ForEachShared acquire read_mu_). Consider rewording to reflect that reads are concurrent but not lock-free (shared read_mu_ + atomic UpdatablePointer::Get()).

Suggested change

Before:
the RCU-style pointer swap used for lock-free reads on the publish path, the asynchronous
message delivery pipeline, and the backpressure system that protects the server from
slow-subscriber OOM.

After:
the RCU-style pointer swap and per-shard shared locking used for concurrent reads on the
publish path, the asynchronous message delivery pipeline, and the backpressure system that
protects the server from slow-subscriber OOM.
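
The distinction drawn in this comment (the atomic load is lock-free, the read path as a whole is not) can be sketched as follows. This is an illustration only: `UpdatablePointerSketch`, `SubscribeMapSketch`, and `CountSubscribers` are hypothetical stand-ins, and the real `SubscribeMap` maps `ConnectionContext*` to `ThreadId`.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <map>
#include <shared_mutex>

// Hypothetical stand-in for SubscribeMap (subscriber id -> owning thread id).
using SubscribeMapSketch = std::map<int, unsigned>;

class UpdatablePointerSketch {
 public:
  explicit UpdatablePointerSketch(SubscribeMapSketch* sm) : ptr_(sm) {
    assert(sm != nullptr);
  }

  // Lock-free atomic load; pairs with the release store in Set().
  SubscribeMapSketch* Get() const { return ptr_.load(std::memory_order_acquire); }

  // Publishes a fully built replacement map to later readers.
  void Set(SubscribeMapSketch* sm) {
    assert(sm != nullptr);
    ptr_.store(sm, std::memory_order_release);
  }

 private:
  std::atomic<SubscribeMapSketch*> ptr_;
};

// Reader path: the shared lock comes first and can block behind an exclusive
// holder, so the path is concurrent but not lock-free. Only Get() is lock-free.
std::size_t CountSubscribers(std::shared_mutex& read_mu,
                             const UpdatablePointerSketch& channel) {
  std::shared_lock<std::shared_mutex> lock(read_mu);
  return channel.Get()->size();
}
```

Many readers can hold `read_mu` in shared mode at once, which is why "concurrent reads" describes the publish path more accurately than "lock-free reads".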

Comment thread docs/pub-sub.md
Comment on lines +48 to +49
| `ShardedHashMap` | `src/core/sharded_hash_map.h` | Thread-safe hash map split into 16 shards, each with independent `write_mu_` (Mutex) and `read_mu_` (SharedMutex). |
| `ChannelStore::ChannelMap` | `src/server/channel_store.h` | `ShardedHashMap<string, UpdatablePointer, 16>` — maps channel/pattern names to subscriber lists across 16 independently-locked shards. |

Copilot AI Apr 29, 2026

The ShardedHashMap type itself is generic (default NUM_SHARDS=32), so describing it as “split into 16 shards” is only true for the ChannelStore::ChannelMap instantiation. Also, ChannelMap is defined with a transparent hash/equality (StringViewHash, std::equal_to<>) to support std::string_view lookups used by FindIf; consider reflecting that in the type/description to avoid implying the simpler 3-parameter template.

Suggested change

Before:
| `ShardedHashMap` | `src/core/sharded_hash_map.h` | Thread-safe hash map split into 16 shards, each with independent `write_mu_` (Mutex) and `read_mu_` (SharedMutex). |
| `ChannelStore::ChannelMap` | `src/server/channel_store.h` | `ShardedHashMap<string, UpdatablePointer, 16>` — maps channel/pattern names to subscriber lists across 16 independently-locked shards. |

After:
| `ShardedHashMap` | `src/core/sharded_hash_map.h` | Generic thread-safe sharded hash map template with independently locked shards (`write_mu_` and `read_mu_`); shard count depends on the template instantiation. |
| `ChannelStore::ChannelMap` | `src/server/channel_store.h` | `ShardedHashMap<string, UpdatablePointer, 16, StringViewHash, std::equal_to<>>` — maps channel/pattern names to subscriber lists across 16 independently locked shards and supports transparent `std::string_view` lookups. |

augmentcode Bot commented Apr 29, 2026

🤖 Augment PR Summary

Summary: Updates the Pub/Sub architecture documentation to match the current shard-locked ChannelStore implementation (post move from the prior RCU snapshot design).

Changes:

  • Replaces the old centralized-RCU/thread-local snapshot narrative with a `ShardedHashMap`-backed `ChannelStore` using per-shard `write_mu_` and `read_mu_` locks.
  • Refreshes the “Primary Data Structures” section to reflect `ShardedHashMap`, `UpdatablePointer`, and the new `ChannelMap` shape.
  • Rewrites subscription update semantics as two granularities: per-channel RCU pointer swaps vs structural insert/erase under `read_mu_` exclusive.
  • Documents the current `ChannelStoreUpdater::Apply()` behavior as a per-shard `Mutate()` workflow with a post-pass freelist cleanup under `WithReadExclusiveLock()`.
  • Updates `FetchSubscribers` routing details to use `channels_.FindIf(...)` and `patterns_.ForEachShared(...)` with per-shard locking notes.
  • Clarifies that `FetchSubscribers` is not a global snapshot since shards are locked independently.
  • Updates keyspace event publishing to use the global `channel_store` pointer.
  • Updates cluster slot migration behavior to the current “collect → remove → notify” flow using `RemoveAllSubscribers()`.

Technical Notes: The doc now emphasizes per-shard concurrency (independent locks per shard), RCU-style pointer swaps for subscriber maps, and safe reclamation via exclusive read_mu_ to drain in-flight readers.
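
The reclamation idea in these notes (swap the pointer, then take `read_mu_` exclusively to drain in-flight readers before freeing the old map) might look roughly like the sketch below. Everything here is hypothetical: `ChannelSlotSketch`, `SubscriberSet`, and the method names are invented for illustration, and writers are assumed to be serialized by an outer `write_mu_` that is not shown.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <set>
#include <shared_mutex>
#include <vector>

using SubscriberSet = std::set<int>;  // hypothetical stand-in for SubscribeMap

struct ChannelSlotSketch {
  std::shared_mutex read_mu;  // readers: shared; reclamation: exclusive
  std::atomic<SubscriberSet*> subscribers{new SubscriberSet};
  std::vector<std::unique_ptr<SubscriberSet>> freelist;  // retired, not yet freed

  ~ChannelSlotSketch() { delete subscribers.load(); }

  // Copy, modify the copy, publish it. A reader that already loaded the old
  // pointer keeps using a valid map because the old one is only retired here.
  // Assumption: callers are serialized by an outer writer mutex.
  void AddSubscriber(int id) {
    SubscriberSet* old_set = subscribers.load(std::memory_order_acquire);
    auto next = std::make_unique<SubscriberSet>(*old_set);
    next->insert(id);
    subscribers.store(next.release(), std::memory_order_release);
    freelist.emplace_back(old_set);
  }

  // Acquiring read_mu exclusively waits for every in-flight shared holder,
  // so no reader can still be using a retired map when it is deleted.
  std::size_t Reclaim() {
    std::unique_lock<std::shared_mutex> drain(read_mu);
    const std::size_t freed = freelist.size();
    freelist.clear();
    return freed;
  }

  bool IsSubscribed(int id) {
    std::shared_lock<std::shared_mutex> lock(read_mu);
    const SubscriberSet* current = subscribers.load(std::memory_order_acquire);
    assert(current != nullptr);  // invariant: a map is always published
    return current->count(id) > 0;
  }
};
```

Deferring deletion to a single `Reclaim()` pass keeps the exclusive lock off the common subscribe path; readers are blocked only for the brief cleanup step.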

@augmentcode augmentcode Bot left a comment

Review completed. 4 suggestions posted.

Comment thread docs/pub-sub.md
on the publish path, the asynchronous message delivery pipeline, and the backpressure system
that protects the server from slow-subscriber OOM.
subscription registry backed by a `ShardedHashMap` with fine-grained per-shard locking,
the RCU-style pointer swap used for lock-free reads on the publish path, the asynchronous

@augmentcode augmentcode Bot Apr 29, 2026

docs/pub-sub.md:6: This mentions “lock-free reads on the publish path”, but FetchSubscribers does take read_mu_ in shared mode via ShardedHashMap::FindIf/ForEachShared. Consider rephrasing to avoid implying the publish path is fully lock-free (it’s lock-free only for the UpdatablePointer load).

Severity: low

Comment thread docs/pub-sub.md
| `ChannelStore::Subscriber` | `src/server/channel_store.h` | Represents a subscribed client. Wraps `facade::ConnectionRef` plus a pattern string. |
| `ChannelStore::ControlBlock` | `src/server/channel_store.h` | Holds the `most_recent` atomic pointer and `update_mu` mutex. Prevents overlapping structural updates. |
| `ChannelStore::ChannelMap` | `src/server/channel_store.h` | `flat_hash_map<string, UpdatablePointer>` — maps channel/pattern names to subscriber lists. |
| `ShardedHashMap` | `src/core/sharded_hash_map.h` | Thread-safe hash map split into 16 shards, each with independent `write_mu_` (Mutex) and `read_mu_` (SharedMutex). |

@augmentcode augmentcode Bot Apr 29, 2026

docs/pub-sub.md:48: ShardedHashMap is templated on the shard count (default is 32 in sharded_hash_map.h), while ChannelStore uses 16. Consider wording this row as “N shards (ChannelStore uses 16)” to avoid implying the container is always 16-way sharded.

Severity: low

Comment thread docs/pub-sub.md
| `ChannelStore::ControlBlock` | `src/server/channel_store.h` | Holds the `most_recent` atomic pointer and `update_mu` mutex. Prevents overlapping structural updates. |
| `ChannelStore::ChannelMap` | `src/server/channel_store.h` | `flat_hash_map<string, UpdatablePointer>` — maps channel/pattern names to subscriber lists. |
| `ShardedHashMap` | `src/core/sharded_hash_map.h` | Thread-safe hash map split into 16 shards, each with independent `write_mu_` (Mutex) and `read_mu_` (SharedMutex). |
| `ChannelStore::ChannelMap` | `src/server/channel_store.h` | `ShardedHashMap<string, UpdatablePointer, 16>` — maps channel/pattern names to subscriber lists across 16 independently-locked shards. |

@augmentcode augmentcode Bot Apr 29, 2026

docs/pub-sub.md:49: The ChannelMap alias in code includes StringViewHash + std::equal_to<> for heterogeneous lookup (e.g. std::string_view in FindIf). Consider either spelling out the full alias or noting the transparent hash/equality requirement so the later FindIf(channel, ...) examples stay accurate.

Severity: low

Comment thread docs/pub-sub.md
| `ShardedHashMap` | `src/core/sharded_hash_map.h` | Thread-safe hash map split into 16 shards, each with independent `write_mu_` (Mutex) and `read_mu_` (SharedMutex). |
| `ChannelStore::ChannelMap` | `src/server/channel_store.h` | `ShardedHashMap<string, UpdatablePointer, 16>` — maps channel/pattern names to subscriber lists across 16 independently-locked shards. |
| `ChannelStore::SubscribeMap` | `src/server/channel_store.h` | `flat_hash_map<ConnectionContext*, ThreadId>` — maps subscriber contexts to their owning thread. |
| `ChannelStore::UpdatablePointer` | `src/server/channel_store.h` | Atomic wrapper around `SubscribeMap*`. Supports lock-free reads (`acquire`) and RCU-style swaps (`release`). |

@augmentcode augmentcode Bot Apr 29, 2026

docs/pub-sub.md:51: Similar to the intro, “lock-free reads” is a bit ambiguous here since readers still acquire read_mu_ shared; only the UpdatablePointer::Get() atomic load itself is lock-free. Consider tightening the wording to reflect that distinction.

Severity: low

@glevkovich
Contributor

@mkaruza I have 3 comments about this PR:

  1. SVG diagrams

I think this one is up to date (it hasn't changed):
./docs/pubsub/pubsub_asyncfiber_loop_dispatch_and_pipeline_processing.svg

Stale:
./docs/pubsub/pubsub_apply.svg
./docs/pubsub/pubsub_data_structure_layout.svg
./docs/pubsub/pubsub_data_flow_overview.svg

We need to update the SVGs or remove the images until they're updated. Those 3 are misleading after the change.

  2. nit - "lock-free reads" in Data Structures table

This entry:
ChannelStore::UpdatablePointer src/server/channel_store.h Atomic wrapper around SubscribeMap*. Supports lock-free reads (acquire) and RCU-style swaps (release).

The atomic load in Get() is lock-free, but every call site holds read_mu_ shared via FindIf/ForEachShared. Consider "atomic reads (acquire)" or "low-contention reads (shared read_mu_ + atomic acquire)" to avoid implying no lock is involved.

  3. nit - ChannelMap type incomplete (in Data Structures table)

This entry:
ChannelStore::ChannelMap src/server/channel_store.h ShardedHashMap<string, UpdatablePointer, 16> — sharded map of channel/pattern names to subscriber lists.

The table shows ShardedHashMap<string, UpdatablePointer, 16> but the actual typedef is
ShardedHashMap<std::string, UpdatablePointer, 16, StringViewHash, std::equal_to<>>
with heterogeneous lookup. Worth mentioning since it's what enables FindIf(string_view, ...) without allocating a std::string.
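
For readers unfamiliar with heterogeneous lookup, here is a standard-library analogue of the idea, not the Dragonfly typedef itself. It uses an ordered `std::map` with the transparent comparator `std::less<>`; hashed containers follow the same pattern with a transparent hash plus `std::equal_to<>`, which is what the typedef above spells out. `ChannelIndexSketch` and `SubscriberCount` are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <string_view>

// std::less<> is a transparent comparator, so find() accepts any key type
// that is comparable with std::string, including std::string_view.
using ChannelIndexSketch = std::map<std::string, int, std::less<>>;

// Looks up a channel by string_view; no temporary std::string is constructed.
int SubscriberCount(const ChannelIndexSketch& index, std::string_view channel) {
  auto it = index.find(channel);
  return it == index.end() ? 0 : it->second;
}

// Usage: keys are stored as std::string but queried as std::string_view.
inline void HeterogeneousLookupDemo() {
  ChannelIndexSketch index{{"news", 2}};
  assert(SubscriberCount(index, std::string_view("news")) == 2);
  assert(SubscriberCount(index, "sports") == 0);
}
```

With a non-transparent comparator (the default `std::less<std::string>`), the same `find` call would first have to build a `std::string` from the view, which is the allocation the comment refers to.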
