fix: subscribe read nodes to MEMPOOL_TOPIC#866
Conversation
When PR #349 enabled submit_message on read nodes, only the RPC-layer guard was removed. The gossip-layer subscription for MEMPOOL_TOPIC was left as a validator-only branch, so read nodes never joined the mempool mesh: client submissions accepted via RPC fell back to libp2p-gossipsub fanout (best-effort, TTL'd, not a useful relay), and inbound mempool gossip from peers was dropped wholesale. Hoist the MEMPOOL_TOPIC subscription out of the validator-only branch so both node types participate in the mempool mesh, and add a regression test that a read node receives mempool messages published by a validator peer. Closes #865 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
There was a problem hiding this comment.
Pull request overview
This PR fixes a network gossip regression where read nodes did not subscribe to MEMPOOL_TOPIC, preventing them from joining the mempool gossipsub mesh and receiving inbound mempool gossip (impacting submit-to-read-node propagation).
Changes:
- Subscribe all node types (validators and read nodes) to
MEMPOOL_TOPICduring gossip initialization. - Add a regression test ensuring a read node receives mempool gossip published by a validator peer.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
src/network/gossip.rs |
Hoists MEMPOOL_TOPIC subscription out of the validator-only branch so read nodes also subscribe and receive mempool gossip. |
src/network/gossip_tests.rs |
Adds a regression test verifying a read node delivers mempool gossip to its system channel. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Diff CoverageDiff: origin/main...HEAD, staged and unstaged changes
Summary
src/network/gossip.rs |
Replace the fixed-1s-sleep + single-publish pattern with a publisher loop and a poll-on-receive loop bounded by a deadline. Until the gossipsub mesh forms, `publish` can fail with InsufficientPeers and is silently logged-and-dropped, so a single fire-and-forget publish was fragile. Repeat sends are safe because gossipsub message-id dedup ensures the read node receives the message at most once. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Address review feedback on the regression test (warn-log wording, break publisher loop on send-channel close), then close two observability gaps the original test didn't cover: 1. Multi-hop mesh participation — `test_read_node_relays_mempool_gossip` wires three nodes (read1 ↔ read2 ↔ validator) with no direct read1↔validator bootstrap and autodiscovery left off. For the cast to land at the validator, gossipsub has to forward via read2's mesh. Distinguishes mesh from fanout: in the bug scenario read2 isn't subscribed, read1 has no direct subscribed peer, fanout has nothing to fall back to, and the publish never leaves read1. 2. End-to-end RPC ingress — `test_read_node_relays_rpc_submission_to_validators` submits a cast through a read node's mempool with `MempoolSource::RPC` (the same path `MyHubService::submit_message` uses for client gRPC) and asserts the cast reaches a validator's committed state via the gossip leg. Two test-infra fixes were necessary along the way: - `ReadNodeForTest::create` now spins up a `ReadNodeMempool`, exposes a `submit_message_via_mempool` helper, and runs gossip with `read_node=true` (it was running as a validator, which masked the bug we're trying to test). - `NodeForTest`'s system-message router now forwards `SystemMessage::Mempool` into the local mempool — matching `main.rs`. Without it, gossip-received mempool messages were dropped on the validator receive side, so mempool gossip propagation was effectively untested in the consensus suite (casts were reaching peers via consensus, not gossip). Refactor: factor a `publish_until_received` helper out of the 2-node regression test and reuse it for the new 3-node test — both retry publishing in a tight loop until the gossipsub mesh is established or a deadline elapses, and break the publisher loop early when the gossip channel closes (so a dead gossip task fails fast instead of burning the full deadline). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
MEMPOOL_TOPIC, so they couldn't join the mempool gossip mesh after chore: enable submit to read nodes #349 enabledsubmit_messageon them. This made every client submission apublishover the topic-fanout fallback (best-effort, TTL'd, lossy under poor connectivity), and dropped all inbound mempool gossip wholesale.MEMPOOL_TOPICsubscription out of the validator-only branch so both node types subscribe to it.test_read_node_receives_mempool_gossip) that asserts a read node delivers mempool messages published by a validator peer to its system channel.Test plan
cargo check --testspassescargo test --lib network::gossip_testspasses (3/3, including the new regression test)🤖 Generated with Claude Code