feat(discovery): fan out "Scan your network" nmap to connected fleet nodes#365
feat(discovery): fan out "Scan your network" nmap to connected fleet nodes#365ankitgoswami wants to merge 13 commits into
Conversation
…inline
Extend the existing "Find miners / Scan your network" flow to also discover
and pair miners behind remote fleet nodes, with no client/UI changes. The
existing client keeps calling Discover (nmap) and Pair; the server now does
the fan-out and routing transparently.
- New domain/fleetnode/discovery package owns the per-node "send command +
drain batches until ack" loop (RunOnNode) and node targeting
(ConfirmedConnectedNodeIDs). The admin DiscoverOnFleetNode handler now
delegates to it; adds control.Registry.ConnectedFleetNodeIDs().
- PairingService.Discover fans an nmap scan out to every CONFIRMED + connected
fleet node in parallel, merging all sources into one response stream
(mutex-guarded send + cross-source dedup). Zero connected nodes == unchanged
behavior; one node failing never fails the scan.
- Fleet nodes scan their own private IPv4 subnet(s) via the reserved
nmaptarget.LocalSubnetTarget ("fleetnode-local-subnet") target sentinel,
since the cloud can't know each node's network. Detection narrows masks
wider than /22, dedupes, caps at 8 subnets, and is IPv4-only.
- PairingService.Pair routes fleet-node-discovered devices to the
operator-confirmed ownership assignment (FleetNodeAssigner) instead of
refusing them, so mixed selections pair in one action. Metadata only; the
owning node dials and credentials the miner per RFC 0001.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a309f6cf26
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
🔐 Codex Security Review
Review SummaryOverall Risk: MEDIUM Findings[MEDIUM] Local-subnet fleet-node discovery accepts reports from any private address
[MEDIUM] Fleet-node fan-out launches unbounded per-node discovery work
NotesNo cryptostealing/pool-hijack, SQL injection, protobuf wire-format, or obvious auth bypass issues were found in the scoped diff. The new fan-out path is gated on both Generated by Codex Security Review | |
There was a problem hiding this comment.
Pull request overview
This PR extends the server-side “Find miners → Scan your network” flow to fan out Nmap discovery to connected + CONFIRMED fleet nodes, merge those results into the existing Discover stream, and enable inline pairing of fleet-node-discovered devices via an optional assigner interface—without any client/proto changes.
Changes:
- Introduces
domain/fleetnode/discoveryto own the per-node dispatch/drain loop (RunOnNode) and targeting (ConfirmedConnectedNodeIDs), reused by both the admin RPC and pairing handler fan-out. - Updates
PairingService.Discoverto merge cloud discovery + fleet-node local-subnet scans into one stream with serializedSend+ cross-source dedupe. - Updates
PairDevicesto route fleet-node-origin devices to aFleetNodeAssigner(wired tofleetnode/pairing.Service) instead of refusing them.
Reviewed changes
Copilot reviewed 22 out of 23 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| server/internal/testutil/infrastructure_provider.go | Updates test harness to construct pairing handler with nil discovery fan-out. |
| server/internal/handlers/pairing/handler.go | Fans out Nmap discovery to connected fleet nodes and merges results into one stream. |
| server/internal/handlers/fleetnode/admin/handler.go | Refactors admin single-node discovery to delegate to discovery service. |
| server/internal/handlers/fleetnode/admin/handler_pairing_test.go | Wires discovery service into admin handler tests. |
| server/internal/handlers/fleetnode/admin/handler_discover_test.go | Updates timeout override to new discovery package variable. |
| server/internal/domain/pairing/service.go | Adds FleetNodeAssigner and routes remote-origin devices through assignment. |
| server/internal/domain/pairing/service_pairrouting_test.go | Adds unit tests for fleet-node pairing routing and partial success. |
| server/internal/domain/pairing/mocks/mock_service.go | Regenerates mocks to include MockFleetNodeAssigner. |
| server/internal/domain/nmaptarget/nmaptarget.go | Adds reserved LocalSubnetTarget sentinel constant. |
| server/internal/domain/fleetnode/discovery/service.go | New shared discovery dispatch/drain service for fleet nodes. |
| server/internal/domain/fleetnode/discovery/service_test.go | Unit tests for RunOnNode behavior and node targeting. |
| server/internal/domain/fleetnode/discovery/reportscope.go | Moves/report-scope logic into discovery package and handles sentinel target. |
| server/internal/domain/fleetnode/discovery/reportscope_test.go | Updates report-scope tests and adds sentinel coverage. |
| server/internal/domain/fleetnode/discovery/iprange_test.go | Moves IP range tests under discovery package. |
| server/internal/domain/fleetnode/discovery/ackfailure_test.go | Moves ack-failure tests under discovery package. |
| server/internal/domain/fleetnode/control/registry.go | Adds ConnectedFleetNodeIDs() for fan-out targeting. |
| server/internal/domain/fleetnode/control/registry_test.go | Adds tests for ConnectedFleetNodeIDs(). |
| server/cmd/fleetnode/run.go | Adds test seam for local subnet detection in the agent command runner. |
| server/cmd/fleetnode/nmap.go | Handles LocalSubnetTarget by detecting local subnets and scanning those CIDRs. |
| server/cmd/fleetnode/nmap_test.go | Adds tests for LocalSubnetTarget option building and incapability handling. |
| server/cmd/fleetnode/localsubnet.go | New local subnet selection logic (filter/dedupe/cap/narrow). |
| server/cmd/fleetnode/localsubnet_test.go | Unit tests for local subnet detection behavior. |
| server/cmd/fleetd/main.go | Wires discovery service into pairing handler and admin handler; wires pairing assigner. |
Files not reviewed (1)
- server/internal/domain/pairing/mocks/mock_service.go: Language not supported
The auto_local_subnet bool was replaced by the LocalSubnetTarget sentinel; update the comments and two test names that still referenced the old field. No behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… Pair routing - Revert Part C: the cloud Pair routing passed discovered_device.id where fleetnode/pairing.PairDevice expects a device.id, and fleet-node discoveries have no device row to assign (the cloud can't dial them to create one, and reports carry no MAC). Restore the safe refusal; inline pairing of fleet-node miners needs the node-side device-creation flow (follow-up). - Gate the fleet-node fan-out to the automatic "Scan your network" action (nmap target == the cloud's own local subnet) so a manual/explicit nmap scan no longer also sweeps every connected node's LAN. Adds Service.IsLocalSubnetScan. - Cap the agent's local-subnet auto-scan to a total host budget (discoverylimits.MaxScanTargets) across all detected subnets, not just /22 per subnet, so a multi-homed node can't 8x-oversize one command. - Map caller cancellation to CodeCanceled in RunOnNode and in Discover (was Internal / silent nil), and suppress the fan-out WARN when the stream is already cancelled (expected operator disconnect). - Fix the LocalSubnetTarget comment: IPv4 private space is RFC1918, not RFC4193. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the agent's bespoke per-NIC private-subnet enumeration with networking.GetLocalNetworkInfo() — the same primary-interface detection the cloud/combined Discover path already uses, so the node scans the same way combined mode does. Intentionally less robust (picks one interface, no virtual-NIC filtering, no /22 narrowing or total-host cap); parity is enough for now and hardening is a follow-up. Removes selectLocalPrivateSubnets and its tests. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
After the combined-mode parity change, localsubnet.go held only errNoLocalSubnet and a small detectLocalSubnets wrapper used solely by buildNmapOptions. Move both next to their caller in nmap.go and drop the now-empty file. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… dedup key, tests) From the multi-agent code review (no correctness defects found): - Cap fan-out concurrency: a maxConcurrentFleetNodeScans (32) semaphore bounds in-flight per-node ControlStream commands so a large fleet can't spawn an unbounded number of slots; goroutines acquire via a streamCtx-aware select so they exit on operator disconnect. - Extract pairing.DeviceDedupKey and use it in both dedupeDiscoverResponses and the Discover handler's cross-source send closure, so the dedup identity stays in lockstep instead of being re-derived. - Add service-level tests: RunOnNode treats an onBatch error as terminal, RunOnNode returns CodeDeadlineExceeded when the agent never acks, and ConfirmedConnectedNodeIDs propagates a ListFleetNodes error. - Minor comment polish (detectLocalSubnets doc, DiscoverOnFleetNode godoc). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Addresses the residual findings from the multi-agent review: - #4/#7: DiscoverWithNmap now reports whether the target is the cloud's own local subnet (resolveNmapTargets already computed it); the Discover handler gates fan-out on that return value instead of a second GetLocalNetworkInfo call. Removes IsLocalSubnetScan and the duplicate ip-route fork. - #1: extract the mutex-guarded dedup/serialize send into dedupForwarder and unit-test it (cross-source dedup, ip:port fallback, all-duplicate drop, send-error cancel, concurrency under -race). - #3: bound the fan-out with fleetNodeFanOutTimeout (5m) so one wedged node can't extend the operator's wait to the full per-node 12m budget. - #8: discovery.Service depends on a narrow nodeRegistry interface instead of *control.Registry (mirrors the nodeLister idiom; eases testing). - #13: control.Registry.mu is now sync.RWMutex; ConnectedFleetNodeIDs and ReportScopeFor take read locks. - #12: test that the LocalSubnetTarget sentinel dispatches through RunOnNode (the shared single-node + fan-out path). Plus service-level tests for the new DiscoverWithNmap/resolveNmapTargets bool. No behavior change for cloud-only discovery or existing single-node discovery. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Compress the fan-out comments and const docs to the load-bearing "why" — no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Peel back machinery added reactively to low-priority review findings, none of which earns its complexity at the actual fleet scale: - Remove fleetNodeFanOutTimeout: it was a third timeout layer (atop the agent's 10m and RunOnNode's 12m) with an arbitrary value that second-guessed the agent's own budget. Discover streams incrementally, so a wedged node only delays the stream close, not results. RunOnNode's per-node timeout is the single, sufficient bound. - Remove the maxConcurrentFleetNodeScans semaphore: inert at typical fleet sizes (tens of nodes); the per-node timeout already bounds each node. - Revert control.Registry to sync.Mutex: the RWMutex was a P3 micro-opt for contention that doesn't exist at this scale and added lock-discipline risk. No behavior change for the realistic path; fan-out goroutines run on streamCtx and remain bounded by RunOnNode's per-node timeout. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ConnectedFleetNodeIDs iterates a map (order unspecified), so assert the ConfirmedConnectedNodeIDs result with ElementsMatch rather than Equal to match the contract and avoid future flakiness. Also tighten the RunOnNode handleEvent comment to the non-obvious terminal-with-nil-err semantic. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
PairingService.Discover fan-out enumerated all confirmed connected fleet nodes and issued discovery commands over their control streams while gated only by miner:pair. The single-node DiscoverOnFleetNode path requires fleetnode:manage, so this was a weaker authorization path to the same fleet-node command surface. Fan-out now also requires fleetnode:manage via a non-failing check, so miner:pair-only callers still get cloud-only discovery but cannot drive discovery commands on fleet nodes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ada7392f4f
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
The local-subnet sentinel passed detectLocalSubnets() output straight to nmap. That detection reuses primary-interface logic that doesn't filter for RFC1918, so a node whose primary NIC is public could have that subnet scanned by an automatic "Scan your network". The server drops non-private reports, but only after the node has already sent the scan traffic. Validate each detected prefix is a private IPv4 subnet before scanning and fail AGENT_INCAPABLE otherwise, so a fan-out skips such a node. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 35dc9be7d6
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
The local-subnet sentinel validated detected subnets were private but not their breadth, so a node whose primary NIC has a prefix broader than the /22 nmap limit (e.g. 10.0.0.0/16, returned directly from the OS interface mask) could be made to sweep tens of thousands of hosts per fan-out, defeating the cap that applies to operator-supplied targets. validateLocalSubnetTarget now also rejects prefixes broader than nmaptarget.MinIPv4PrefixBits (AGENT_INCAPABLE), mirroring nmaptarget.Validate. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c87b702258
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Summary
Extends the existing Find miners → Scan your network action so the server, in addition to its own scan, fans an nmap sweep out to every CONFIRMED + connected fleet node (each scans its own local subnet and reports back), merging all results into the existing
Discoverstream — no client/proto changes.What changed (server-side)
domain/fleetnode/discovery(new) — shared per-node dispatch/drain loop (RunOnNode) and targeting (ConfirmedConnectedNodeIDs= CONFIRMED ∩ connected). The adminDiscoverOnFleetNodehandler delegates to it. Addscontrol.Registry.ConnectedFleetNodeIDs().Service.IsLocalSubnetScan, the same signal that drives known-subnet expansion), i.e. the "Scan your network" button. A manual/explicit nmap target stays cloud-only. Sources merge into one stream via a serialized, cross-source-deduped send; zero connected nodes ⇒ unchanged behavior; one node failing never fails the scan.nmaptarget.LocalSubnetTargetsentinel. Detection filters virtual/loopback interfaces, narrows masks wider than /22, and caps the total swept addresses across all detected subnets atdiscoverylimits.MaxScanTargets(not just per-subnet).CodeCanceled/CodeDeadlineExceeded(not Internal or a silent nil success); fan-out node failures stay quiet on operator disconnect.Testing
go build ./...,go vet ./...,golangci-lint(full server): clean.RunOnNode(forward/partial/disconnect/cancel),ConfirmedConnectedNodeIDs,ConnectedFleetNodeIDs,IsLocalSubnetScan, auto report-scope/normalize,buildNmapOptionssentinel path.🤖 Generated with Claude Code