Skip to content

feat(discovery): fan out "Scan your network" nmap to connected fleet nodes#365

Open
ankitgoswami wants to merge 13 commits into
mainfrom
ankitg/discovery-pairing-fanout
Open

feat(discovery): fan out "Scan your network" nmap to connected fleet nodes#365
ankitgoswami wants to merge 13 commits into
mainfrom
ankitg/discovery-pairing-fanout

Conversation

@ankitgoswami
Copy link
Copy Markdown
Contributor

@ankitgoswami ankitgoswami commented Jun 2, 2026

Summary

Extends the existing Find miners → Scan your network action so the server, in addition to its own scan, fans an nmap sweep out to every CONFIRMED + connected fleet node (each scans its own local subnet and reports back), merging all results into the existing Discover stream — no client/proto changes.

radar sweep

What changed (server-side)

  • domain/fleetnode/discovery (new) — shared per-node dispatch/drain loop (RunOnNode) and targeting (ConfirmedConnectedNodeIDs = CONFIRMED ∩ connected). The admin DiscoverOnFleetNode handler delegates to it. Adds control.Registry.ConnectedFleetNodeIDs().
  • Discover fan-out, gated to the automatic scan — fan-out runs only when the nmap target is the cloud's own local subnet (Service.IsLocalSubnetScan, the same signal that drives known-subnet expansion), i.e. the "Scan your network" button. A manual/explicit nmap target stays cloud-only. Sources merge into one stream via a serialized, cross-source-deduped send; zero connected nodes ⇒ unchanged behavior; one node failing never fails the scan.
  • Auto local-subnet scan — nodes scan their own private IPv4 subnet(s) via the reserved nmaptarget.LocalSubnetTarget sentinel. Detection filters virtual/loopback interfaces, narrows masks wider than /22, and caps the total swept addresses across all detected subnets at discoverylimits.MaxScanTargets (not just per-subnet).
  • Cancellation maps to CodeCanceled / CodeDeadlineExceeded (not Internal or a silent nil success); fan-out node failures stay quiet on operator disconnect.

Testing

  • go build ./..., go vet ./..., golangci-lint (full server): clean.
  • Unit: local-subnet detection (incl. total-host budget), RunOnNode (forward/partial/disconnect/cancel), ConfirmedConnectedNodeIDs, ConnectedFleetNodeIDs, IsLocalSubnetScan, auto report-scope/normalize, buildNmapOptions sentinel path.
  • DB-backed: refactored admin handler + pairing domain (incl. the preserved fleet-node-refusal test).

🤖 Generated with Claude Code

…inline

Extend the existing "Find miners / Scan your network" flow to also discover
and pair miners behind remote fleet nodes, with no client/UI changes. The
existing client keeps calling Discover (nmap) and Pair; the server now does
the fan-out and routing transparently.

- New domain/fleetnode/discovery package owns the per-node "send command +
  drain batches until ack" loop (RunOnNode) and node targeting
  (ConfirmedConnectedNodeIDs). The admin DiscoverOnFleetNode handler now
  delegates to it; adds control.Registry.ConnectedFleetNodeIDs().
- PairingService.Discover fans an nmap scan out to every CONFIRMED + connected
  fleet node in parallel, merging all sources into one response stream
  (mutex-guarded send + cross-source dedup). Zero connected nodes == unchanged
  behavior; one node failing never fails the scan.
- Fleet nodes scan their own private IPv4 subnet(s) via the reserved
  nmaptarget.LocalSubnetTarget ("fleetnode-local-subnet") target sentinel,
  since the cloud can't know each node's network. Detection narrows masks
  wider than /22, dedupes, caps at 8 subnets, and is IPv4-only.
- PairingService.Pair routes fleet-node-discovered devices to the
  operator-confirmed ownership assignment (FleetNodeAssigner) instead of
  refusing them, so mixed selections pair in one action. Metadata only; the
  owning node dials and credentials the miner per RFC 0001.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ankitgoswami ankitgoswami requested a review from a team as a code owner June 2, 2026 22:19
Copilot AI review requested due to automatic review settings June 2, 2026 22:19
@github-actions github-actions Bot added the server label Jun 2, 2026
@ankitgoswami ankitgoswami marked this pull request as draft June 2, 2026 22:22
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a309f6cf26

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread server/internal/handlers/pairing/handler.go Outdated
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 2, 2026

🔐 Codex Security Review

Note: This is an automated security-focused code review generated by Codex.
It should be used as a supplementary check alongside human review.
False positives are possible - use your judgment.

Scope summary

  • Reviewed pull request diff only (c7cfb03a64a41e1230011c82e29d3da54d30d853...c87b7022587c44736eda511ea4b9704ce8ab506f, exact PR three-dot diff)
  • Model: gpt-5.5

💡 Click "edited" above to see previous reviews for this PR.


Review Summary

Overall Risk: MEDIUM

Findings

[MEDIUM] Local-subnet fleet-node discovery accepts reports from any private address

  • Category: Network Discovery
  • Location: server/internal/domain/fleetnode/discovery/reportscope.go:33
  • Description: The new LocalSubnetTarget report scope cannot bind reports to the fleet node’s actual scanned subnet, so it accepts any private IP address on an allowed port. The agent-side scan is limited to a detected private IPv4 subnet, but the server-side admission check allows arbitrary RFC1918/RFC4193 addresses, including addresses outside that node’s subnet.
  • Impact: A compromised or buggy fleet node can poison discovery results with devices from unrelated private ranges, potentially claiming or refreshing bare discovered-device rows and misleading operators into pairing or managing the wrong endpoint.
  • Recommendation: Bind local-subnet reports to configured/approved per-fleet-node discovery CIDRs. If those do not exist yet, add an explicit node discovery-scope configuration or handshake metadata and reject reports outside that scope; also keep local-subnet nmap reports IPv4-only.

[MEDIUM] Fleet-node fan-out launches unbounded per-node discovery work

  • Category: Reliability
  • Location: server/internal/handlers/pairing/handler.go:112
  • Description: For local-subnet nmap discovery, the handler starts one goroutine and one discovery command per confirmed connected fleet node without a per-request cap or concurrency limit. Each node can run until RunOnNode’s 12-minute timeout.
  • Impact: A single permitted request can trigger scans across every connected fleet node and keep the server stream/goroutines open for minutes. In large deployments or repeated requests, this can create avoidable server load, tie up per-node command slots, and generate broad network scan traffic.
  • Recommendation: Add a bounded worker pool and a configurable maximum fan-out per request/org. Consider a shorter fan-out budget independent of the cloud scan, partial-result reporting, and per-org rate limiting for this operation.

Notes

No cryptostealing/pool-hijack, SQL injection, protobuf wire-format, or obvious auth bypass issues were found in the scoped diff. The new fan-out path is gated on both miner:pair and fleetnode:manage, which is the right permission boundary.


Generated by Codex Security Review |
Triggered by: @ankitgoswami |
Review workflow run

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends the server-side “Find miners → Scan your network” flow to fan out Nmap discovery to connected + CONFIRMED fleet nodes, merge those results into the existing Discover stream, and enable inline pairing of fleet-node-discovered devices via an optional assigner interface—without any client/proto changes.

Changes:

  • Introduces domain/fleetnode/discovery to own the per-node dispatch/drain loop (RunOnNode) and targeting (ConfirmedConnectedNodeIDs), reused by both the admin RPC and pairing handler fan-out.
  • Updates PairingService.Discover to merge cloud discovery + fleet-node local-subnet scans into one stream with serialized Send + cross-source dedupe.
  • Updates PairDevices to route fleet-node-origin devices to a FleetNodeAssigner (wired to fleetnode/pairing.Service) instead of refusing them.

Reviewed changes

Copilot reviewed 22 out of 23 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
server/internal/testutil/infrastructure_provider.go Updates test harness to construct pairing handler with nil discovery fan-out.
server/internal/handlers/pairing/handler.go Fans out Nmap discovery to connected fleet nodes and merges results into one stream.
server/internal/handlers/fleetnode/admin/handler.go Refactors admin single-node discovery to delegate to discovery service.
server/internal/handlers/fleetnode/admin/handler_pairing_test.go Wires discovery service into admin handler tests.
server/internal/handlers/fleetnode/admin/handler_discover_test.go Updates timeout override to new discovery package variable.
server/internal/domain/pairing/service.go Adds FleetNodeAssigner and routes remote-origin devices through assignment.
server/internal/domain/pairing/service_pairrouting_test.go Adds unit tests for fleet-node pairing routing and partial success.
server/internal/domain/pairing/mocks/mock_service.go Regenerates mocks to include MockFleetNodeAssigner.
server/internal/domain/nmaptarget/nmaptarget.go Adds reserved LocalSubnetTarget sentinel constant.
server/internal/domain/fleetnode/discovery/service.go New shared discovery dispatch/drain service for fleet nodes.
server/internal/domain/fleetnode/discovery/service_test.go Unit tests for RunOnNode behavior and node targeting.
server/internal/domain/fleetnode/discovery/reportscope.go Moves/report-scope logic into discovery package and handles sentinel target.
server/internal/domain/fleetnode/discovery/reportscope_test.go Updates report-scope tests and adds sentinel coverage.
server/internal/domain/fleetnode/discovery/iprange_test.go Moves IP range tests under discovery package.
server/internal/domain/fleetnode/discovery/ackfailure_test.go Moves ack-failure tests under discovery package.
server/internal/domain/fleetnode/control/registry.go Adds ConnectedFleetNodeIDs() for fan-out targeting.
server/internal/domain/fleetnode/control/registry_test.go Adds tests for ConnectedFleetNodeIDs().
server/cmd/fleetnode/run.go Adds test seam for local subnet detection in the agent command runner.
server/cmd/fleetnode/nmap.go Handles LocalSubnetTarget by detecting local subnets and scanning those CIDRs.
server/cmd/fleetnode/nmap_test.go Adds tests for LocalSubnetTarget option building and incapability handling.
server/cmd/fleetnode/localsubnet.go New local subnet selection logic (filter/dedupe/cap/narrow).
server/cmd/fleetnode/localsubnet_test.go Unit tests for local subnet detection behavior.
server/cmd/fleetd/main.go Wires discovery service into pairing handler and admin handler; wires pairing assigner.
Files not reviewed (1)
  • server/internal/domain/pairing/mocks/mock_service.go: Language not supported

Comment thread server/internal/handlers/pairing/handler.go Outdated
Comment thread server/internal/handlers/pairing/handler.go Outdated
Comment thread server/internal/domain/fleetnode/discovery/service.go
Comment thread server/cmd/fleetnode/run.go Outdated
Comment thread server/cmd/fleetnode/localsubnet.go Outdated
Comment thread server/internal/domain/nmaptarget/nmaptarget.go Outdated
ankitgoswami and others added 2 commits June 2, 2026 15:25
The auto_local_subnet bool was replaced by the LocalSubnetTarget sentinel;
update the comments and two test names that still referenced the old field.
No behavior change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… Pair routing

- Revert Part C: the cloud Pair routing passed discovered_device.id where
  fleetnode/pairing.PairDevice expects a device.id, and fleet-node discoveries
  have no device row to assign (the cloud can't dial them to create one, and
  reports carry no MAC). Restore the safe refusal; inline pairing of fleet-node
  miners needs the node-side device-creation flow (follow-up).
- Gate the fleet-node fan-out to the automatic "Scan your network" action (nmap
  target == the cloud's own local subnet) so a manual/explicit nmap scan no
  longer also sweeps every connected node's LAN. Adds Service.IsLocalSubnetScan.
- Cap the agent's local-subnet auto-scan to a total host budget
  (discoverylimits.MaxScanTargets) across all detected subnets, not just /22 per
  subnet, so a multi-homed node can't 8x-oversize one command.
- Map caller cancellation to CodeCanceled in RunOnNode and in Discover (was
  Internal / silent nil), and suppress the fan-out WARN when the stream is
  already cancelled (expected operator disconnect).
- Fix the LocalSubnetTarget comment: IPv4 private space is RFC1918, not RFC4193.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ankitgoswami ankitgoswami changed the title feat(discovery): fan out nmap discovery to fleet nodes and pair them inline feat(discovery): fan out "Scan your network" nmap to connected fleet nodes Jun 2, 2026
ankitgoswami and others added 8 commits June 2, 2026 15:54
Replace the agent's bespoke per-NIC private-subnet enumeration with
networking.GetLocalNetworkInfo() — the same primary-interface detection the
cloud/combined Discover path already uses, so the node scans the same way
combined mode does. Intentionally less robust (picks one interface, no
virtual-NIC filtering, no /22 narrowing or total-host cap); parity is enough
for now and hardening is a follow-up. Removes selectLocalPrivateSubnets and
its tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
After the combined-mode parity change, localsubnet.go held only errNoLocalSubnet
and a small detectLocalSubnets wrapper used solely by buildNmapOptions. Move both
next to their caller in nmap.go and drop the now-empty file.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… dedup key, tests)

From the multi-agent code review (no correctness defects found):

- Cap fan-out concurrency: a maxConcurrentFleetNodeScans (32) semaphore bounds
  in-flight per-node ControlStream commands so a large fleet can't spawn an
  unbounded number of slots; goroutines acquire via a streamCtx-aware select so
  they exit on operator disconnect.
- Extract pairing.DeviceDedupKey and use it in both dedupeDiscoverResponses and
  the Discover handler's cross-source send closure, so the dedup identity stays
  in lockstep instead of being re-derived.
- Add service-level tests: RunOnNode treats an onBatch error as terminal,
  RunOnNode returns CodeDeadlineExceeded when the agent never acks, and
  ConfirmedConnectedNodeIDs propagates a ListFleetNodes error.
- Minor comment polish (detectLocalSubnets doc, DiscoverOnFleetNode godoc).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Addresses the residual findings from the multi-agent review:

- #4/#7: DiscoverWithNmap now reports whether the target is the cloud's own
  local subnet (resolveNmapTargets already computed it); the Discover handler
  gates fan-out on that return value instead of a second GetLocalNetworkInfo
  call. Removes IsLocalSubnetScan and the duplicate ip-route fork.
- #1: extract the mutex-guarded dedup/serialize send into dedupForwarder and
  unit-test it (cross-source dedup, ip:port fallback, all-duplicate drop,
  send-error cancel, concurrency under -race).
- #3: bound the fan-out with fleetNodeFanOutTimeout (5m) so one wedged node
  can't extend the operator's wait to the full per-node 12m budget.
- #8: discovery.Service depends on a narrow nodeRegistry interface instead of
  *control.Registry (mirrors the nodeLister idiom; eases testing).
- #13: control.Registry.mu is now sync.RWMutex; ConnectedFleetNodeIDs and
  ReportScopeFor take read locks.
- #12: test that the LocalSubnetTarget sentinel dispatches through RunOnNode
  (the shared single-node + fan-out path).

Plus service-level tests for the new DiscoverWithNmap/resolveNmapTargets bool.
No behavior change for cloud-only discovery or existing single-node discovery.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Compress the fan-out comments and const docs to the load-bearing "why" —
no behavior change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Peel back machinery added reactively to low-priority review findings, none of
which earns its complexity at the actual fleet scale:

- Remove fleetNodeFanOutTimeout: it was a third timeout layer (atop the agent's
  10m and RunOnNode's 12m) with an arbitrary value that second-guessed the
  agent's own budget. Discover streams incrementally, so a wedged node only
  delays the stream close, not results. RunOnNode's per-node timeout is the
  single, sufficient bound.
- Remove the maxConcurrentFleetNodeScans semaphore: inert at typical fleet
  sizes (tens of nodes); the per-node timeout already bounds each node.
- Revert control.Registry to sync.Mutex: the RWMutex was a P3 micro-opt for
  contention that doesn't exist at this scale and added lock-discipline risk.

No behavior change for the realistic path; fan-out goroutines run on streamCtx
and remain bounded by RunOnNode's per-node timeout.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ConnectedFleetNodeIDs iterates a map (order unspecified), so assert the
ConfirmedConnectedNodeIDs result with ElementsMatch rather than Equal to match
the contract and avoid future flakiness. Also tighten the RunOnNode handleEvent
comment to the non-obvious terminal-with-nil-err semantic.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
PairingService.Discover fan-out enumerated all confirmed connected fleet
nodes and issued discovery commands over their control streams while gated
only by miner:pair. The single-node DiscoverOnFleetNode path requires
fleetnode:manage, so this was a weaker authorization path to the same
fleet-node command surface.

Fan-out now also requires fleetnode:manage via a non-failing check, so
miner:pair-only callers still get cloud-only discovery but cannot drive
discovery commands on fleet nodes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ankitgoswami ankitgoswami marked this pull request as ready for review June 3, 2026 18:56
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ada7392f4f

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread server/cmd/fleetnode/nmap.go
The local-subnet sentinel passed detectLocalSubnets() output straight to
nmap. That detection reuses primary-interface logic that doesn't filter for
RFC1918, so a node whose primary NIC is public could have that subnet
scanned by an automatic "Scan your network". The server drops non-private
reports, but only after the node has already sent the scan traffic.

Validate each detected prefix is a private IPv4 subnet before scanning and
fail AGENT_INCAPABLE otherwise, so a fan-out skips such a node.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 35dc9be7d6

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread server/cmd/fleetnode/nmap.go
The local-subnet sentinel validated detected subnets were private but not
their breadth, so a node whose primary NIC has a prefix broader than the
/22 nmap limit (e.g. 10.0.0.0/16, returned directly from the OS interface
mask) could be made to sweep tens of thousands of hosts per fan-out,
defeating the cap that applies to operator-supplied targets.

validateLocalSubnetTarget now also rejects prefixes broader than
nmaptarget.MinIPv4PrefixBits (AGENT_INCAPABLE), mirroring nmaptarget.Validate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c87b702258

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread server/internal/handlers/pairing/handler.go
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants