ntpd: configured server peer on directly-connected subnet never reaches sync; pool-resolved peers on the same daemon work fine #10263

@lloydaviation

Describe the bug

A manually configured server entry pointing at a stratum-1 NTP server on a directly-connected LAN subnet never reaches sync. The same ntpd instance, with the same restrict default flags, syncs successfully to internet pool servers, and ntpdate -q against the same server from the same shell works. tcpdump confirms that NTP queries and responses flow correctly on the wire with sub-millisecond RTT, and sockstat confirms ntpd is bound to the correct interface IP. Even so, the configured association never advances out of the mobilize state: reach stays 0, the peer mode (pmode) stays 0, and flash shows only symptomatic bits (no actual rejection bits).

The signature: configured (conf=yes) peers stay stuck at mobilize, while pool-resolved (conf=no) peers on the same daemon become reachable normally. Pool-resolved peers spawn ephemeral associations through a different code path and are not affected.

Last known working version: not known. The local NTP server was newly placed on this VLAN in this report's testing window; no prior version of OPNsense has been observed to handle this configuration successfully on this system.

To Reproduce

Steps to reproduce the behavior:

  1. On an OPNsense router with multiple VLAN interfaces (and ntpd bound to multiple interface listen IPs by virtue of those VLANs), have a local NTP server reachable on a directly-connected LAN subnet (in this report: a stratum-1 GPS-disciplined server on VLAN 42 at 172.16.42.7; OPNsense's interface IP on that VLAN is 172.16.42.1).
  2. Go to Services → Network Time → General.
  3. Under Time Servers, add 172.16.42.7 (literal IP, no DNS) with iburst and prefer checked.
  4. Save and Apply.
  5. Wait 5–10 minutes, then run ntpq -c as and ntpq -c "rv <assid>" for the new association.
  6. Observe: configured peer remains stuck at reach=0 / condition=reject / last_event=mobilize, while default pool peers progress to last_event=reachable.
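The check in steps 5 and 6 can be scripted. The following is a minimal sketch, assuming the `ntpq -c as` column layout shown in the log section of this report; the function names `parse_associations` and `stuck_configured` are hypothetical helpers, not part of any ntp tooling.

```python
def parse_associations(text):
    """Parse `ntpq -c as` rows into dicts; header and non-data lines are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a numeric index and carry at least 9 columns.
        if len(fields) < 9 or not fields[0].isdigit():
            continue
        rows.append({
            "assid": int(fields[1]),
            "status": fields[2],
            "conf": fields[3],
            "reach": fields[4],
            "auth": fields[5],
            "condition": fields[6],
            "last_event": fields[7],
            "cnt": int(fields[8]),
        })
    return rows


def stuck_configured(rows):
    """Return configured associations whose last event is still mobilize."""
    return [r for r in rows if r["conf"] == "yes" and r["last_event"] == "mobilize"]


# Two rows copied from this report, used as sample input.
sample = """\
ind assid status  conf reach auth condition  last_event cnt
  1 26023  8011   yes    no  none    reject    mobilize  1
  5 26027  1024    no   yes  none    reject   reachable  2
"""
print([r["assid"] for r in stuck_configured(parse_associations(sample))])
```

On a live system the input would come from the output of `ntpq -c as` captured to a string instead of the embedded sample.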

Expected behavior

ntpd should query 172.16.42.7 (mode 3 client), receive mode 4 server responses, advance the association, build reach to 377 (octal), and mark the peer selectable, the same way it handles pool-resolved peers on this same daemon.
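For reference, the reach value is an 8-bit shift register, which is why a healthy peer ends up at 377 octal. The sketch below is a simplified model of that register (one shift per poll, low bit set when a valid reply was processed), not ntpd's actual code; `update_reach` is a hypothetical name.

```python
def update_reach(reach, got_reply):
    """Shift the 8-bit reach register left once per poll; set bit 0 on a valid reply."""
    reach = (reach << 1) & 0xFF
    if got_reply:
        reach |= 1
    return reach


reach = 0
for _ in range(8):
    reach = update_reach(reach, True)
print(format(reach, "03o"))  # 377 after eight consecutive valid replies
```

With no valid reply ever processed, as in this report, the register stays at 000 regardless of how many polls are sent.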

Describe alternatives you considered

  • ntpdate from cron: works because ntpdate uses an ephemeral source port (the only mechanical difference between the two clients to this server — ntpd uses UDP/123 as source per NTP symmetric port behavior). Coarse-grained, but functional. Acceptable workaround until ntpd is fixed.
  • Switching to chrony: not available — neither as plugin nor in the FreeBSD pkg repos accessible from OPNsense in 26.1.6. (os-chrony was a community plugin at one point but is not in the current set.)
  • Direct pkg install of chrony or openntpd: rejected because installing packages outside OPNsense's plugin system risks destabilization on firmware updates, which is unacceptable for an appliance role.
  • Removing the configured peer and relying on pool peers: this works for general clock discipline (the router's clock is being kept reasonably accurate from internet pools right now), but loses the benefit of the local stratum-1 GPS-disciplined server, which was the entire motivation for adding the entry.

Screenshots

Not applicable — this issue manifests in CLI/log output rather than the GUI. All relevant data is in the log section below.

Relevant log files

ntpq -c as (after several poll cycles past iburst):

ind assid status  conf reach auth condition  last_event cnt
  1 26023  8011   yes    no  none    reject    mobilize  1   <-- 172.16.42.7 (configured server)
  2 26024  8811   yes  none  none    reject    mobilize  1   <-- pool driver
  3 26025  8811   yes  none  none    reject    mobilize  1   <-- pool driver
  4 26026  8811   yes  none  none    reject    mobilize  1   <-- pool driver
  5 26027  1024    no   yes  none    reject   reachable  2   <-- pool-resolved
  6 26028  1014    no   yes  none    reject   reachable  1
 ...                                                          (28 pool-resolved peers total)
 32 26054  1014    no   yes  none    reject   reachable  1

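All 4 conf=yes associations sit at mobilize; all 28 pool-resolved (conf=no) peers reach reachable. Of the four, only assid 26023 is a real server association: the three pool driver entries are placeholders that spawn the conf=no associations and are not expected to become reachable themselves.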
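The status column can be decoded by hand. The sketch below assumes the peer status word layout described in the ntpq documentation (5 flag bits, 3 selection bits, 4-bit event count, 4-bit event code); the tables are abbreviated and `decode_status` is a hypothetical helper.

```python
# Flag bits in the high byte of the peer status word (names as in the ntpq docs).
FLAGS = {0x80: "config", 0x40: "authenb", 0x20: "auth", 0x10: "reach", 0x08: "bcast"}
SELECT = ["sel_reject", "sel_falsetick", "sel_excess", "sel_outlier",
          "sel_candidate", "sel_backup", "sys.peer", "pps.peer"]
# Partial event table: only the codes that appear in this report and their neighbours.
EVENTS = {1: "mobilize", 2: "demobilize", 3: "unreachable", 4: "reachable"}


def decode_status(word):
    """Split a 16-bit peer status word into flags, selection, event count and event."""
    high = word >> 8
    return {
        "flags": [name for bit, name in FLAGS.items() if high & bit],
        "select": SELECT[high & 0x07],
        "count": (word >> 4) & 0x0F,
        "event": EVENTS.get(word & 0x0F, "other"),
    }


for status in (0x8011, 0x8811, 0x1014, 0x1024):
    print(format(status, "04x"), decode_status(status))
```

Under these assumptions 8011 decodes to a configured, unreached association whose only event is mobilize, while 1014 and 1024 carry the reach flag with reachable as the last event.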

ntpq -c "rv 26023" (the configured 172.16.42.7 association):

associd=26023 status=8011 conf, sel_reject, 1 event, mobilize,
srcadr=ntp-server.example.com, srcport=123, dstadr=172.16.42.1,
dstport=123, leap=11, stratum=16, precision=-24, rootdelay=0.000,
rootdisp=0.000, refid=STEP, reftime=(no time),
rec=eda49c0b.3944a940, reach=000, unreach=6, hmode=3, pmode=0,
hpoll=6, ppoll=9, headway=40,
flash=1600 peer_stratum, peer_dist, peer_unreach, keyid=0,
offset=+0.000, delay=0.000, dispersion=15937.500, jitter=0.000

flash=0x1600 contains only symptomatic bits (peer_stratum, peer_dist, peer_unreach). No actual rejection bits set: no TEST2 (bogus / origin-timestamp-mismatch, 0x0002), no TEST4 (access denied, 0x0008), no TEST5 (auth failure, 0x0010), no TEST6 (bad synch or stratum, 0x0020). The bits that are lit are the kind that get marked because no valid response has been processed, not because a response was specifically rejected.
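To make the flash reading reproducible, here is a small decoder. It is a sketch that assumes the TEST1 to TEST13 bit assignments and short names listed in the ntpq documentation; `decode_flash` is a hypothetical helper name.

```python
# (bit mask, short name) for TEST1..TEST13, in ascending bit order.
FLASH_BITS = [
    (0x0001, "pkt_dup"), (0x0002, "pkt_bogus"), (0x0004, "pkt_unsync"),
    (0x0008, "pkt_denied"), (0x0010, "pkt_auth"), (0x0020, "pkt_stratum"),
    (0x0040, "pkt_header"), (0x0080, "pkt_autokey"), (0x0100, "pkt_crypto"),
    (0x0200, "peer_stratum"), (0x0400, "peer_dist"), (0x0800, "peer_loop"),
    (0x1000, "peer_unreach"),
]


def decode_flash(value):
    """Return the names of all flash test bits set in value."""
    return [name for bit, name in FLASH_BITS if value & bit]


print(decode_flash(0x1600))  # ['peer_stratum', 'peer_dist', 'peer_unreach']
```

None of the pkt_* bits (0x0001 through 0x0100) are set in 0x1600, which matches the reading that no received packet was explicitly rejected.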

pmode=0 confirms ntpd has never advanced this association from its initial state.

tcpdump -i vlan03 -nn 'host 172.16.42.7 and udp port 123' (vlan03 is the OS-level interface for VLAN 42):

15:53:26.860080 IP 172.16.42.1.123 > 172.16.42.7.123: NTPv4, Client, length 48
15:53:26.860732 IP 172.16.42.7.123 > 172.16.42.1.123: NTPv4, Server, length 48
15:54:24.531834 IP 172.16.42.1.123 > 172.16.42.7.123: NTPv4, Client, length 48
15:54:24.532676 IP 172.16.42.7.123 > 172.16.42.1.123: NTPv4, Server, length 48

Standard 48-byte NTPv4 packets. Outbound mode 3 client query from 172.16.42.1:123, inbound mode 4 server response from 172.16.42.7:123. Sub-millisecond RTT. The wire is healthy.
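The RTT figure can be checked directly from the capture. This sketch pairs each Client line with the following Server line and subtracts the timestamps; it assumes the tcpdump line format shown above, and `round_trips_ms` is a hypothetical helper.

```python
from datetime import datetime


def parse_ts(line):
    """Parse the leading HH:MM:SS.ffffff timestamp of a tcpdump line."""
    return datetime.strptime(line.split()[0], "%H:%M:%S.%f")


def round_trips_ms(lines):
    """Pair each Client query with the next Server reply; return RTTs in milliseconds."""
    rtts, pending = [], None
    for line in lines:
        if "Client" in line:
            pending = parse_ts(line)
        elif "Server" in line and pending is not None:
            rtts.append((parse_ts(line) - pending).total_seconds() * 1000)
            pending = None
    return rtts


capture = [
    "15:53:26.860080 IP 172.16.42.1.123 > 172.16.42.7.123: NTPv4, Client, length 48",
    "15:53:26.860732 IP 172.16.42.7.123 > 172.16.42.1.123: NTPv4, Server, length 48",
    "15:54:24.531834 IP 172.16.42.1.123 > 172.16.42.7.123: NTPv4, Client, length 48",
    "15:54:24.532676 IP 172.16.42.7.123 > 172.16.42.1.123: NTPv4, Server, length 48",
]
print([round(r, 3) for r in round_trips_ms(capture)])  # [0.652, 0.842]
```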

tcpdump -i pflog0 -nn 'host 172.16.42.7' shows only outbound queries logged (matching a pass log rule for self-traffic). Inbound responses are not in pflog0 — they are matching pf state from outbound queries and pass without re-evaluation. pf is not blocking the responses.

sockstat -4 -P udp | grep 123:

root  ntpd 4933 20  udp4  172.16.48.1:123    *:*
root  ntpd 4933 23  udp4  127.0.0.1:123      *:*
root  ntpd 4933 24  udp4  172.16.10.1:123    *:*
root  ntpd 4933 25  udp4  172.16.42.1:123    *:*
root  ntpd 4933 28  udp4  172.16.50.1:123    *:*

fd=25 is the socket bound to 172.16.42.1:123, exactly the address inbound responses are destined to. The responses are addressed to this socket, yet association 26023 never advances.

ntpdate -q 172.16.42.7 from the same OPNsense shell:

server 172.16.42.7, stratum 1, offset -0.000124, delay 0.02577
 5 May 16:27:04 ntpdate[71671]: adjust time server 172.16.42.7 offset -0.000124 sec

The mechanical difference between ntpdate and ntpd as clients to this server: ntpdate uses an ephemeral source port; ntpd uses UDP/123 as source (NTP symmetric port behavior).
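The source-port difference can be isolated with a standalone client. The following is a rough sketch, not a replacement for ntpdate: it builds a bare 48-byte mode 3 packet and reports the mode of the reply. `build_client_query`, `packet_mode`, and `query` are hypothetical names, and the `source_port` parameter is an assumption added for this test.

```python
import socket


def build_client_query(version=4):
    """Return a 48-byte NTP packet with LI=0, the given version, and mode 3 (client)."""
    first_byte = (0 << 6) | (version << 3) | 3
    return bytes([first_byte]) + bytes(47)


def packet_mode(packet):
    """Extract the 3-bit mode field from the first byte of an NTP packet."""
    return packet[0] & 0x07


def query(server, source_port=0, timeout=2.0):
    """Send one query and return the reply's mode; source_port=0 means ephemeral."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.bind(("", source_port))
        sock.sendto(build_client_query(), (server, 123))
        reply, _ = sock.recvfrom(512)
        return packet_mode(reply)
```

Calling `query("172.16.42.7")` uses an ephemeral port like ntpdate and should return 4 for a server reply. Calling it with `source_port=123` needs root and a stopped ntpd, since the port is otherwise already bound.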

/var/etc/ntpd.conf (relevant lines):

server 172.16.42.7 iburst maxpoll 9 prefer
pool 0.pool.ntp.org maxpoll 9
pool 1.pool.ntp.org maxpoll 9
pool 2.pool.ntp.org maxpoll 9

Eliminated as cause during diagnosis:

  • Network reachability: ping works from clients and from OPNsense shell.
  • DNS: server entry is the literal IP, not a hostname.
  • Authentication: keyid=0 on the association.
  • NAT: pfctl -sn shows no rules touching port 123.
  • Firewall blocks: pflog0 shows no inbound blocks; pf state-tracking lets responses through.
  • Socket binding: ntpd is bound to 172.16.42.1:123 (sockstat confirms).
  • Config syntax: directive is server, not peer.
  • Restrict default flags: OPNsense defaults (kod limited nomodify noquery nopeer notrap) — none should affect server responses to client queries; nopeer would only affect new symmetric-mode association attempts and shouldn't apply to configured server peers.
  • ntpd not crashing or restarting (single PID stable across observation; no respawn).
  • iburst toggle: no effect.

Additional context

The configured-vs-pool split is the smoking gun: same ntpd, same restrict, same firewall, same wire path, yet only the configured server peer fails to advance past mobilize. Pool-resolved peers on the same daemon work normally.

Hypothesis: the failure is specific to ntpd's source-interface selection or association-matching logic for configured server peers on a directly-connected subnet, when ntpd has multiple interface listen bindings (one per VLAN interface IP). Pool-resolved ephemeral associations follow a different internal code path and don't trip the same condition.

Environment

OPNsense 26.1.6 (amd64).
Default ntpd (ntp.org reference implementation) via Services → Network Time, multi-VLAN setup with ntpd bound to multiple interface listen addresses (172.16.48.1, 172.16.10.1, 172.16.42.1, 172.16.50.1, plus IPv6 equivalents). Multi-WAN gateway configuration (irrelevant: the issue affects same-VLAN traffic with no WAN involvement).
