ntpd: configured server peer on directly-connected subnet never reaches sync; pool-resolved peers on the same daemon work fine

**Important notices**

Before you add a new report, we ask you kindly to acknowledge the following:

- [x] I have read the contributing guidelines at https://github.com/opnsense/core/blob/master/CONTRIBUTING.md
- [x] I am convinced that my issue is new after having checked both open and closed issues at https://github.com/opnsense/core/issues?q=is%3Aissue

**Describe the bug**

A manually configured `server` entry pointing at a stratum-1 NTP server on a directly-connected LAN subnet never reaches sync. The same ntpd instance, with the same `restrict default` flags, successfully syncs to internet pool servers. `ntpdate -q` against the same server from the same shell works perfectly. tcpdump confirms NTP query/response packets flow correctly on the wire, sub-millisecond RTT. `sockstat` confirms ntpd is bound to the correct interface IP. But the configured association never advances out of `mobilize` state — reach stays 0, peer mode (`pmode`) never advances from 0, and `flash` shows only symptomatic bits (no actual rejection bits).

The signature: configured (`conf=yes`) peers stuck at `mobilize`, while pool-resolved (`conf=no`) peers on the same daemon reach `reachable` normally. Pool-resolved peers spawn ephemeral associations through a different code path and are not affected.

Last known working version: not known. The local NTP server was newly placed on this VLAN in this report's testing window; no prior version of OPNsense has been observed to handle this configuration successfully on this system.

**To Reproduce**

Steps to reproduce the behavior:

1. On an OPNsense router with multiple VLAN interfaces (and ntpd bound to multiple `interface listen` IPs by virtue of those VLANs), have a local NTP server reachable on a directly-connected LAN subnet (in this report: a stratum-1 GPS-disciplined server on VLAN 42 at 172.16.42.7; OPNsense's interface IP on that VLAN is 172.16.42.1).
2. Go to **Services → Network Time → General**.
3. Under Time Servers, add `172.16.42.7` (literal IP, no DNS) with `iburst` and `prefer` checked.
4. Save and Apply.
5. Wait 5–10 minutes, then run `ntpq -c as` and `ntpq -c "rv <assid>"` for the new association.
6. Observe: configured peer remains stuck at `reach=0` / `condition=reject` / `last_event=mobilize`, while default pool peers progress to `last_event=reachable`.
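
The check in steps 5 and 6 can be done mechanically. The following is a minimal sketch, assuming the nine-column `ntpq -c as` layout shown under Relevant log files; `parse_associations` and `stuck_configured` are hypothetical helper names, not part of ntpq.

```python
def parse_associations(text):
    """Parse `ntpq -c as` output into a list of dicts, one per association."""
    columns = ["ind", "assid", "status", "conf", "reach",
               "auth", "condition", "last_event", "cnt"]
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a numeric index; this skips the header and separator.
        if len(fields) < len(columns) or not fields[0].isdigit():
            continue
        rows.append(dict(zip(columns, fields)))
    return rows


def stuck_configured(rows):
    """Return assids of configured associations whose last event is still mobilize."""
    return [r["assid"] for r in rows
            if r["conf"] == "yes" and r["last_event"] == "mobilize"]


# Two rows copied from the log section of this report (annotations removed).
sample = """\
ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1 26023  8011   yes    no  none    reject    mobilize  1
  5 26027  1024    no   yes  none    reject   reachable  2
"""
print(stuck_configured(parse_associations(sample)))
```

On a live system the output of `ntpq -c as` would be piped in instead of `sample`. The pool driver placeholders are also `conf=yes` / `mobilize`, so they will be listed alongside the configured server.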

**Expected behavior**

ntpd should query 172.16.42.7 (mode 3 client), receive mode 4 server responses, advance the association, build reach to 377 (octal), and mark the peer selectable, the same way it handles pool-resolved peers on this same daemon.
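
For readers unfamiliar with the `reach` value: it is an 8-bit shift register, shifted left once per poll, with the low bit set when a valid response is received. The sketch below is an illustration of that arithmetic only (the function name `update_reach` is made up here); it is not ntpd's code.

```python
def update_reach(reach, got_response):
    """Shift the 8-bit reach register left once per poll; set the low bit on a valid response."""
    reach = (reach << 1) & 0xFF
    if got_response:
        reach |= 1
    return reach


reach = 0
for _ in range(8):                    # eight consecutive answered polls
    reach = update_reach(reach, True)
print(f"{reach:03o}")                 # 377

reach = update_reach(reach, False)    # one missed poll
print(f"{reach:03o}")                 # 376
```

The association in this report stays at `reach=000`, meaning no poll has ever been counted as answered.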

**Describe alternatives you considered**

- **`ntpdate` from cron**: works because `ntpdate` uses an ephemeral source port (the only mechanical difference between the two clients to this server — `ntpd` uses UDP/123 as source per NTP symmetric port behavior). Coarse-grained, but functional. Acceptable workaround until ntpd is fixed.
- **Switching to chrony**: not available — neither as plugin nor in the FreeBSD pkg repos accessible from OPNsense in 26.1.6. (`os-chrony` was a community plugin at one point but is not in the current set.)
- **Direct `pkg install` of chrony or openntpd**: rejected because installing packages outside OPNsense's plugin system risks destabilization on firmware updates, which is unacceptable for an appliance role.
- **Removing the configured peer and relying on pool peers**: this works for general clock discipline (the router's clock is being kept reasonably accurate from internet pools right now), but loses the benefit of the local stratum-1 GPS-disciplined server, which was the entire motivation for adding the entry.

**Screenshots**

Not applicable — this issue manifests in CLI/log output rather than the GUI. All relevant data is in the log section below.

**Relevant log files**

`ntpq -c as` (after several poll cycles past iburst):

```
ind assid status  conf reach auth condition  last_event cnt
  1 26023  8011   yes    no  none    reject    mobilize  1   <-- 172.16.42.7 (configured server)
  2 26024  8811   yes  none  none    reject    mobilize  1   <-- pool driver
  3 26025  8811   yes  none  none    reject    mobilize  1   <-- pool driver
  4 26026  8811   yes  none  none    reject    mobilize  1   <-- pool driver
  5 26027  1024    no   yes  none    reject   reachable  2   <-- pool-resolved
  6 26028  1014    no   yes  none    reject   reachable  1
 ...                                                          (28 pool-resolved peers total)
 32 26054  1014    no   yes  none    reject   reachable  1
```

All 4 `conf=yes` associations (the configured server and the 3 pool driver placeholders) remain at `mobilize`; all 28 pool-resolved (`conf=no`) peers reach `reachable`.
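
The `status` column can be unpacked to confirm what the table shows. This is a sketch that assumes the peer status word layout from the ntp.org event and status decode documentation (5 flag bits, 3 select bits, 4-bit event count, 4-bit last event code). The label strings are written from memory and should be treated as assumptions; only `sel_reject`, `mobilize`, and `reachable` are confirmed by the output in this report.

```python
# Flag bits in the high byte of the peer status word (assumed per ntp.org docs).
PEER_FLAGS = {0x80: "config", 0x40: "authenb", 0x20: "auth", 0x10: "reach", 0x08: "bcst"}
SELECT = ["sel_reject", "sel_falsetick", "sel_excess", "sel_outlier",
          "sel_candidate", "sel_backup", "sel_sys.peer", "sel_pps.peer"]
EVENTS = {1: "mobilize", 2: "demobilize", 3: "unreachable", 4: "reachable",
          5: "restart", 6: "no_reply", 7: "rate_exceeded", 8: "access_denied"}


def decode_peer_status(word):
    """Split a 16-bit peer status word into flags, select code, event count, last event."""
    high = (word >> 8) & 0xFF
    return {
        "flags": [name for bit, name in PEER_FLAGS.items() if high & bit],
        "select": SELECT[high & 0x07],
        "count": (word >> 4) & 0x0F,
        "event": EVENTS.get(word & 0x0F, "unknown"),
    }


print(decode_peer_status(0x8011))  # configured server, assid 26023
print(decode_peer_status(0x1014))  # a pool-resolved peer
```

Under these assumptions, `8011` decodes to a persistent association with no `reach` flag and a single `mobilize` event, while `1014` decodes to a non-persistent association with the `reach` flag set and `reachable` as the last event.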

`ntpq -c "rv 26023"` (the configured 172.16.42.7 association):

```
associd=26023 status=8011 conf, sel_reject, 1 event, mobilize,
srcadr=ntp-server.example.com, srcport=123, dstadr=172.16.42.1,
dstport=123, leap=11, stratum=16, precision=-24, rootdelay=0.000,
rootdisp=0.000, refid=STEP, reftime=(no time),
rec=eda49c0b.3944a940, reach=000, unreach=6, hmode=3, pmode=0,
hpoll=6, ppoll=9, headway=40,
flash=1600 peer_stratum, peer_dist, peer_unreach, keyid=0,
offset=+0.000, delay=0.000, dispersion=15937.500, jitter=0.000
```

`flash=0x1600` contains only symptomatic bits (`peer_stratum`, `peer_dist`, `peer_unreach`). **No actual rejection bits set:** no TEST2 (`bogus` / origin-timestamp-mismatch, 0x0002), no TEST4 (`access denied`, 0x0008), no TEST5 (`auth failure`, 0x0010), no TEST6 (`bad synch or stratum`, 0x0020). The bits that are lit are the kind that get marked because no valid response has been processed, not because a response was specifically rejected.
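
The bit arithmetic behind that reading can be reproduced with a few lines. This sketch assumes the TEST1 to TEST13 bit assignments from the ntp.org decode documentation; the short names are the ones ntpq prints, as far as I recall them, and `decode_flash` is a hypothetical helper.

```python
# (bit mask, short name) for TEST1..TEST13, lowest bit first (assumed per ntp.org docs).
FLASH_BITS = [
    (0x0001, "pkt_dup"),      (0x0002, "pkt_bogus"),   (0x0004, "pkt_unsync"),
    (0x0008, "pkt_denied"),   (0x0010, "pkt_auth"),    (0x0020, "pkt_stratum"),
    (0x0040, "pkt_header"),   (0x0080, "pkt_autokey"), (0x0100, "pkt_crypto"),
    (0x0200, "peer_stratum"), (0x0400, "peer_dist"),   (0x0800, "peer_loop"),
    (0x1000, "peer_unreach"),
]


def decode_flash(value):
    """Return the names of all flash bits set in value, lowest bit first."""
    return [name for mask, name in FLASH_BITS if value & mask]


flash = int("1600", 16)            # ntpq prints the flash word in hex
print(decode_flash(flash))         # ['peer_stratum', 'peer_dist', 'peer_unreach']
print(bool(flash & 0x01FF))        # False: none of the packet tests TEST1..TEST9 are set
```

The three names match what ntpq printed next to `flash=1600`, and the mask `0x01FF` covers the per-packet tests, none of which are set.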

`pmode=0` confirms ntpd has never advanced this association from its initial state.

`tcpdump -i vlan03 -nn 'host 172.16.42.7 and udp port 123'` (vlan03 is the OS-level interface for VLAN 42):

```
15:53:26.860080 IP 172.16.42.1.123 > 172.16.42.7.123: NTPv4, Client, length 48
15:53:26.860732 IP 172.16.42.7.123 > 172.16.42.1.123: NTPv4, Server, length 48
15:54:24.531834 IP 172.16.42.1.123 > 172.16.42.7.123: NTPv4, Client, length 48
15:54:24.532676 IP 172.16.42.7.123 > 172.16.42.1.123: NTPv4, Server, length 48
```

Standard 48-byte NTPv4 packets. Outbound mode 3 client query from 172.16.42.1:123, inbound mode 4 server response from 172.16.42.7:123. Sub-millisecond RTT. The wire is healthy.

`tcpdump -i pflog0 -nn 'host 172.16.42.7'` shows only outbound queries logged (matching a `pass log` rule for self-traffic). Inbound responses are not in pflog0 — they are matching pf state from outbound queries and pass without re-evaluation. **pf is not blocking the responses.**

`sockstat -4 -P udp | grep 123`:

```
root  ntpd 4933 20  udp4  172.16.48.1:123    *:*
root  ntpd 4933 23  udp4  127.0.0.1:123      *:*
root  ntpd 4933 24  udp4  172.16.10.1:123    *:*
root  ntpd 4933 25  udp4  172.16.42.1:123    *:*
root  ntpd 4933 28  udp4  172.16.50.1:123    *:*
```

fd=25 is the socket bound to `172.16.42.1:123`, exactly the address inbound responses are destined to. tcpdump shows the responses arriving on that interface, yet association 26023 never advances.

`ntpdate -q 172.16.42.7` from the same OPNsense shell:

```
server 172.16.42.7, stratum 1, offset -0.000124, delay 0.02577
 5 May 16:27:04 ntpdate[71671]: adjust time server 172.16.42.7 offset -0.000124 sec
```

The mechanical difference between `ntpdate` and `ntpd` as clients to this server: **`ntpdate` uses an ephemeral source port; `ntpd` uses UDP/123 as source (NTP symmetric port behavior).**
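
To test the source-port difference in isolation, a bare client query can be sent from an ephemeral port. The following is a rough sketch, not how `ntpdate` is implemented: it sends a 48-byte mode 3 packet with all other fields zeroed and reads back the header of the reply. `probe` and its parameters are names invented for this example.

```python
import socket


def build_client_packet(version=4):
    """48-byte NTP request: LI=0, VN=version, Mode=3 (client), all other fields zero."""
    first_byte = (0 << 6) | (version << 3) | 3
    return bytes([first_byte]) + bytes(47)


def parse_header(packet):
    """Return (leap, version, mode, stratum) from the first two bytes of an NTP packet."""
    leap = packet[0] >> 6
    version = (packet[0] >> 3) & 0x07
    mode = packet[0] & 0x07
    return leap, version, mode, packet[1]


def probe(server, source_ip="", source_port=0, timeout=2.0):
    """Send one client query; source_port=0 lets the kernel pick an ephemeral port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.bind((source_ip, source_port))
        sock.sendto(build_client_packet(), (server, 123))
        data, _ = sock.recvfrom(512)
        return parse_header(data)


# Example call on the router (not executed here):
#   probe("172.16.42.7", source_ip="172.16.42.1")
```

A reply with mode 4 and stratum 1 would match the `ntpdate -q` result above. Passing `source_port=123` is expected to fail with "address already in use" while ntpd holds that socket, so the port-123 case cannot be reproduced this way without stopping ntpd first.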

`/var/etc/ntpd.conf` (relevant lines):

```
server 172.16.42.7 iburst maxpoll 9 prefer
pool 0.pool.ntp.org maxpoll 9
pool 1.pool.ntp.org maxpoll 9
pool 2.pool.ntp.org maxpoll 9
```

Eliminated as cause during diagnosis:

- Network reachability: `ping` works from clients and from OPNsense shell.
- DNS: server entry is the literal IP, not a hostname.
- Authentication: `keyid=0` on the association.
- NAT: `pfctl -sn` shows no rules touching port 123.
- Firewall blocks: pflog0 shows no inbound blocks; pf state-tracking lets responses through.
- Socket binding: ntpd is bound to 172.16.42.1:123 (sockstat confirms).
- Config syntax: directive is `server`, not `peer`.
- Restrict default flags: OPNsense defaults (`kod limited nomodify noquery nopeer notrap`) — none should affect server responses to client queries; `nopeer` would only affect new symmetric-mode association attempts and shouldn't apply to configured `server` peers.
- ntpd not crashing or restarting (single PID stable across observation; no respawn).
- `iburst` toggle: no effect.

**Additional context**

The configured-vs-pool split is the smoking gun: same ntpd, same restrict, same firewall, same wire path, yet only the configured `server` peer fails to advance past `mobilize`. Pool-resolved peers on the same daemon work normally.

Hypothesis: the failure is specific to ntpd's source-interface selection or association-matching logic for configured `server` peers on a directly-connected subnet, when ntpd has multiple `interface listen` bindings (one per VLAN interface IP). Pool-resolved ephemeral associations follow a different internal code path and don't trip the same condition.

**Environment**

OPNsense 26.1.6 (amd64).
Default ntpd (the ntp.org reference implementation) via Services → Network Time, multi-VLAN setup with ntpd bound to multiple `interface listen` addresses (172.16.48.1, 172.16.10.1, 172.16.42.1, 172.16.50.1, plus IPv6 equivalents). Multi-WAN gateway configuration (irrelevant: the issue affects same-VLAN traffic with no WAN involvement).

