
Description
I have been looking over the NTP Pool status page and I noticed that the peak is reported as 145,260 DNS queries per second. Assuming each of these queries returns four NTP server IPs, and that each of those IPs is then queried for the time, I tried to estimate the traffic involved.
Using 114 bytes per query as a deliberately generous estimate (I am aware that the actual NTP request and response will be smaller than this), 145,260 DNS queries per second returning four IPs each works out at 145,260 × 4 × 114 bytes × 8 bits, which is about 0.53 Gbps.
That does not seem like a large amount of traffic for a global free NTP service. Am I misunderstanding the calculation somewhere?
With 3,601 IPv4 NTP servers in the pool running at the lowest bandwidth setting of 512 kbps per server, the total capacity is about 1.8 Gbps, or roughly 3.5 times the total traffic demand. For the 2,084 IPv6 servers, the total capacity is about 1.1 Gbps, or a little over twice the total traffic demand.
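As a rough check of the arithmetic above, here is a minimal Python sketch. Every input is a figure quoted in this post; the assumption that each DNS answer leads to exactly four NTP queries of 114 bytes is mine and is intentionally pessimistic. The function and constant names are just for illustration.

```python
# Back-of-envelope check of the demand and capacity figures quoted above.
PEAK_DNS_QPS = 145_260     # peak DNS queries per second from the status page
IPS_PER_RESPONSE = 4       # assumption: every DNS answer carries four IPs
BYTES_PER_QUERY = 114      # generous per-query size estimate
MIN_SERVER_KBPS = 512      # lowest per-server bandwidth setting


def demand_bps(qps=PEAK_DNS_QPS, ips=IPS_PER_RESPONSE, size=BYTES_PER_QUERY):
    """Estimated NTP traffic in bits per second if every returned IP is queried once."""
    return qps * ips * size * 8


def capacity_bps(servers, kbps=MIN_SERVER_KBPS):
    """Total capacity in bits per second if every server runs at the lowest setting."""
    return servers * kbps * 1000


if __name__ == "__main__":
    demand = demand_bps()
    print(f"demand: {demand / 1e9:.2f} Gbps")
    for label, count in (("IPv4", 3_601), ("IPv6", 2_084)):
        cap = capacity_bps(count)
        print(f"{label}: {cap / 1e9:.2f} Gbps capacity, {cap / demand:.1f}x demand")
```

Since real servers are usually configured above the minimum setting, the true headroom is probably larger than this suggests.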
This suggests that the pool is not short of volunteers. My initial thought was that the wide allowance for round trip time and offset was because there were not enough participants, but it seems this is not the case.
From my own checks, I have found that around 10 percent of servers in the UK pool fall into one of the following categories:
- circular peering, where a pool server uses other pool servers as its upstream sources, which in turn reference other servers in the pool
- high round trip times (over 250 ms)
- offsets of more than 100 ms
- asymmetric routing, where the outbound and return paths have different delays
I had assumed the NTP Pool would filter out such servers, but from my tests I am still quite often receiving them as part of the DNS responses.
I would like to suggest that the allowance rules are reviewed and perhaps tightened. Since the pool currently seems to have far more capacity than required, it should be possible to trim inaccurate or unstable servers without affecting the ability to provide NTP as a free service.
For my own testing, I queried 0.uk.pool.ntp.org, 1.uk.pool.ntp.org, 2.uk.pool.ntp.org, and 3.uk.pool.ntp.org every five minutes. I took all four IP addresses returned each time, so sixteen time servers in total per probe, and queried them for the time. Over 1,440 probes, I sent 1,440 × 16 = 23,040 NTP requests. Out of these, 2,256 responses were off by more than 50 ms, and 1,408 were off by more than 100 ms. This means 9.8 percent of responses were off by more than 50 ms, and 6.1 percent were off by more than 100 ms.
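For anyone who wants to repeat this kind of test, here is a minimal Python sketch of one probe round. It is not the exact tooling behind the numbers above. It assumes the local clock is itself accurate, covers IPv4 only, and uses helper names (`probe`, `probe_round`, `share_over`) that are purely illustrative. The offset and delay come from the standard four-timestamp NTP calculation.

```python
import socket
import struct
import time

NTP_EPOCH_DELTA = 2_208_988_800  # seconds between 1900-01-01 and 1970-01-01
POOL_NAMES = [f"{i}.uk.pool.ntp.org" for i in range(4)]


def ntp_to_unix(seconds, fraction):
    """Convert a 64-bit NTP timestamp (two 32-bit words) to Unix seconds."""
    return seconds - NTP_EPOCH_DELTA + fraction / 2**32


def offset_and_delay(t1, t2, t3, t4):
    """t1 client send, t2 server receive, t3 server send, t4 client receive."""
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


def probe(ip, timeout=2.0):
    """Send one SNTP client request to ip and return (offset, delay) in seconds."""
    packet = b"\x23" + 47 * b"\0"  # LI=0, VN=4, Mode=3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        t1 = time.time()
        sock.sendto(packet, (ip, 123))
        data, _ = sock.recvfrom(512)
        t4 = time.time()
    words = struct.unpack("!12I", data[:48])
    t2 = ntp_to_unix(words[8], words[9])    # receive timestamp
    t3 = ntp_to_unix(words[10], words[11])  # transmit timestamp
    return offset_and_delay(t1, t2, t3, t4)


def probe_round(names=POOL_NAMES):
    """Resolve each pool name and probe every IPv4 address it returns."""
    results = {}
    for name in names:
        for ip in socket.gethostbyname_ex(name)[2]:
            try:
                results[ip] = probe(ip)
            except (OSError, struct.error):  # timeout or malformed reply
                results[ip] = None
    return results


def share_over(offsets, threshold_s):
    """Percentage of offsets whose magnitude exceeds threshold_s."""
    return 100 * sum(abs(o) > threshold_s for o in offsets) / len(offsets)
```

Calling `probe_round()` every five minutes and feeding the collected offsets to `share_over(offsets, 0.050)` and `share_over(offsets, 0.100)` gives percentages comparable to the ones above.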
If the rules were changed so that any server that is out by more than 50 ms is removed immediately (not stepped down slowly), the pool size would fall by less than 10 percent. Servers could then regain their status gradually as they provide correct responses. At present it seems odd that there are strict rules to get into the pool, but once in, a server can drift significantly without being removed from DNS responses unless it is consistently bad for nearly a full day.
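To make the suggestion concrete, here is a small Python sketch of the rule I have in mind. It is not how the pool's monitoring currently works. The 50 ms limit is the one proposed above, while the score scale and the number of good responses needed to be re-listed are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass

OFFSET_LIMIT_S = 0.050  # 50 ms limit proposed above
ACTIVE_SCORE = 10       # hypothetical: consecutive good responses needed to be re-listed


@dataclass
class ServerState:
    score: int = ACTIVE_SCORE

    @property
    def in_dns(self):
        """A server is handed out in DNS only while its score is at the maximum."""
        return self.score >= ACTIVE_SCORE

    def record(self, offset_s):
        """Update the score from one monitoring result (None means no response)."""
        if offset_s is None or abs(offset_s) > OFFSET_LIMIT_S:
            self.score = 0  # immediate removal, no slow step-down
        else:
            self.score = min(self.score + 1, ACTIVE_SCORE)  # gradual recovery
```

With these values, one bad response removes a server from DNS at once, and it returns only after ten consecutive good responses.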
In my sample image, the NTP responses exceeding 50 ms are highlighted in red.
On the right side, you can observe where one pool server drifted, and the servers using the same upstream source drifted along with it, forming the visible arch.
