DNS requests for rewrites and DHCP fail over UDP but works over TCP #7038

Open
4 tasks done
billyzs opened this issue May 30, 2024 · 1 comment

billyzs commented May 30, 2024

Prerequisites

Platform (OS and CPU architecture)

Darwin (aka macOS), AMD64 (aka x86_64)

Installation

Custom package (OpenWrt, HomeAssistant, etc; please mention in the description)

Setup

On a Server, DHCP is handled by AdGuard Home

AdGuard Home version

0.107.50

Action

Hello AdGuard team,

I have an AGH instance configured correctly on TrueNAS SCALE (using their official k3s/docker/helm charts app system, which uses AGH's official Docker images); clients are also configured correctly to use it. Normal browsing works fine. AGH is also configured to do DHCP, which seems to work fine as well. However, for a few domains for which I have configured a DNS rewrite in the web UI, as well as local hostnames that should be resolved via DHCP, lookups fail over UDP but succeed over TCP.

In the examples below, myhost.my-domain.net is a name for which I have a DNS rewrite (myhost.my-domain.net -> 192.168.0.20); the AGH server runs on 192.168.0.20; and lan is the local DHCP domain name that AGH is configured to use (e.g. myhost.lan).

Using macOS dig:

WORKED: Lookup of public domain over TCP

❯ dig @192.168.0.20 google.com +all +keepopen +stats +retry=10 +time=10 +tcp                                                                      took 61ms at 11:30:48

; <<>> DiG 9.10.6 <<>> @192.168.0.20 google.com +all +keepopen +stats +retry=10 +time=10 +tcp
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 35379
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;google.com.                    IN      A

;; ANSWER SECTION:
google.com.             10      IN      A       172.217.168.78

;; Query time: 80 msec
;; SERVER: 192.168.0.20#53(192.168.0.20)
;; WHEN: Thu May 30 11:31:17 CEST 2024
;; MSG SIZE  rcvd: 55

WORKED: Lookup of public domain over UDP

❯ dig @192.168.0.20 google.com +all +keepopen +stats +retry=10 +time=10 +notcp                                                                             

; <<>> DiG 9.10.6 <<>> @192.168.0.20 google.com +all +keepopen +stats +retry=10 +time=10 +notcp
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49823
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;google.com.                    IN      A

;; ANSWER SECTION:
google.com.             10      IN      A       172.217.168.78

;; Query time: 50 msec
;; SERVER: 192.168.0.20#53(192.168.0.20)
;; WHEN: Thu May 30 11:30:48 CEST 2024
;; MSG SIZE  rcvd: 55

WORKED: Lookup of DHCP domain over TCP

❯ dig @192.168.0.20 myhost.lan +all +keepopen +stats +retry=10 +time=10 +tcp                                                                 took 85ms at 11:31:17

; <<>> DiG 9.10.6 <<>> @192.168.0.20 myhost.lan +all +keepopen +stats +retry=10 +time=10 +tcp
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15885
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;myhost.lan.               IN      A

;; ANSWER SECTION:
myhost.lan.        3600    IN      A       192.168.0.20

;; Query time: 486 msec
;; SERVER: 192.168.0.20#53(192.168.0.20)
;; WHEN: Thu May 30 11:33:33 CEST 2024
;; MSG SIZE  rcvd: 49

FAILED: Lookup of DHCP domain over UDP

❯ dig @192.168.0.20 myhost.lan +all +keepopen +stats +retry=10 +time=10 +notcp                                                              took 496ms at 11:33:33

; <<>> DiG 9.10.6 <<>> @192.168.0.20 myhost.lan +all +keepopen +stats +retry=10 +time=10 +notcp
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached

WORKED: Lookup of DNS-rewrite domain over TCP

❯ dig @192.168.0.20 myhost.my-domain.net +all +keepopen +stats +retry=10 +time=10 +tcp                                                               at 11:27:26

; <<>> DiG 9.10.6 <<>> @192.168.0.20 myhost.my-domain.net +all +keepopen +stats +retry=10 +time=10 +tcp
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 61722
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;myhost.my-domain.net.   IN      A

;; ANSWER SECTION:
myhost.my-domain.net. 3600 IN    A       192.168.0.20

;; Query time: 654 msec
;; SERVER: 192.168.0.20#53(192.168.0.20)
;; WHEN: Thu May 30 11:27:35 CEST 2024
;; MSG SIZE  rcvd: 61

FAILED: Lookup of DNS-rewrite domain over UDP

❯ dig @192.168.0.20 myhost.my-domain.net +all +keepopen +stats +retry=10 +time=10 +notcp                                             

; <<>> DiG 9.10.6 <<>> @192.168.0.20 myhost.my-domain.net +all +keepopen +stats +retry=10 +time=10 +notcp
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
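The UDP-versus-TCP comparison above can also be reproduced without dig. The following is a minimal sketch that uses only Python's standard library and sends the same A query over both transports; the function names (`build_query`, `query_udp`, `query_tcp`, `rcode`) are illustrative and not part of AGH or any DNS tool, and the message contains no EDNS OPT record.

```python
import socket
import struct


def build_query(name: str, qid: int = 0x1234) -> bytes:
    """Encode a minimal DNS query for an A record with recursion desired."""
    # Header: ID, flags (RD bit set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    # Root label terminator, then QTYPE=1 (A) and QCLASS=1 (IN)
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from a stream socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed early")
        buf += chunk
    return buf


def query_udp(server: str, name: str, port: int = 53, timeout: float = 5.0) -> bytes:
    """Send one query over UDP; raises socket.timeout if no reply arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, port))
        data, _ = sock.recvfrom(4096)
        return data


def query_tcp(server: str, name: str, port: int = 53, timeout: float = 5.0) -> bytes:
    """Send one query over TCP, where each message has a 2-byte length prefix."""
    msg = build_query(name)
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(struct.pack("!H", len(msg)) + msg)
        (length,) = struct.unpack("!H", recv_exact(sock, 2))
        return recv_exact(sock, length)


def rcode(response: bytes) -> int:
    """Extract the response code (0 = NOERROR, 3 = NXDOMAIN) from a reply."""
    return response[3] & 0x0F
```

With the addresses used in this report, `query_tcp("192.168.0.20", "myhost.lan")` would be expected to return a reply, while `query_udp("192.168.0.20", "myhost.lan")` would raise `socket.timeout` if the behavior shown above is reproduced.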

Expected result

AGH responds correctly over UDP to queries for domain names configured via DNS rewrite and for hostnames registered by AGH's internal DHCP server.

Actual result

Queries over UDP for domain names configured via DNS rewrite, and for hostnames registered by AGH's internal DHCP server, time out.

Additional information and/or screenshots

In AGH's query log, I can see the queries for myhost.my-domain.net over both TCP and UDP. AGH answered them with the DNS rewrite and a very short response time (0.05 ms, vs. several hundred ms for queries forwarded to upstream DNS servers).

To rule out any odd behavior in macOS dig, I also tried macOS nslookup and kdig (Knot DNS, version 3.3.5) and obtained results identical to those above.

Just to be sure, I checked that port 53/udp is open on the AGH server:

❯ sudo nmap -sU -p53 192.168.0.20                                                                                                              took 1s493ms at 12:27:14 by bzs
Starting Nmap 7.95 ( https://nmap.org ) at 2024-05-30 12:27 CEST
Nmap scan report for myhost.lan (192.168.0.20)
Host is up (0.0094s latency).

PORT   STATE         SERVICE
53/udp open|filtered domain
Nmap done: 1 IP address (1 host up) scanned in 0.30 seconds
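Note that nmap reports open|filtered for a UDP port when the probe receives no reply at all, so the result above does not by itself confirm that the port answers. The following is a minimal sketch of the same classification in Python; the function name `classify_udp_probe` is hypothetical, and the detection of a closed port assumes the operating system reports ICMP port-unreachable errors on a connected UDP socket, which may vary by platform.

```python
import socket


def classify_udp_probe(host: str, port: int, payload: bytes, timeout: float = 2.0) -> str:
    """Return 'open', 'closed', or 'open|filtered' for a single UDP probe."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        # connect() on a UDP socket sends nothing, but it allows the OS to
        # surface ICMP errors for this peer as exceptions.
        sock.connect((host, port))
        try:
            sock.send(payload)
            sock.recv(4096)
            return "open"  # any reply proves that something is answering
        except ConnectionRefusedError:
            return "closed"  # an ICMP port-unreachable error came back
        except socket.timeout:
            # Silence: an open port that ignored the payload and a firewall
            # that dropped it look identical from here.
            return "open|filtered"
```

Passing the bytes of a valid DNS query as `payload` makes the probe more meaningful for port 53, since a DNS server is not expected to answer arbitrary data.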

I also posted on a macOS support forum in case there's something odd about how macOS handles DNS.

I tried another DNS tool to bypass the macOS DNS stack altogether and still got strange, mixed results: with the VERBOSE flag enabled (which probably adds some delay), the query sometimes works over UDP; without it, the query fails over UDP. I'm not sure whether the problem lies in AGH's UDP response or in the way macOS handles UDP/DNS. The latter seems unlikely, but then again, the same tool running on Linux works. I filed a bug report with Apple, but in the meantime please keep this issue open until we can definitively rule out AGH and say that it's a bug in macOS. I'm also referencing this issue in my bug report to Apple, so the information here needs to stay accessible during this time.

WORKED (sometimes): macOS, dnslookup, UDP

VERBOSE=1 ./dnslookup myhost.my-domain.net 192.168.0.20                 took 31ms at 13:43:13
dnslookup v1.10.1
2024/05/30 13:43:32 80207#1 [debug] dnsproxy: sending request to 192.168.0.20:53 over udp: A "myhost.my-domain.net."
2024/05/30 13:43:32 80207#1 [debug] bootstrap: dialing 192.168.0.20:53 (1/1)
2024/05/30 13:43:32 80207#1 [debug] bootstrap: connection to 192.168.0.20:53 succeeded in 291.291µs
2024/05/30 13:43:37 80207#1 [debug] dnsproxy: 192.168.0.20:53: response received over udp: "ok"
Server: 192.168.0.20

dnslookup result (elapsed 4.781519334s):
;; opcode: QUERY, status: NXDOMAIN, id: 62079
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0

;; QUESTION SECTION:
;myhost.my-domain.net.        IN       A

;; AUTHORITY SECTION:
ts.net. 300     IN      SOA     ns1.dnsimple.com. admin.dnsimple.com. 1616402063 86400 7200 604800 300

FAILED: macOS, dnslookup, UDP

./dnslookup myhost.my-domain.net 192.168.0.20
dnslookup v1.10.1
2024/05/30 13:54:48 87167#1 [debug] dnsproxy: sending request to 192.168.0.20:53 over udp: A "myhost.my-domain.net."
2024/05/30 13:54:48 87167#1 [debug] bootstrap: dialing 192.168.0.20:53 (1/1)
2024/05/30 13:54:48 87167#1 [debug] bootstrap: connection to 192.168.0.20:53 succeeded in 103.083µs
2024/05/30 13:54:58 87167#1 [debug] bootstrap: dialing 192.168.0.20:53 (1/1)
2024/05/30 13:54:58 87167#1 [debug] bootstrap: connection to 192.168.0.20:53 succeeded in 384.042µs
2024/05/30 13:55:08 87167#1 [error] dnsproxy: 192.168.0.20:53: response received over udp: "exchanging with 192.168.0.20:53 over udp: read udp 192.168.0.149:0->192.168.0.20:53: i/o timeout"
2024/05/30 13:55:08 87167#1 [fatal] Cannot make the DNS request: exchanging with 192.168.0.20:53 over udp: read udp 192.168.0.149:0->192.168.0.20:53: i/o timeout

WORKED: Linux, dnslookup, UDP

root@debian:~/linux-arm64# VERBOSE=1 ./dnslookup myhost.my-domain.net 192.168.0.20
dnslookup v1.10.1
2024/05/30 19:54:14 310388#1 [debug] dnsproxy: sending request to 192.168.0.20:53 over udp: A "myhost.my-domain.net."
2024/05/30 19:54:14 310388#1 [debug] bootstrap: dialing 192.168.0.20:53 (1/1)
2024/05/30 19:54:14 310388#1 [debug] bootstrap: connection to 192.168.0.20:53 succeeded in 593.123µs
2024/05/30 19:54:14 310388#1 [debug] dnsproxy: 192.168.0.20:53: response received over udp: "ok"
Server: 192.168.0.20

dnslookup result (elapsed 5.082376ms):
;; opcode: QUERY, status: NOERROR, id: 60411
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;myhost.my-domain.net.   IN       A

;; ANSWER SECTION:
myhost.my-domain.net.    3600    IN      A       192.168.0.20

billyzs commented May 30, 2024

The next thing to try on my end would be to add a small delay to AGH's response so that it always takes at least a few hundred ms, even for DNS rewrites, and see whether macOS works correctly with the delay added. I'm not familiar with AGH's code, though, so it would be great if someone from the AGH team could post a patch or point to where the delay should be added.
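One way to test the delay hypothesis without patching AGH might be a small UDP relay that sits between the client and AGH and holds each reply back for a fixed time. The following is a minimal sketch under that assumption, not AGH code: the names `relay_once` and `serve` are hypothetical, it handles one query at a time, and it would need to run on a host where UDP lookups against AGH already succeed (such as the Linux machine above), since the relay itself must receive AGH's fast reply.

```python
import socket
import time


def relay_once(listen_sock, upstream, delay, timeout=5.0):
    """Forward one UDP datagram to `upstream` and return its reply to the
    original sender no sooner than `delay` seconds after receiving it."""
    query, client = listen_sock.recvfrom(4096)
    started = time.monotonic()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as out:
        out.settimeout(timeout)
        out.sendto(query, upstream)
        reply, _ = out.recvfrom(4096)
    # Sleep only for whatever part of the delay the upstream did not use up.
    remaining = delay - (time.monotonic() - started)
    if remaining > 0:
        time.sleep(remaining)
    listen_sock.sendto(reply, client)
    return len(reply)


def serve(listen_addr, upstream, delay):
    """Relay queries forever; unanswered queries are dropped silently."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(listen_addr)
        while True:
            try:
                relay_once(sock, upstream, delay)
            except socket.timeout:
                continue  # upstream did not answer; wait for the next query
```

For example, `serve(("0.0.0.0", 5353), ("192.168.0.20", 53), 0.3)` would listen on port 5353 (an arbitrary choice) and add up to 300 ms to each answer; the macOS client could then be pointed at the relay with `dig -p 5353`. If lookups succeed through the relay but fail directly, that would support the idea that response timing matters.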
