Fix GCP Consul DNS resolution with two-layer protection #1430
- Add dpkg-divert to permanently block gce-resolved.conf in Packer image
- Configure Consul as recursive DNS with GCE forwarder in startup script
- Remove routing domains approach (Domains=~consul), which doesn't work reliably
- Restart systemd-resolved after Consul starts to prevent marking DNS as unreachable
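For illustration, the dpkg-divert step in the first bullet might look roughly like the sketch below. The function name `block_gce_resolved`, the `DRY_RUN` switch, and the `.diverted` suffix are assumptions made for this example, not taken from the PR; the drop-in path is the one named in the PR summary.

```shell
# Sketch: divert the GCE drop-in so package installs or upgrades cannot put it back.
# Assumption: the function name, DRY_RUN switch and ".diverted" suffix are hypothetical.
block_gce_resolved() {
  conf="${1:-/etc/systemd/resolved.conf.d/gce-resolved.conf}"
  if [ -n "${DRY_RUN:-}" ]; then
    # Print the command instead of running it (useful without root).
    echo "dpkg-divert --local --rename --divert ${conf}.diverted --add ${conf}"
  else
    # --local  : divert every package's version of the file
    # --rename : move an already-installed copy aside
    # --divert : where dpkg places the file instead
    dpkg-divert --local --rename --divert "${conf}.diverted" --add "$conf"
  fi
}
```

Because systemd-resolved only reads drop-ins ending in `.conf`, a diverted copy with a different suffix should be ignored. Running `DRY_RUN=1 block_gce_resolved` prints the command without touching the system.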
The provision script here is already GCP-specific, so we can merge it without ifs for different clouds. Later, when merging the configuration related to BYOC and multi-cloud support, we can resolve it.
Replace sleep 3 with 10x 1s wait for consul to start
    # Give Consul a moment to start its DNS server on port 8600
    echo "- Waiting for Consul DNS to start on port 8600..."
    for i in {1..10}; do
      if nc -z 127.0.0.1 8600 2>/dev/null; then
nslookup returns non-zero if the lookup fails; that could be used for a more meaningful check.
Suggested change:
    - if nc -z 127.0.0.1 8600 2>/dev/null; then
    + if ! nslookup google.com; then
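One way to express the "retry a real check instead of sleeping" idea from this thread is a small helper that re-runs an arbitrary command until it succeeds. This is only a sketch: `wait_until` is a hypothetical name, and the attempt count and delay simply mirror the "10x 1s" wording of the commit message.

```shell
# wait_until ATTEMPTS DELAY COMMAND [ARGS...]
# Re-run COMMAND until it exits 0, at most ATTEMPTS times, sleeping DELAY
# seconds between tries. Returns 0 on success, 1 if every attempt failed.
# Assumption: helper name and interface are illustrative, not from the PR.
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Only sleep when another attempt will follow.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

With it, the check from the suggestion could be written as `wait_until 10 1 nslookup google.com || echo "- DNS did not come up" >&2`. To probe Consul's DNS port directly, something like `nslookup -port=8600 consul.service.consul 127.0.0.1` could be passed instead, assuming the default `consul` domain is in use.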
    echo "- Restarting systemd-resolved to apply Consul DNS config"
    systemctl restart systemd-resolved
    echo "- Waiting for systemd-resolved to settle"
    sleep 5
Could use nslookup here as well to wait for it to successfully resolve names
Replaced sleep with actual check on systemd-resolved restart
Changes the pinned Ubuntu 22.04 image to the latest Ubuntu 22.04 LTS image family; this is a follow-up to the #1430 DNS fix.
Note: This works only for GCE deployments and does not take AWS / BYOC into account. The IP address is the same in AWS, so it should actually work, but it is not generic enough at the moment.
Note
On GCE, block gce-resolved.conf at image build and start Consul with a dynamically fetched GCE DNS recursor, reconfiguring systemd-resolved after Consul is up so Consul handles all DNS.
- Image build (Packer): use dpkg-divert to block gce-resolved.conf (/etc/systemd/resolved.conf.d/gce-resolved.conf) to avoid DNS conflicts with Consul.
- Startup script (start-client.sh):
  - Configure systemd-resolved to use only 127.0.0.1:8600; remove routing domains and disable GCE's gce-resolved.conf.
  - Start Consul with the GCE DNS server passed via --recursor.
  - Restart systemd-resolved; add DNS readiness checks and cache flush.

Written by Cursor Bugbot for commit 9fbd28e.
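The startup-script steps summarized above could be wired together roughly as follows. This is a sketch under stated assumptions: the PR does not show how the GCE DNS address is fetched, so reading the first `nameserver` entry from a resolv.conf-style file is a stand-in technique; the function names, the `consul.conf` drop-in name, and the Consul config directory are hypothetical.

```shell
# Print the first "nameserver" entry from a resolv.conf-style file.
# Assumption: stand-in for however the PR fetches the GCE DNS address.
first_nameserver() {
  awk '$1 == "nameserver" { print $2; exit }' "$1"
}

# Print a systemd-resolved drop-in that sends all lookups to Consul's DNS port.
# The address:port form in DNS= needs a reasonably recent systemd (246 or later).
render_resolved_dropin() {
  printf '[Resolve]\nDNS=127.0.0.1:8600\n'
}

# Illustrative wiring (needs root and a Consul install, so left commented out).
# The upstream resolver must be read BEFORE resolved is pointed at Consul:
#   recursor="$(first_nameserver /run/systemd/resolve/resolv.conf)"
#   consul agent -config-dir=/etc/consul.d -recursor="$recursor" &
#   render_resolved_dropin > /etc/systemd/resolved.conf.d/consul.conf
#   systemctl restart systemd-resolved
#   resolvectl flush-caches
```

Keeping the two helpers free of side effects makes them easy to check on a workstation before baking them into an image; only the commented wiring touches the system.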