Add cmd timeout, error detection and logging to wireguard re-resolver script #10156

Open
deajan wants to merge 1 commit into opnsense:master from netinvent:wireguard-dns-reresolve-script-feats


deajan commented Apr 15, 2026

Refactor the DNS re-resolution script to improve logging and command execution, make it generally more reliable, add a proper exit code, and make it portable for use in other projects (tested on OPNsense 26.1 and RHEL 9 so far).

Important notices

Before you submit a pull request, we ask you kindly to acknowledge the following:

Describe the problem

I'm currently investigating an issue where, after a WAN drop, WireGuard never reconnects (the handshake becomes stale) unless I restart the service on the local or remote OPNsense side.

Describe the proposed solution

This PR captures stderr from the commands, adds a timeout in case a subprocess command gets stuck, adds generic (rotated) logging for optional debugging, and catches and logs all possible errors.
It is a first step toward diagnosing what actually happens when WireGuard does not reconnect.
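
For illustration, the timeout and stderr capture described above could look roughly like the following. This is a minimal sketch and not the code from this PR: the helper name `run_cmd`, the 60-second default, and the exit codes 124 and 127 (borrowed from the conventions of GNU `timeout` and the shell) are assumptions made for the example.

```python
import subprocess


def run_cmd(args, timeout=60):
    """Run a command and return (exit_code, stdout, stderr).

    Hypothetical helper: the name, the default timeout and the synthetic
    exit codes are illustrative assumptions, not part of the PR.
    """
    try:
        proc = subprocess.run(
            args,
            capture_output=True,  # collect stdout and stderr instead of losing them
            text=True,            # decode output to str
            timeout=timeout,      # kill the child if it runs longer than this
        )
        return proc.returncode, proc.stdout, proc.stderr
    except subprocess.TimeoutExpired:
        # The command hung; report it instead of blocking forever.
        return 124, "", "timed out after %s seconds" % timeout
    except OSError as exc:
        # The command could not be started (missing binary, permissions, ...).
        return 127, "", str(exc)
```

A caller could then run something like `run_cmd(["wg", "show", "all", "latest-handshakes"])` and log the returned stderr whenever the exit code is non-zero.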

@AdSchellevis Please let me know if I'm out of line with this.
I would also like to add an optional "restart service" action for when the handshake is stale and updating the resolved FQDN doesn't fix the issue, but that's too broad for a diagnostic change right now.

Refactor DNS re-resolution script to improve logging and command execution, and make it portable for use in other projects (tested on OPNsense 26.1 and RHEL 9 so far)
@AdSchellevis
Member

@deajan Just try to keep it simple. All processes can emit to syslog; if messages are missing, I have no objection to adding them, but I do want to keep the script as simple as possible. Starting with a ticket that describes the problem to be solved is always a good starting point.

@deajan
Author

deajan commented Apr 15, 2026

@AdSchellevis I see your point, but since I'm trying to resolve the same issues on OPNsense that I encounter on my AlmaLinux setups when using WireGuard, I wanted to make a "can use it everywhere" solution based on your work. The result currently works on both platforms.

Are you suggesting that I remove the logfile support so that logs are caught from stdout by the caller (e.g. cron)?
I'm willing to adapt to what you suggest, but I honestly don't understand it right now.
Can you give me some insight?

Apart from the logging, I wanted to make the script "error proof(TM)", meaning that I want to at least catch any possible error and log it, and of course enforce a one-minute timeout so that execution cannot be frozen by an unresponsive shell command.
Is that set of ideas okay with you?

@AdSchellevis
Member

@deajan I'm OK with catching and logging errors, certainly. I just want to make sure the changes are small and focused.

@deajan
Author

deajan commented Apr 15, 2026

@AdSchellevis Sure, so what do you want me to remove or rework? I can make the logging part optional, but it's convenient to log things and use the same script when running on different platforms.
Sorry if I'm bugging you, but I genuinely don't understand what you want me to modify.

@AdSchellevis
Member

I'm just asking for a minimal change to my script that uses syslog as output. Looking at the request, it feels like this is about 5 to 10 lines of code, which is currently not the case. So keep it simple, starting with the goals we are trying to achieve.
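
As a rough idea of the scale being discussed, a syslog-only variant might be as small as the sketch below. It is an illustration under stated assumptions, not the existing script or the proposed patch: the function name `run_and_log`, the message wording, and the 60-second timeout are all hypothetical.

```python
import subprocess
import syslog


def run_and_log(args, timeout=60):
    """Run a command; on any failure, write one syslog line and return False.

    Hypothetical helper for illustration only.
    """
    try:
        proc = subprocess.run(args, capture_output=True, text=True, timeout=timeout)
    except (subprocess.TimeoutExpired, OSError) as exc:
        # Covers both a hung command and one that could not be started.
        syslog.syslog(syslog.LOG_ERR, "command %s failed: %s" % (args[0], exc))
        return False
    if proc.returncode != 0:
        # Include the captured stderr so the reason ends up in the system log.
        syslog.syslog(
            syslog.LOG_ERR,
            "command %s exited %d: %s" % (args[0], proc.returncode, proc.stderr.strip()),
        )
        return False
    return True
```

Because everything goes through syslog, no log file handling or rotation logic is needed in the script itself; that part is left to the platform's syslog configuration.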


2 participants