configurable radius max retries and cascade-on-timeout suppression #10271

Closed
fmotzet wants to merge 1 commit into opnsense:master from fmotzet:auth-radius-mfa-retries-and-cascade

fmotzet commented May 7, 2026

Important notices

Before you submit a pull request, we ask you kindly to acknowledge the following:

If AI was used, please disclose:

  • Model used: Claude Opus 4.7
  • Extent of AI involvement: Code review and testing. I also tested all changes myself before submitting.

Describe the problem

OpenVPN authentication via RADIUS to an interactive-2FA backend (e.g. NPS with the NPS Extension for Microsoft Entra MFA) is unreliable due to a hardcoded retry count and an unconditional authmode-cascade on timeout. See #10270 for the full reproduction.


Describe the proposed solution

  1. Configurable radius_max_retries. Plumb the existing libradius max_tries parameter through OPNsense\Auth\Radius::setProperties() and add a UI field next to "Authentication Timeout" under System → Access → Servers. The class default of 3 is preserved when the field is empty, so existing installs are unchanged. Users with interactive 2FA backends can raise it (e.g. 20) to extend the per-attempt approval window without touching any other knob.
  2. Suppress authmode cascade on RADIUS timeout. Record the outcome of each authenticate() call on the Radius class as one of accept, reject, timeout, or error, exposed via a new getLastResult() method. In user_pass_verify.php, after a failed authenticate(), break out of the authmode foreach if the previous source returned timeout. Explicit Access-Reject still cascades, preserving multi-source flows such as a Local-then-RADIUS fall-through, or routing users that exist in NPS2 but not NPS1.
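The "default preserved when the field is empty" behaviour from item 1 can be sketched as below. This is a minimal illustration, not the code from this PR: `RadiusSketch` and `getMaxRetries()` are hypothetical names, and only the `radius_max_retries` key and the default of 3 come from the description above.

```php
<?php
// Hypothetical stand-in for the Radius authenticator; NOT the OPNsense class.
final class RadiusSketch
{
    // Class default that the description says is kept when the field is empty.
    private $maxRetries = 3;

    /**
     * Apply server settings as they would arrive from the config store.
     * @param array $config key/value settings, values are strings
     */
    public function setProperties(array $config): void
    {
        $value = (string)($config['radius_max_retries'] ?? '');
        // Override the default only for a non-empty, strictly positive integer.
        if (preg_match('/^[1-9][0-9]*$/', $value) === 1) {
            $this->maxRetries = (int)$value;
        }
    }

    // Hypothetical accessor, added here only so the effect can be inspected.
    public function getMaxRetries(): int
    {
        return $this->maxRetries;
    }
}
```

With this shape, an install that never sets the field keeps 3 tries, while a value such as `'20'` raises the limit. Rejecting non-numeric input silently is a choice made for the sketch; real form validation would likely report an error instead.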

The cascade-suppression check uses method_exists() against the authenticator, so the behaviour of non-Radius authenticators is unchanged.
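The cascade rule described above can be sketched roughly as follows. `RadiusStub` and `tryAuthSources()` are hypothetical names invented for this illustration and the loop is not the actual code in user_pass_verify.php; only `getLastResult()`, the four outcome strings and the `method_exists()` guard come from the description.

```php
<?php
// Hypothetical authenticator stub that always produces a fixed outcome.
final class RadiusStub
{
    private $outcome;
    private $lastResult = 'error';
    public $calls = 0;

    public function __construct(string $outcome)
    {
        $this->outcome = $outcome; // one of: accept, reject, timeout, error
    }

    public function authenticate(string $user, string $pass): bool
    {
        $this->calls++;
        $this->lastResult = $this->outcome;
        return $this->outcome === 'accept';
    }

    public function getLastResult(): string
    {
        return $this->lastResult;
    }
}

/**
 * Try each source in order; return the name of the source that accepted,
 * or null if none did.
 * @param array $sources map of source name => authenticator object
 */
function tryAuthSources(array $sources, string $user, string $pass): ?string
{
    foreach ($sources as $name => $auth) {
        if ($auth->authenticate($user, $pass)) {
            return $name;
        }
        // Stop cascading only when the source reports a timeout; authenticators
        // without getLastResult() keep the existing fall-through behaviour.
        if (method_exists($auth, 'getLastResult') && $auth->getLastResult() === 'timeout') {
            break;
        }
    }
    return null;
}
```

In this sketch a `timeout` from the first source means the second source is never contacted, whereas a `reject` still falls through to it.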

Validation against the live setup described in the issue: with radius_max_retries=20, an end-to-end MFA approval that previously failed at ~9 seconds now succeeds at any point within the configured budget. With cascade suppression in place, a no-tap negative test produces 20 retransmits to the primary NPS, zero traffic to the secondary, and a single-line warning in syslog: auth source '...' did not respond ... skipping remaining auth sources.

Compared to the log in the related issue, requests now go to the IP of only one NPS server. Authentication still times out once radius_max_retries=20 is reached:

14:49:47.641163 IP 10.20.30.12.64201 > 10.20.30.31.1812: RADIUS, Access-Request (1), id: 0x7f length: 120
14:49:50.673553 IP 10.20.30.12.64201 > 10.20.30.31.1812: RADIUS, Access-Request (1), id: 0x7f length: 120
14:49:53.827342 IP 10.20.30.12.64201 > 10.20.30.31.1812: RADIUS, Access-Request (1), id: 0x7f length: 120
14:49:56.935630 IP 10.20.30.12.64201 > 10.20.30.31.1812: RADIUS, Access-Request (1), id: 0x7f length: 120
14:50:00.068626 IP 10.20.30.12.64201 > 10.20.30.31.1812: RADIUS, Access-Request (1), id: 0x7f length: 120
14:50:03.088720 IP 10.20.30.12.64201 > 10.20.30.31.1812: RADIUS, Access-Request (1), id: 0x7f length: 120
14:50:06.127568 IP 10.20.30.12.64201 > 10.20.30.31.1812: RADIUS, Access-Request (1), id: 0x7f length: 120
14:50:09.310343 IP 10.20.30.12.64201 > 10.20.30.31.1812: RADIUS, Access-Request (1), id: 0x7f length: 120
14:50:12.498892 IP 10.20.30.12.64201 > 10.20.30.31.1812: RADIUS, Access-Request (1), id: 0x7f length: 120
14:50:15.671278 IP 10.20.30.12.64201 > 10.20.30.31.1812: RADIUS, Access-Request (1), id: 0x7f length: 120
14:50:18.731801 IP 10.20.30.12.64201 > 10.20.30.31.1812: RADIUS, Access-Request (1), id: 0x7f length: 120
14:50:21.791243 IP 10.20.30.12.64201 > 10.20.30.31.1812: RADIUS, Access-Request (1), id: 0x7f length: 120
14:50:24.815799 IP 10.20.30.12.64201 > 10.20.30.31.1812: RADIUS, Access-Request (1), id: 0x7f length: 120
14:50:27.849564 IP 10.20.30.12.64201 > 10.20.30.31.1812: RADIUS, Access-Request (1), id: 0x7f length: 120
14:50:30.913266 IP 10.20.30.12.64201 > 10.20.30.31.1812: RADIUS, Access-Request (1), id: 0x7f length: 120
14:50:34.001625 IP 10.20.30.12.64201 > 10.20.30.31.1812: RADIUS, Access-Request (1), id: 0x7f length: 120
14:50:37.100363 IP 10.20.30.12.64201 > 10.20.30.31.1812: RADIUS, Access-Request (1), id: 0x7f length: 120
14:50:40.107626 IP 10.20.30.12.64201 > 10.20.30.31.1812: RADIUS, Access-Request (1), id: 0x7f length: 120
14:50:43.128629 IP 10.20.30.12.64201 > 10.20.30.31.1812: RADIUS, Access-Request (1), id: 0x7f length: 120
14:50:46.208645 IP 10.20.30.12.64201 > 10.20.30.31.1812: RADIUS, Access-Request (1), id: 0x7f length: 120
14:51:29.350378 IP 10.20.30.31.1812 > 10.20.30.12.64201: RADIUS, Access-Reject (3), id: 0x7f length: 38
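The capture shows 20 Access-Requests spaced roughly 3 seconds apart, which suggests the approval window scales as tries times the per-try timeout. A back-of-the-envelope sketch, assuming a constant per-try timeout of 3 seconds (inferred from the packet spacing, not a configured value stated in this PR) and a hypothetical helper name:

```php
<?php
// Rough approval window in seconds: number of tries times the per-try timeout.
// Assumes every try waits the full timeout and there is no extra backoff.
function approvalWindowSeconds(int $maxTries, float $perTryTimeout): float
{
    return $maxTries * $perTryTimeout;
}
```

Under that assumption, 3 tries give about 9 seconds, in line with the earlier failure point mentioned in the description, and 20 tries give about 60 seconds, in line with the span covered by the retransmits in the capture.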

Related issue

#10270

$pconfig['radius_acct_port'] = $a_server[$id]['radius_acct_port'] ?? '';
$pconfig['radius_secret'] = $a_server[$id]['radius_secret'] ?? '';
$pconfig['radius_timeout'] = $a_server[$id]['radius_timeout'] ?? '';
$pconfig['radius_max_retries'] = $a_server[$id]['radius_max_retries'] ?? '';
Member

We'd better move these to the Radius class, same as 95483e5.

}
return true;
}
// only cascade on a decisive answer; an upstream that didn't
Member

Let's focus on retries and not complicate the flow further than needed. If you only need a single upstream authenticator, just configure a single one.

@AdSchellevis
Member

#10270 (comment)
