-
Notifications
You must be signed in to change notification settings - Fork 136
Protocols will notify when dhcpcd can exit #536
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
WalkthroughCentralizes DHCPv4 deconfiguration, delays IPv4 address removal when releasing, tightens STOPPING/EXITING guards across handlers, updates DHCPv6 release sequencing and callbacks, adds dhcpcd_dropped notifications for per-interface teardown, and coordinates daemon exit with interface stop tracking and an exit timeout. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant User as CLI/Signal
participant D as dhcpcd
participant IF as Interface
participant V4 as DHCPv4
participant V6 as DHCPv6
participant LLA as IPv4LL
participant K as Kernel
participant S as Hooks
User->>D: stop_all_interfaces(opts)
D->>IF: stop_interface() (mark STOPPING)
par per-protocol
IF->>V4: dhcp_drop(reason=RELEASE?)
alt RELEASE
V4->>K: send DHCPv4 RELEASE
V4-->>V4: schedule dhcp_deconfigure(+1s)
else DROP/other
V4->>V4: dhcp_deconfigure()
end
IF->>V6: dhcp6_startrelease()
V6->>V6: prepare release (MRC=REL_MAX_RC)
V6->>K: send DHCPv6 RELEASE
K-->>V6: RELEASE ACK (DH6S_RELEASE)
V6->>V6: dhcp6_finishrelease()
V6->>D: dhcpcd_dropped(ifp)
IF->>LLA: ipv4ll_drop()
LLA->>D: dhcpcd_dropped(ifp)
end
D->>D: check dhcpcd_ifrunning()
alt none running
D->>S: run STOPPED hooks
D-->>User: eloop_exit()
else waiting
D-->>User: schedule exit-timeout
end
sequenceDiagram
autonumber
participant RX as Packet RX
participant H as Handler
participant IF as Interface
participant C as Context
RX->>H: invoke handler
Note over H: guard checks: ctx EXITING / IF STOPPING / proto disabled
alt guard tripped
H-->>RX: return (skip)
else
H->>H: normal processing
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Areas to focus review on:
Possibly related PRs
Pre-merge checks and finishing touches❌ Failed checks (1 inconclusive)
✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (8)
src/dhcpcd.h (1)
276-276: Add documentation for the new public API.The function signature and placement look good. However, consider adding a documentation comment to clarify the purpose and usage of this new API, especially since it's part of the interface lifecycle coordination mechanism described in the PR.
For example:
/** * dhcpcd_dropped - Notify dhcpcd that a protocol has finished dropping an interface * @ifp: Interface that was dropped * * Called by protocol handlers (DHCP, DHCPv6, IPv4LL) after they have completed * releasing resources and dropping an interface. This allows dhcpcd to coordinate * clean shutdown when all protocols have finished. */ void dhcpcd_dropped(struct interface *);src/ipv6nd.c (1)
1928-1932: Update the comment for the new guard.The early-return now also covers the STOPPING state, so the comment that follows no longer reflects the full condition. Please tweak the comment to capture the broader guard.
src/dhcp.c (4)
2869-2904: dhcp_deconfigure: solid teardown; add a defensive NULL-state guardLooks correct and matches deconfig semantics. To harden against unexpected races (timer fires after state freed), add a quick state NULL check before deref.
Apply this minimal guard:
static void dhcp_deconfigure(void *arg) { struct interface *ifp = arg; - struct dhcp_state *state = D_STATE(ifp); + struct dhcp_state *state = D_STATE(ifp); + if (state == NULL) { + dhcpcd_dropped(ifp); + return; + } struct if_options *ifo = ifp->options;Also confirm ipv4_applyaddr only reads state->old/new during the call (they’re freed immediately after).
2927-2927: Reason pointer: const-correctness and lifetimeEnsure
state->reasonis declared asconst char *(notchar *) and thatreasonhas static/long‑lived storage; otherwise copy/duplicate it.
2972-2972: Direct deconfigure call is fine; optional cast
dhcp_deconfiguretakesvoid *. Direct call withifpis fine in C; add(void *)ifpcast if you want to silence strict warnings in some builds.
3959-3963: Nulling if_data after free: good; consider pre‑null for safetySetting
ifp->if_data[IF_DATA_DHCP] = NULLis correct. If reentrancy is a concern, consider nulling beforefree(state)to avoid transient readers seeing a stale pointer. Low priority.src/dhcpcd.c (2)
460-486: Refactor: Use distinct variable name in loop to avoid shadowing.The loop variable
ifpon line 478 shadows the function parameterifp. While technically valid, this is confusing and makes the code harder to follow, especially since the parameter represents a specific dropped interface while the loop variable iterates all interfaces.Apply this diff to improve clarity:
- TAILQ_FOREACH(ifp, ctx->ifaces, next) { - if (dhcpcd_ifrunning(ifp)) + struct interface *iface; + TAILQ_FOREACH(iface, ctx->ifaces, next) { + if (dhcpcd_ifrunning(iface)) break; } /* All interfaces have stopped, we can exit */ - if (ifp == NULL) + if (iface == NULL) eloop_exit(ctx->eloop, EXIT_SUCCESS);
2326-2327: Consider documenting the 50-second exit timeout rationale.The 50-second timeout serves as a safeguard against hung protocols, but the specific value appears arbitrary. Consider adding a comment explaining why 50 seconds was chosen (e.g., "Allow up to 50 seconds for DHCP RELEASE acknowledgements and network transmission delays") or defining it as a named constant.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
src/dhcp.c(7 hunks)src/dhcp6.c(3 hunks)src/dhcpcd.c(11 hunks)src/dhcpcd.h(1 hunks)src/ipv4ll.c(2 hunks)src/ipv6nd.c(1 hunks)src/route.c(1 hunks)
🔇 Additional comments (16)
src/route.c (1)
819-820: LGTM! Clean shutdown optimization.The guard prevents
if_missfilter_applyfrom being called during shutdown, which aligns with the PR's objective of ensuring clean DHCP RELEASE packet transmission before exit. The short-circuit evaluation ensures the function is never called whenDHCPCD_EXITINGis set, avoiding unnecessary operations during teardown.src/ipv4ll.c (2)
451-451: Verify timeout deletion with NULL callback and early placement.The
eloop_timeout_deletecall usesNULLas the callback parameter, which may delete all pending timeouts for this interface. Additionally, this deletion occurs before theDHCPCD_NODROPcheck at line 454.Verify that:
- Using
NULLas the callback parameter is the intended API usage and doesn't inadvertently cancel important timeouts- Deleting timeouts before the NODROP check doesn't cause issues if the function returns early
- This approach is consistent with similar cleanup in
dhcp.canddhcp6.c
486-487: Verify cleanup is reached in all code paths.The addition of
ipv4ll_freeanddhcpcd_droppedcorrectly implements the PR objective of notifying dhcpcd when the protocol can exit. The cleanup sequence (free resources → notify dropped) is logical and consistent with the PR's design.However, ensure that:
- These cleanup calls are reached in all necessary code paths (i.e., not skipped by early returns from the NODROP check at line 454 or other conditions)
ipv4ll_freeanddhcpcd_droppedare idempotent or protected against being called multiple times ifipv4ll_dropcan be invoked repeatedlysrc/dhcp6.c (2)
3609-3613: LGTM!The DH6S_RELEASE case properly handles the REPLY acknowledgement and triggers cleanup via
dhcp6_finishrelease. The state check ensures this only processes REPLYs when expecting them.
4297-4297: Verify no duplicate notifications.The placement of
dhcpcd_droppedindhcp6_freedropis correct for notifying dhcpcd when the protocol stops. However, ensure thatdhcp6_freedrop_addrs(called at line 2118 before RELEASE) does not also calldhcpcd_dropped, as this would result in duplicate notifications and potentially incorrect exit logic in dhcpcd.src/dhcp.c (3)
3132-3137: Early guard in handler: LGTMAvoids processing when not user‑active, stopping, or DHCP disabled. This should reduce spurious handling during shutdown.
Double‑check that no necessary “final” DHCP messages are dropped under DHCPCD_STOPPING (e.g., if packets arrive during shutdown).
2916-2916: Potential overbroad timer deletion (arg-wide)
eloop_timeout_delete(…, NULL, ifp)may cancel timeouts from other protocols that also useifpas the callback arg. Prefer deleting only DHCP timeouts or use a DHCP‑scoped wrapper.Run to confirm cross‑protocol use of
ifpas arg in timeouts:
2941-2956: DHCPv4 RELEASE delay is good; refine error path and timer deletion scope
- Good: delay removal 1s after RELEASE to allow send with configured IP.
- Consider handling
send_messagefailure: on error, skip the delay and deconfigure immediately.- Same timer scope concern:
eloop_timeout_delete(…, NULL, ifp)may cancel non‑DHCP timeouts.Apply:
- send_message(ifp, DHCP_RELEASE, NULL); - eloop_timeout_delete(ifp->ctx->eloop, NULL, ifp); + if (send_message(ifp, DHCP_RELEASE, NULL) == -1) { + dhcp_deconfigure(ifp); + return; + } + /* Delete only DHCP timers here if possible. */ + eloop_timeout_delete(ifp->ctx->eloop, NULL, ifp);If you keep arg‑wide deletion, please verify other protocols (e.g., DHCPv6, IPv4LL, IPv6ND) don’t pass
ifpas the timeout arg to avoid unintended cancellations (see script above).src/dhcpcd.c (8)
441-458: LGTM! Helper function correctly checks protocol states.The
dhcpcd_ifrunningfunction properly checks if any DHCP protocol (DHCPv4, IPv4LL, or DHCPv6) is still active on the interface. The conditional compilation guards ensure only available protocols are checked.
488-495: LGTM! Function signature change aligns with new notification pattern.Converting
stop_interfacetostatic voidis appropriate since the exit decision now depends on protocol notifications viadhcpcd_droppedrather than immediate return values.
780-781: LGTM! Early return prevents interference during shutdown.Skipping carrier change processing when the interface is stopping prevents race conditions and ensures clean shutdown.
1109-1109: LGTM! Proper shutdown flow for departed interfaces.Using
stop_interfacefor departed interfaces and the early return when exiting ensures consistent shutdown behavior and prevents processing new interface events during teardown.Also applies to: 1116-1117
1180-1181: LGTM! Prevents link event processing during shutdown.The early return when exiting is correct and consistent with similar guards in other event handlers.
1423-1431: LGTM! Return value enables proper wait-vs-exit decision.The boolean return value correctly indicates whether any interface was actually stopped, allowing callers to determine if they should wait for protocol notifications or exit immediately. This is a key part of the graceful shutdown mechanism described in the PR objectives.
Also applies to: 1444-1447
1563-1572: LGTM! Signal handler correctly implements notification-based exit.The changes properly:
- Prevent re-entrant signal handling during teardown
- Defer exit when interfaces are actively stopping (waits for
dhcpcd_droppednotifications)- Exit immediately in test mode or when no interfaces were stopped
This aligns with the PR objective of ensuring RELEASE packets are fully transmitted before exit.
1714-1718: LGTM! Consistent exit handling in command-line args path.The logic matches the signal handler approach, ensuring that explicit exit/release commands also wait for protocol notifications when interfaces are actively stopping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
src/dhcp6.c (1)
2128-2133: Inconsistent brace style in if-else statement.The
ifblock uses braces but theelsebranch does not. For consistency and readability, consider using braces on both branches.Apply this diff to add braces:
if (dhcp6_makemessage(ifp) == -1) { logerr("%s: %s", __func__, ifp->name); /* not much we can do apart from finish now */ dhcp6_finishrelease(ifp); - } else + } else { dhcp6_sendrelease(ifp); + }src/dhcpcd.c (2)
478-485: Variable shadowing in loop.The loop at line 478 reuses the parameter variable name
ifp, which shadows the function parameter. While functionally correct (since the parameter is no longer needed after line 473), this could be confusing for maintainability.Consider using a different variable name for the loop:
- TAILQ_FOREACH(ifp, ctx->ifaces, next) { - if (dhcpcd_ifrunning(ifp)) + struct interface *ifp_iter; + TAILQ_FOREACH(ifp_iter, ctx->ifaces, next) { + if (dhcpcd_ifrunning(ifp_iter)) break; } /* All interfaces have stopped, we can exit */ - if (ifp == NULL) + if (ifp_iter == NULL) eloop_exit(ctx->eloop, EXIT_SUCCESS);
2330-2331: Consider documenting the 50-second timeout rationale.The timeout value of 50 seconds appears to be chosen to allow sufficient time for the signaled process to exit. However, the choice of this specific value is not documented.
Consider adding a comment explaining the timeout duration, or defining it as a named constant:
+ /* Allow up to 50 seconds for process to exit gracefully */ eloop_timeout_add_sec(ctx->eloop, 50, dhcpcd_exit_timeout, &ctx);Or:
#define EXIT_WAIT_TIMEOUT 50 /* seconds to wait for process exit */ ... eloop_timeout_add_sec(ctx->eloop, EXIT_WAIT_TIMEOUT, dhcpcd_exit_timeout, &ctx);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/dhcp6.c(3 hunks)src/dhcpcd.c(11 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
src/dhcp6.c (1)
src/dhcpcd.c (1)
dhcpcd_dropped(460-486)
src/dhcpcd.c (4)
src/script.c (1)
script_runreason(745-802)src/eloop.c (1)
eloop_exit(652-660)src/if.c (1)
if_free(81-104)compat/pidfile.c (1)
pidfile_read(135-167)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: netbsd (--disable-ipv6, -DSMALL)
- GitHub Check: netbsd (--disable-ipv6)
- GitHub Check: netbsd (--disable-dhcp6)
- GitHub Check: netbsd (--disable-dhcp6, -DSMALL)
- GitHub Check: netbsd (--disable-ipv4ll, -DSMALL)
- GitHub Check: netbsd (--disable-arp, -DSMALL)
- GitHub Check: netbsd (--disable-arp)
- GitHub Check: netbsd (-DSMALL)
- GitHub Check: netbsd (--disable-ipv4ll)
- GitHub Check: netbsd
- GitHub Check: netbsd (--disable-ipv4)
- GitHub Check: netbsd (--disable-ipv4, -DSMALL)
- GitHub Check: freebsd
- GitHub Check: openbsd
🔇 Additional comments (11)
src/dhcp6.c (2)
3611-3615: LGTM! Proper RELEASE acknowledgment handling.The code correctly handles DHCPv6 RELEASE acknowledgment by logging and calling
dhcp6_finishrelease, which aligns with the PR objective to wait for acknowledgment before completing the release sequence.
4299-4299: LGTM! Proper integration with exit coordination.The call to
dhcpcd_droppedcorrectly notifies the core daemon that DHCPv6 has finished cleanup, enabling coordinated shutdown as described in the PR objectives.src/dhcpcd.c (9)
441-458: LGTM! Clear interface state check.The function correctly checks whether any protocol (DHCP, IPv4LL, DHCPv6) is still running on an interface, which is essential for coordinating shutdown.
489-495: LGTM! Simplified stop interface logic.The refactored function has a clearer single responsibility: mark the interface as stopping and drop protocols. The exit coordination is now properly handled by
dhcpcd_dropped.
780-781: LGTM! Proper shutdown guard.The early return when
DHCPCD_STOPPINGis set prevents unnecessary carrier state changes during shutdown, avoiding potential re-entry issues.
1116-1117: LGTM! Proper shutdown guard.The early return when
DHCPCD_EXITINGis set prevents interface discovery during shutdown, which could interfere with the orderly exit process.
1180-1181: LGTM! Proper shutdown guard.The early return when
DHCPCD_EXITINGis set prevents link event processing during shutdown, avoiding unnecessary work and potential state inconsistencies.
1423-1448: LGTM! Improved exit coordination.The boolean return value correctly indicates whether any protocols were stopped, which is used by callers to determine whether to wait for completion or exit immediately. This is a key part of the coordinated shutdown mechanism.
1563-1574: LGTM! Proper signal handler coordination.The modified logic correctly coordinates exit by:
- Preventing re-entry with the
DHCPCD_EXITINGguard- Deferring exit if protocols were stopped (waiting for
dhcpcd_droppedto be called)- Exiting immediately if nothing was running
This aligns with the PR objective to ensure protocols complete their shutdown sequences.
1714-1718: LGTM! Consistent control command handling.The exit coordination for control commands correctly uses the same pattern as the signal handler, ensuring consistent behavior whether shutdown is triggered by signal or control command.
1949-1969: LGTM! Improved pidfile handling addresses past review.The code now properly handles
pidfile_readreturning -1 in both timeout functions:
dhcpcd_pidfile_timeout: Exits with SUCCESS if process gone, continues polling otherwisedhcpcd_exit_timeout: Exits with SUCCESS if process gone, logs warning and fails if still running after timeoutThis resolves the past review concern about the confusing "pid -1 failed to exit" log message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/dhcp.c (1)
2949-2963: Delayed address removal for DHCPv4 RELEASE aligns with PR objectives.The 1-second delay before calling
dhcp_deconfigureallows the RELEASE packet to be transmitted before the address is removed. This addresses issue #513 by ensuring the packet is fully formed and sent before the interface is torn down.However, consider these refinements:
- Replace magic number with named constant:
+#define DHCP_RELEASE_DELAY_SEC 1 + static void dhcp_drop(struct interface *ifp, const char *reason) { ... - const struct timespec delay = { - .tv_sec = 1, - }; + const struct timespec delay = { + .tv_sec = DHCP_RELEASE_DELAY_SEC, + };
- Expand the comment to explain the rationale:
- /* We need to delay removal of the IP address so the - * message can be sent. - * Unlike DHCPv6, there is no acknowledgement. */ + /* DHCPv4 RELEASE has no acknowledgement from the server. + * Delay address removal to ensure the RELEASE packet is + * transmitted before the interface/address is torn down. + * See issue #513. */
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
hooks/dhcpcd-run-hooks.8.in(2 hunks)src/dhcp.c(7 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/dhcp.c (4)
src/auth.c (1)
dhcp_auth_reset(78-95)src/ipv4.c (1)
ipv4_applyaddr(746-833)src/script.c (1)
script_runreason(745-802)src/dhcpcd.c (1)
dhcpcd_dropped(460-488)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: netbsd (--disable-dhcp6, -DSMALL)
- GitHub Check: netbsd (--disable-dhcp6)
- GitHub Check: netbsd (--disable-ipv6, -DSMALL)
- GitHub Check: netbsd (--disable-arp, -DSMALL)
- GitHub Check: netbsd (--disable-ipv6)
- GitHub Check: netbsd (--disable-ipv4ll, -DSMALL)
- GitHub Check: netbsd (--disable-ipv4)
- GitHub Check: netbsd (--disable-arp)
- GitHub Check: netbsd (--disable-ipv4ll)
- GitHub Check: netbsd (--disable-ipv4, -DSMALL)
- GitHub Check: netbsd (-DSMALL)
- GitHub Check: netbsd
- GitHub Check: freebsd
- GitHub Check: openbsd
🔇 Additional comments (8)
hooks/dhcpcd-run-hooks.8.in (2)
25-25: LGTM!The man page date update is appropriate for documenting the changes in this PR.
108-109: Well-documented new hook reason.The
RELEASE | RELEASE6entry accurately documents the new hook reason for when dhcpcd releases a lease. The format, placement, and description are consistent with existing entries and align perfectly with the PR objectives to properly handle DHCP RELEASE packets.src/dhcp.c (6)
2869-2912: Well-structured centralization of deconfiguration logic.This new function consolidates cleanup sequences that were previously scattered, improving maintainability. The logic correctly handles:
- Authentication state reset
- State machine transition to DHS_NONE
- Resource lifecycle (offer/old/new pointers)
- Address application/removal via
ipv4_applyaddr- Script execution with appropriate reason
- Coordination with core via
dhcpcd_droppedwhen stoppingThe conditional flow based on
DHCPCD_STOPPINGensures proper shutdown sequencing.
2924-2924: Good addition of notification for early-exit path.Calling
dhcpcd_dropped(ifp)when the state is absent or inactive ensures the core is notified that the interface has stopped, maintaining proper coordination with the shutdown sequence described in the PR objectives.
2960-2960: Verify the scope of timeout deletion is intentional.The call
eloop_timeout_delete(ifp->ctx->eloop, NULL, ifp)with a NULL callback pattern deletes all pending timeouts for this interface. This is aggressive and will cancel any scheduled operations (discover, renew, rebind, etc.).While this is likely correct for a clean RELEASE shutdown, please confirm:
- This behavior is intentional and necessary to prevent interference with the release sequence
- No other operations need to complete before the delayed
dhcp_deconfigureexecutesIf this is intended, consider adding a comment explaining why all timeouts are cleared:
/* Clear all pending DHCP operations (discover, renew, etc.) to ensure * clean release shutdown without interference. */ eloop_timeout_delete(ifp->ctx->eloop, NULL, ifp);
2980-2980: LGTM: Centralized cleanup call.Replacing inline cleanup with the new
dhcp_deconfigurefunction improves maintainability and consistency.
3140-3144: Essential guard prevents processing during shutdown.This early-return check prevents DHCP message handling when:
- The interface is not user-active
- The STOPPING flag is set (shutdown in progress)
- DHCP is not enabled for the interface
This is critical for clean shutdown coordination and prevents race conditions where DHCP responses arrive after the stop sequence has begun, addressing the core PR objective that "protocols notify when dhcpcd can exit."
3970-3970: Good practice: Clear pointer after free.Setting
ifp->if_data[IF_DATA_DHCP] = NULLafter freeing the DHCP state prevents use-after-free bugs and makes debugging easier by ensuring the pointer doesn't dangle. This follows defensive programming best practices.
|
I will download this patch and give it a try. |
|
I think it still appears to garble outgoing DHCPv6 Release6 message, which appears to be getting cut off. |
|
That packet was captured from the server side. |
|
I have download you patch and give it a try on the dhcpcd version 10.2.2 baseline, Thread 1 (LWP 8014): (gdb) See was_link_up was false. It seems we can get some network event even for interface which is not up. What is a bit weird to me? So, after I applied your patch, see attached. protocols_will_notify_dhcpcd_can_exit.txt Just moved DHCPCD_STOPPING check below the carrier-down handling block, after basic interface validity has been verified. |
It looks like dhcpcd had two commands to stop. How are you stopping dhcpcd? |
Thanks for that. I committed a slightly different patch as we still need to return early I think. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/dhcpcd.c (1)
460-488: Consider using a separate loop variable to avoid shadowing the parameter.The function correctly coordinates interface teardown, but the
TAILQ_FOREACHat line 480 reuses theifpparameter as the loop iterator. While this pattern is common and works here (the parameter is no longer needed after line 475), it's error-prone and can confuse readers.Apply this diff to use a separate loop variable:
void dhcpcd_dropped(struct interface *ifp) { struct dhcpcd_ctx *ctx = ifp->ctx; + struct interface *ifp1; if (ifp->options == NULL || !(ifp->options->options & DHCPCD_STOPPING) || dhcpcd_ifrunning(ifp)) return; /* De-activate the interface */ if (ifp->active) { ifp->active = IF_INACTIVE; ifp->options->options &= ~DHCPCD_STOPPING; script_runreason(ifp, "STOPPED"); } if (!(ctx->options & DHCPCD_EXITING)) return; - TAILQ_FOREACH(ifp, ctx->ifaces, next) { - if (dhcpcd_ifrunning(ifp)) + TAILQ_FOREACH(ifp1, ctx->ifaces, next) { + if (dhcpcd_ifrunning(ifp1)) break; } /* All interfaces have stopped, we can exit */ - if (ifp == NULL) + if (ifp1 == NULL) eloop_exit(ctx->eloop, EXIT_SUCCESS); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/dhcpcd.c(11 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/dhcpcd.c (4)
src/script.c (1)
script_runreason(745-802)src/eloop.c (1)
eloop_exit(652-660)src/if.c (1)
if_free(81-104)compat/pidfile.c (1)
pidfile_read(135-167)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: netbsd (--disable-ipv6, -DSMALL)
- GitHub Check: netbsd (--disable-ipv4, -DSMALL)
- GitHub Check: netbsd (--disable-dhcp6, -DSMALL)
- GitHub Check: netbsd (--disable-arp)
- GitHub Check: netbsd (--disable-arp, -DSMALL)
- GitHub Check: netbsd (--disable-dhcp6)
- GitHub Check: netbsd (--disable-ipv4ll)
- GitHub Check: netbsd (--disable-ipv4ll, -DSMALL)
- GitHub Check: netbsd (--disable-ipv6)
- GitHub Check: netbsd (--disable-ipv4)
- GitHub Check: netbsd
- GitHub Check: netbsd (-DSMALL)
- GitHub Check: freebsd
- GitHub Check: openbsd
🔇 Additional comments (11)
src/dhcpcd.c (11)
441-458: LGTM!The
dhcpcd_ifrunninghelper correctly checks whether any protocol (DHCPv4, IPv4LL, DHCPv6) is still active on the interface. The logic is straightforward and aligns with the PR's goal of tracking running protocols before allowing exit.
490-497: LGTM!The refactored
stop_interfacecorrectly sets theSTOPPINGflag before callingdhcpcd_drop, establishing the contract thatdhcpcd_droppedlater checks to determine when teardown is complete.
782-791: LGTM!The guard correctly prevents carrier events from interfering during shutdown. The dual check (interface options vs. context) handles both active and inactive interfaces appropriately, and the early return avoids unnecessary protocol churn during teardown.
1126-1127: LGTM!The guard correctly prevents new interface events from being processed during exit, avoiding races where an interface addition could interfere with the shutdown sequence.
1190-1191: LGTM!The guard prevents link state changes from triggering protocol actions during exit, maintaining consistency with the shutdown guards in other handlers.
1433-1458: LGTM with observation.The refactored function correctly:
- Returns a bool indicating whether any interfaces were stopped (enabling callers to decide whether to wait for completion)
- Processes interfaces in reverse order for proper teardown sequencing
- Sets both
STOPPINGandEXITINGflagsNote that line 1453 sets
EXITINGon interface options in addition to the context-level flag (line 1439). This duplication might be intentional for per-interface tracking, but consider whether both are necessary.
1573-1582: LGTM!The modified signal handler correctly:
- Guards against re-processing SIGTERM/SIGINT during teardown (preventing memory issues as noted in the comment)
- Uses the return value from
stop_all_interfacesto determine whether to wait for protocol completion or exit immediately- Defers the exit to
dhcpcd_droppedwhen interfaces are still runningThis is the core coordination logic that addresses the PR objectives.
1724-1728: LGTM!The control command handler correctly mirrors the signal handler's coordination pattern, deferring exit until protocols complete when interfaces are actually stopped.
1959-1963: LGTM!The modification correctly implements polling behavior, rescheduling the timeout check every 100ms until the pidfile indicates the process has exited.
2340-2341: LGTM!The 50-second backstop timeout prevents indefinite hanging when waiting for another dhcpcd process to exit, providing a reasonable escape hatch.
1966-1979: Past review comment not fully addressed.A previous review (marked as "✅ Addressed in commit cd317bf") flagged that if
pidfile_readreturns -1, logging "pid -1 failed to exit" is misleading (lines 1972-1976). However, the current code does not explicitly handle the -1 case before logging. The same check exists indhcpcd_pidfile_timeoutat line 1959, but is missing here.Apply this diff to handle the error case as previously suggested:
static void dhcpcd_exit_timeout(void *arg) { struct dhcpcd_ctx *ctx = arg; pid_t pid; pid = pidfile_read(ctx->pidfile); - logwarnx("pid %lld failed to exit", (long long)pid); - eloop_exit(ctx->eloop, EXIT_FAILURE); + if (pid == -1) + eloop_exit(ctx->eloop, EXIT_SUCCESS); + else { + logwarnx("pid %lld failed to exit", (long long)pid); + eloop_exit(ctx->eloop, EXIT_FAILURE); + } }Likely an incorrect or invalid review comment.
|
With my patch no crashes. But based on pmd in offline gdb I think checking ifp->ctx->options can crash again, see below value is something wild: (gdb) p *ifp->ctx maybe just to do: void for inactive interface we will have ifp->options == NULL and skip the above check safely. we will have was_link_up = false and we will return |
Same as in #513 On system reboot, it stops those services in reverse order, sending systemctl stop to [email protected], then stop to dhcpcd.service Both service files in my testing have --release (@.service includes the interface). My current workaround is to omit the ExecStop from [email protected], relying on ExecReload instead, and only doing a --release in the main dhcpcd.service. And that seems to work ok. If that's a valid workaround, I'm ok with moving forward marking this as a fix for the other issues. |
|
any comment on dhcpcd_handlecarrier() ifp->options->options check? |
|
After we delivered your : Protocols will notify when dhcpcd can exit patch with that additional updated in dhcpcd_handlecarrier() we entered through dhcp_start1(), which probably was called from dhcp_start() This is pmd backtrace Program terminated with signal SIGSEGV, Segmentation fault. (gdb) bt full (gdb) thread apply all bt Thread 1 (LWP 7836): surely problem in dhcp_discover() is access to null pointer: state->state void from offline debugger above we can see: (gdb) p *ifp (gdb) p *ifp->options->options so, we can not use checking of DHCPCD_STOPPING at this point. quick fix would be sanity check, like: void But could we fix this somewhere earlier? |
|
As we need to resolve these segmentation crashes on dhcpcd stop. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
hooks/dhcpcd-run-hooks.8.in(2 hunks)src/dhcp.c(7 hunks)src/dhcp6.c(3 hunks)src/dhcpcd.c(11 hunks)src/dhcpcd.h(1 hunks)src/ipv4ll.c(2 hunks)src/ipv6nd.c(1 hunks)src/route.c(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- hooks/dhcpcd-run-hooks.8.in
- src/dhcpcd.h
- src/route.c
- src/dhcp.c
- src/ipv6nd.c
🧰 Additional context used
🧬 Code graph analysis (3)
src/dhcpcd.c (4)
src/script.c (1)
script_runreason(745-802)src/eloop.c (1)
eloop_exit(556-561)src/if.c (1)
if_free(81-104)compat/pidfile.c (1)
pidfile_read(135-167)
src/dhcp6.c (1)
src/dhcpcd.c (1)
dhcpcd_dropped(460-488)
src/ipv4ll.c (1)
src/dhcpcd.c (1)
dhcpcd_dropped(460-488)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: freebsd
- GitHub Check: netbsd (--disable-arp)
- GitHub Check: netbsd (--disable-dhcp6)
- GitHub Check: netbsd (--disable-ipv4ll)
- GitHub Check: netbsd (--disable-arp, -DSMALL)
- GitHub Check: netbsd (--disable-ipv6, -DSMALL)
- GitHub Check: netbsd (--disable-ipv4)
- GitHub Check: netbsd (--disable-dhcp6, -DSMALL)
- GitHub Check: netbsd (--disable-ipv6)
- GitHub Check: netbsd (--disable-ipv4, -DSMALL)
- GitHub Check: netbsd (--disable-ipv4ll, -DSMALL)
- GitHub Check: netbsd
- GitHub Check: netbsd (-DSMALL)
- GitHub Check: openbsd
DHCPv6 RELEASE requires the addresses to be dropped before a RELEASE message is sent. We now wait for an acknowledgement or a timeout before notifying that DHCPv6 has stopped for the interface. DHCPv4 RELEASE is the other way around, there is no acknowledgement. So we wait for 1 second after sending the message before removing the address and notifying DHCP has stopped for the interface. If we are not releasing then we notify dhcpcd that the protocol has stopped right away when we drop the lease. dhcpcd will exit once there are no running protocols for the interfaces. Fixes #513. Hopefully #535, #519 and #509 as well.
If allows an interface to stop even if dhcpcd is not.
While here address some hangup issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (1)
src/dhcp6.c (1)
2113-2132: DHCPv6 RELEASE should drop addresses before sending (but keep IA list for the message)To meet the PR objective and RFC 8415 intent, remove the IPv6 addresses from the OS before the RELEASE is transmitted, while still including IA/IAADDR options in the RELEASE. Safe point: after
dhcp6_makemessage()succeeds, beforedhcp6_sendrelease().Also ensure we never clear the in‑memory IA list before the OS drop, otherwise kernel addresses remain configured (a previously reported pitfall).
Apply this change:
static void dhcp6_startrelease(struct interface *ifp) { struct dhcp6_state *state; state = D6_STATE(ifp); if (state->state != DH6S_BOUND) { dhcp6_finishrelease(ifp); return; } state->state = DH6S_RELEASE; state->RTC = 0; state->IMD = REL_MAX_DELAY; state->IRT = REL_TIMEOUT; state->MRT = REL_MAX_RT; state->MRC = REL_MAX_RC; state->MRCcallback = dhcp6_finishrelease; - if (dhcp6_makemessage(ifp) == -1) { + if (dhcp6_makemessage(ifp) == -1) { logerr("%s: %s", __func__, ifp->name); /* not much we can do apart from finish now */ dhcp6_finishrelease(ifp); - } else - dhcp6_sendrelease(ifp); + } else { + /* Drop non-delegated addresses from the OS before sending RELEASE. */ + dhcp6_freedrop_addrs(ifp, 1, IPV6_AF_DELEGATED, NULL); + dhcp6_sendrelease(ifp); + } }Additionally, double‑check no other paths clear the IA list without dropping kernel addrs. One known spot is the lease‑read failure path; consider flipping the “drop” flag there as well:
- dhcp6_freedrop_addrs(ifp, 0, IPV6_AF_DELEGATED, NULL); + dhcp6_freedrop_addrs(ifp, 1, IPV6_AF_DELEGATED, NULL);(Located in dhcp6_readlease() error path.)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
hooks/dhcpcd-run-hooks.8.in(2 hunks)src/dhcp.c(7 hunks)src/dhcp6.c(3 hunks)src/dhcpcd.c(10 hunks)src/dhcpcd.h(1 hunks)src/ipv4ll.c(2 hunks)src/ipv6nd.c(1 hunks)src/privsep.c(2 hunks)src/route.c(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/dhcp.c
🧰 Additional context used
🧬 Code graph analysis (5)
src/ipv4ll.c (1)
src/dhcpcd.c (1)
dhcpcd_dropped(460-488)
src/dhcp6.c (1)
src/dhcpcd.c (1)
dhcpcd_dropped(460-488)
src/dhcpcd.h (1)
src/dhcpcd.c (1)
dhcpcd_dropped(460-488)
src/route.c (1)
src/if-bsd.c (1)
if_missfilter_apply(1638-1660)
src/dhcpcd.c (4)
src/script.c (1)
script_runreason(745-802)src/eloop.c (1)
eloop_exit(556-561)src/if.c (1)
if_free(81-104)compat/pidfile.c (1)
pidfile_read(135-167)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: netbsd (--disable-dhcp6)
- GitHub Check: netbsd (--disable-ipv6)
- GitHub Check: netbsd (--disable-ipv4ll)
- GitHub Check: netbsd (--disable-ipv4)
- GitHub Check: netbsd (--disable-ipv6, -DSMALL)
- GitHub Check: netbsd
- GitHub Check: netbsd (--disable-dhcp6, -DSMALL)
- GitHub Check: netbsd (--disable-arp, -DSMALL)
- GitHub Check: netbsd (--disable-ipv4, -DSMALL)
- GitHub Check: netbsd (--disable-ipv4ll, -DSMALL)
- GitHub Check: netbsd (--disable-arp)
- GitHub Check: netbsd (-DSMALL)
- GitHub Check: freebsd
- GitHub Check: openbsd
🔇 Additional comments (12)
src/dhcpcd.c (2)
782-791: Guarding STOPPING carrier eventsThe early return for STOPPING interfaces avoids resurrecting link-state work while teardown is in flight, matching the crash reports.
1965-1976: Appreciate the explicit exit timeoutThe pidfile-based timeout keeps the CLI wait path bounded instead of hanging forever if the manager never exits.
src/dhcpcd.h (1)
270-270: LGTM: new public API declaration matches implementation
void dhcpcd_dropped(struct interface *);aligns with the implementation and usage across modules. No issues.src/route.c (1)
827-829: Good guard to avoid route miss-filter churn during exitSkipping
if_missfilter_apply()whenDHCPCD_EXITINGis set reduces shutdown races. Looks correct.src/privsep.c (2)
1139-1143: Hangup triggers clean stop pathTreating
ELE_HANGUPas zero-length read and reusing the stop path is sound; avoids duplicated cleanup.
1169-1177: Unified stop path sets EXITING and exits eloopThe consolidated
stop:path properly setsDHCPCD_EXITING, frees processes, and exits with success on orderly hangup. Looks good.hooks/dhcpcd-run-hooks.8.in (2)
25-25: Manpage date bumpDate update is fine.
108-110: Documented RELEASE | RELEASE6 hook reasonsClear and consistent with other reason entries.
src/ipv6nd.c (1)
1929-1932: Avoid RA processing while stoppingEarly-returning when
DHCPCD_STOPPINGis set prevents teardown races. Looks correct.src/ipv4ll.c (1)
451-451: Good: cancel pending timeouts for this interfaceDeleting timeouts keyed by
ifpavoids callbacks during teardown.src/dhcp6.c (2)
3609-3613: REPLY handling for RELEASE ack is correctLogging and invoking
dhcp6_finishrelease(ifp)on RELEASE acknowledgement aligns with the new sequencing.
4297-4297: Notify manager after DHCPv6 teardownCalling
dhcpcd_dropped(ifp)here is the right hook to unblock daemon exit once all protocols stop.
| dhcpcd_dropped(struct interface *ifp) | ||
| { | ||
| struct dhcpcd_ctx *ctx = ifp->ctx; | ||
|
|
||
| /* Delete all timeouts for the interfaces */ | ||
| eloop_q_timeout_delete(ctx->eloop, ELOOP_QUEUE_ALL, NULL, ifp); | ||
| if (ifp->options == NULL || | ||
| !(ifp->options->options & DHCPCD_STOPPING) || | ||
| dhcpcd_ifrunning(ifp)) | ||
| return; | ||
|
|
||
| /* De-activate the interface */ | ||
| ifp->active = IF_INACTIVE; | ||
| ifp->options->options &= ~DHCPCD_STOPPING; | ||
| if (ifp->active) { | ||
| ifp->active = IF_INACTIVE; | ||
| ifp->options->options &= ~DHCPCD_STOPPING; | ||
| script_runreason(ifp, "STOPPED"); | ||
| } | ||
|
|
||
| if (!(ctx->options & (DHCPCD_MANAGER | DHCPCD_TEST))) | ||
| eloop_exit(ctx->eloop, EXIT_FAILURE); | ||
| if (!(ctx->options & DHCPCD_EXITING)) | ||
| return; | ||
|
|
||
| TAILQ_FOREACH(ifp, ctx->ifaces, next) { | ||
| if (dhcpcd_ifrunning(ifp)) | ||
| break; | ||
| } | ||
|
|
||
| /* All interfaces have stopped, we can exit */ | ||
| if (ifp == NULL) | ||
| eloop_exit(ctx->eloop, EXIT_SUCCESS); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Preserve deferred exit status
When SIGINT (and any non-success signal) arrives we set exit_code = EXIT_FAILURE, but once stop_all_interfaces finds active work we return from the handler and rely on dhcpcd_dropped to terminate. This helper now unconditionally calls eloop_exit(..., EXIT_SUCCESS), so a SIGINT ends up exiting with status 0 whenever an interface was active, breaking the documented signal semantics for callers and automation. Please stash the intended exit code (e.g., a new ctx->pending_exit_code) before returning from the signal handler and reuse it here instead of hard-coding EXIT_SUCCESS.
🤖 Prompt for AI Agents
In src/dhcpcd.c around lines 461-488, dhcpcd_dropped currently calls
eloop_exit(ctx->eloop, EXIT_SUCCESS) unconditionally which ignores any deferred
non-success signal exit status; change it to use a pending exit code stored on
the context (e.g., ctx->pending_exit_code) so the final exit honors the
signal-set value: ensure the signal handler sets ctx->pending_exit_code to
EXIT_FAILURE (or the appropriate non-success value) before returning, initialize
ctx->pending_exit_code to EXIT_SUCCESS where ctx is created, and replace the
hard-coded EXIT_SUCCESS in the eloop_exit call with ctx->pending_exit_code so
deferred exits return the stashed status.
|
I see you delivered this patch: as pasted before, I think this will crash as ctx is wild. I did just: --------------------------------- src/dhcpcd.c ---------------------------------
Also we need to guard in dhcp_discover(void *arg) as state was null. But I will try your additional patches anyway. |
That is not possible as ctx is never removed from the ifp and it's set as the first thing after calloc. |
|
I tested release branch with my test case (release) and it no longer garbles the DHCPv6 Release packet. it is successfully received by the server, as is the DHCPv4 release. |
|
From the logs, there's still some issues with the exit. |
That looks like an issue with epoll(7) which we recently moved to. It might only work if we are looking at fd's - in this case we are not and just using it for a timeout which ppoll(2) supports. |
As it more matches ppoll semantics. Note that epoll_pwait does NOT work with zero events, so use ppoll in this case.
|
@ColinMcInnes your issue is now solved. But I have a new one I can replicate when dhcpcd exits - |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/eloop.c(2 hunks)src/privsep-linux.c(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: netbsd (--disable-dhcp6)
- GitHub Check: netbsd (--disable-ipv4ll, -DSMALL)
- GitHub Check: netbsd (--disable-ipv6, -DSMALL)
- GitHub Check: netbsd (--disable-dhcp6, -DSMALL)
- GitHub Check: netbsd (--disable-ipv4ll)
- GitHub Check: netbsd (--disable-ipv4, -DSMALL)
- GitHub Check: netbsd (--disable-ipv6)
- GitHub Check: netbsd (--disable-ipv4)
- GitHub Check: netbsd (--disable-arp)
- GitHub Check: netbsd (--disable-arp, -DSMALL)
- GitHub Check: netbsd (-DSMALL)
- GitHub Check: netbsd
- GitHub Check: freebsd
- GitHub Check: openbsd
🔇 Additional comments (2)
src/privsep-linux.c (1)
328-330: LGTM!The addition of
__NR_epoll_pwait2to the seccomp filter properly mirrors the existing__NR_epoll_pwaitpattern and aligns with the new epoll_pwait2 usage introduced in src/eloop.c.src/eloop.c (1)
938-964: Timeout conversion math is mathematically sound and handles edge cases correctly.The overflow detection logic properly checks both components: if seconds exceed INT_MAX/1000 or if the combined millisecond value would overflow. The rounding formula
(ts->tv_nsec + 999999) / 1000000correctly rounds nanoseconds up to milliseconds. The code also properly handles the NULL timeout case (returning -1).

DHCPv6 RELEASE requires the addresses to be dropped before a RELEASE message is sent.
We now wait for an acknowledgement or a timeout before notifying that DHCPv6 has stopped for the interface.
DHCPv4 RELEASE is the other way around, there is no acknowledgement. So we wait for 1 second after sending the message before removing the address and notifying DHCP has stopped for the interface.
If we are not releasing then we notify dhcpcd that the protocol has stopped right away when we drop the lease.
dhcpcd will exit once there are no running protocols for the interfaces.
Fixes #513.
Hopefully #535, #519 and #509 as well.