api: replace clock_t with high-res type#623
Conversation
|
Caution Review failedPull request was closed or merged during review 📝 WalkthroughWalkthroughThis PR replaces microsecond-based timing with a nanosecond clock API (gr_clock_ns_t, GR_NS_PER_S, gr_clock_ns()) and updates all uses and stored timestamps to that type. Struct fields (bond/LACP, nexthop, ICMP v4/v6 responses, L2 FDB, conntrack) were changed to gr_clock_ns_t. All timing arithmetic—aging, timeouts, ICMP payload timestamps, periodic scheduling, and CLI/display calculations—now use gr_clock_ns() and convert to seconds via GR_NS_PER_S. Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@modules/policy/cli/conntrack.c`:
- Line 58: Clamp the computed elapsed time before formatting: compute a
signed/unsigned elapsed variable as (now - conn->last_update), check if elapsed
is negative and set it to 0 if so, then pass (elapsed / GR_NS_PER_S) to
gr_table_cell instead of the raw (now - conn->last_update); update the call site
referenced by gr_table_cell and the conn->last_update/now calculation to use
this clamped elapsed value so LAST_UPDATE never displays negative seconds.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e4a55c0c-141f-4bbb-bf8f-66fb4ee6e298
📒 Files selected for processing (26)
api/gr_clock.hcli/main.cmodules/infra/control/bond.hmodules/infra/control/l3_nexthop.cmodules/infra/control/lacp.cmodules/infra/control/nexthop.hmodules/ip/api/gr_ip4.hmodules/ip/control/icmp.cmodules/ip/control/nexthop.cmodules/ip/datapath/arp_output_request.cmodules/ip/datapath/icmp_input.cmodules/ip/datapath/icmp_local_send.cmodules/ip6/api/gr_ip6.hmodules/ip6/control/icmp6.cmodules/ip6/control/nexthop.cmodules/ip6/datapath/icmp6_input.cmodules/ip6/datapath/icmp6_local_send.cmodules/ip6/datapath/ndp_ns_output.cmodules/l2/api/gr_l2.hmodules/l2/cli/fdb.cmodules/l2/control/fdb.cmodules/policy/api/gr_conntrack.hmodules/policy/cli/conntrack.cmodules/policy/control/conntrack.cmodules/policy/control/conntrack.hsmoke/fib_inject.c
| gr_table_cell(table, 7, "%u", ntohs(conn->fwd_flow.src_id)); | ||
| gr_table_cell(table, 8, "%u", ntohs(conn->fwd_flow.dst_id)); | ||
| gr_table_cell(table, 9, "%lu", (now - conn->last_update) / 1000000); | ||
| gr_table_cell(table, 9, "%ld", (now - conn->last_update) / GR_NS_PER_S); |
There was a problem hiding this comment.
Clamp negative elapsed time in LAST_UPDATE display.
Line 58 can print negative age when conn->last_update is newer than the pre-captured now (concurrent updates). Clamp to zero before formatting.
Proposed fix
- gr_table_cell(table, 9, "%ld", (now - conn->last_update) / GR_NS_PER_S);
+ gr_clock_ns_t elapsed = now - conn->last_update;
+ if (elapsed < 0)
+ elapsed = 0;
+ gr_table_cell(table, 9, "%ld", elapsed / GR_NS_PER_S);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| gr_table_cell(table, 9, "%ld", (now - conn->last_update) / GR_NS_PER_S); | |
| gr_clock_ns_t elapsed = now - conn->last_update; | |
| if (elapsed < 0) | |
| elapsed = 0; | |
| gr_table_cell(table, 9, "%ld", elapsed / GR_NS_PER_S); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@modules/policy/cli/conntrack.c` at line 58, Clamp the computed elapsed time
before formatting: compute a signed/unsigned elapsed variable as (now -
conn->last_update), check if elapsed is negative and set it to 0 if so, then
pass (elapsed / GR_NS_PER_S) to gr_table_cell instead of the raw (now -
conn->last_update); update the call site referenced by gr_table_cell and the
conn->last_update/now calculation to use this clamped elapsed value so
LAST_UPDATE never displays negative seconds.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
modules/ip/control/icmp.c (1)
176-177:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winDo not free queued mbufs in
icmp_fini; this can double-free shared pool-owned packets.The fini drain should release queue items only, not mbufs, per this module’s ownership contract.
Proposed fix
static void icmp_fini(struct event_base *) { if (pool != NULL) { struct icmp_queue_item *i, *tmp; STAILQ_FOREACH_SAFE (i, &icmp_queue, next, tmp) - icmp_queue_pop(i, true); + icmp_queue_pop(i, false); rte_mempool_free(pool); pool = NULL; } }Based on learnings: "In modules that manage ICMP packet queues (e.g., icmp.c and icmp6.c), do not manually free mbufs from icmp_queue in fini functions if those mbufs are owned by other pktmbuf pools that are freed elsewhere."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@modules/ip/control/icmp.c` around lines 176 - 177, The fini path is freeing mbufs from icmp_queue which can double-free pool-owned packets; in icmp_fini replace the draining loop that calls STAILQ_FOREACH_SAFE and icmp_queue_pop(i, true) with a call that only releases queue entries without freeing the underlying mbuf (e.g., call icmp_queue_pop(i, false) or use an existing/pop-only helper), so the queue items are removed but their mbufs are not freed here; ensure the change touches the icmp_fini loop and any call sites referencing icmp_queue_pop to preserve the module's ownership contract.modules/ip6/control/icmp6.c (1)
164-166:⚠️ Potential issue | 🔴 Critical | ⚡ Quick win
icmp_finishould not free queued mbufs; this risks double free under shared mbuf ownership.Drain the queue items without freeing mbufs in fini.
Proposed fix
static void icmp_fini(struct event_base *) { if (pool != NULL) { struct icmp_queue_item *i, *tmp; STAILQ_FOREACH_SAFE (i, &icmp_queue, next, tmp) - icmp6_queue_pop(i, true); + icmp6_queue_pop(i, false); rte_mempool_free(pool); pool = NULL; } }Based on learnings: "In modules that manage ICMP packet queues (e.g., icmp.c and icmp6.c), do not manually free mbufs from icmp_queue in fini functions if those mbufs are owned by other pktmbuf pools that are freed elsewhere."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@modules/ip6/control/icmp6.c` around lines 164 - 166, In icmp_fini, stop freeing mbufs owned elsewhere when draining icmp_queue: instead of calling icmp6_queue_pop(i, true) (which frees mbufs), remove/unwrap queue entries without freeing their mbufs (e.g., call icmp6_queue_pop(i, false) or use a removal routine that only unlinks entries) so the mbufs remain owned by their pktmbuf pools; leave the rte_mempool_free(pool) call as-is for the pool cleanup. Ensure you only unlink entries from icmp_queue and do not call any mbuf free paths in icmp6_queue_pop or related helpers during fini.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@modules/infra/control/lacp.c`:
- Around line 176-178: The ternary is evaluated before the addition, so now is
not added to the chosen timeout; change the assignment to add now to the entire
conditional result by grouping the ternary expression—i.e., compute
member->next_tx by adding now to ((member->remote.state & LACP_STATE_FAST) ?
LACP_SHORT_TIMEOUT * GR_NS_PER_S : LACP_LONG_TIMEOUT * GR_NS_PER_S) so that
member->next_tx, member->remote.state, LACP_STATE_FAST, LACP_SHORT_TIMEOUT,
LACP_LONG_TIMEOUT, GR_NS_PER_S and now are used correctly.
---
Outside diff comments:
In `@modules/ip/control/icmp.c`:
- Around line 176-177: The fini path is freeing mbufs from icmp_queue which can
double-free pool-owned packets; in icmp_fini replace the draining loop that
calls STAILQ_FOREACH_SAFE and icmp_queue_pop(i, true) with a call that only
releases queue entries without freeing the underlying mbuf (e.g., call
icmp_queue_pop(i, false) or use an existing/pop-only helper), so the queue items
are removed but their mbufs are not freed here; ensure the change touches the
icmp_fini loop and any call sites referencing icmp_queue_pop to preserve the
module's ownership contract.
In `@modules/ip6/control/icmp6.c`:
- Around line 164-166: In icmp_fini, stop freeing mbufs owned elsewhere when
draining icmp_queue: instead of calling icmp6_queue_pop(i, true) (which frees
mbufs), remove/unwrap queue entries without freeing their mbufs (e.g., call
icmp6_queue_pop(i, false) or use a removal routine that only unlinks entries) so
the mbufs remain owned by their pktmbuf pools; leave the rte_mempool_free(pool)
call as-is for the pool cleanup. Ensure you only unlink entries from icmp_queue
and do not call any mbuf free paths in icmp6_queue_pop or related helpers during
fini.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 175e0d13-a66f-4f1b-94cb-cd34122c209a
📒 Files selected for processing (26)
api/gr_clock.hcli/main.cmodules/infra/control/bond.hmodules/infra/control/l3_nexthop.cmodules/infra/control/lacp.cmodules/infra/control/nexthop.hmodules/ip/api/gr_ip4.hmodules/ip/control/icmp.cmodules/ip/control/nexthop.cmodules/ip/datapath/arp_output_request.cmodules/ip/datapath/icmp_input.cmodules/ip/datapath/icmp_local_send.cmodules/ip6/api/gr_ip6.hmodules/ip6/control/icmp6.cmodules/ip6/control/nexthop.cmodules/ip6/datapath/icmp6_input.cmodules/ip6/datapath/icmp6_local_send.cmodules/ip6/datapath/ndp_ns_output.cmodules/l2/api/gr_l2.hmodules/l2/cli/fdb.cmodules/l2/control/fdb.cmodules/policy/api/gr_conntrack.hmodules/policy/cli/conntrack.cmodules/policy/control/conntrack.cmodules/policy/control/conntrack.hsmoke/fib_inject.c
✅ Files skipped from review due to trivial changes (1)
- cli/main.c
The clock_t type is implementation specific. E.g. on Linux, it is microseconds, and on Windows it is milliseconds. Replace all uses of clock_t and its accompanying function, gr_clock_us(), with a new nanosecond resolution clock type, gr_clock_ns_t, and its accompanying function, gr_clock_ns(). Note about resolution and size of the new clock type: A clock with nanosecond resolution can be used for purposes not possible with microsecond resolution, e.g. shaping and pacing. For reference, the duration of a minimum size (84 byte) packet on a 100 Gbit/s Ethernet link is 6.72 ns. The size of the new type is 64 bit, the same size as the type it replaces (on supported implementations). Changing the resolution to nanoseconds thus does not impact the size of data structures where it is used. Note about signedness of the new clock type: I considered making the gr_clock_ns_t unsigned, but that would require additional considerations in code where it is used, e.g. for calculating age, to prevent wraparound in race conditions. E.g.: age = (gr_clock_ns() - fdb->last_seen) / GR_NS_PER_S; If the current thread reads the clock using gr_clock_ns(), and another thread races to set fdb->last_seen afterwards, the result of the subtraction is negative. The division by GR_NS_PER_S makes the age zero. If gr_clock_ns_t was unsigned, the negative result of the subtraction would be a very large unsigned number. Dividing this very large number by GR_NS_PER_S would be a large unsigned number, not zero. Obviously, this could be fixed by type casting gr_clock_ns_t values to signed int64_t everywhere they are used with subtraction. But such a requirement increases the risk of bugs. So I decided to make it signed, like time_t. Signed-off-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Robin Jarry <rjarry@redhat.com>
The clock_t type is implementation specific. E.g. on Linux, it is microseconds, and on Windows it is milliseconds.
Replace all uses of clock_t and its accompanying function, gr_clock_us(), with a new nanosecond resolution clock type, gr_clock_ns_t, and its accompanying function, gr_clock_ns().
Note about resolution and size of the new clock type:
A clock with nanosecond resolution can be used for purposes not possible with microsecond resolution, e.g. shaping and pacing. For reference, the duration of a minimum size (84 byte) packet on a 100 Gbit/s Ethernet link is 6.72 ns.
The size of the new type is 64 bit, the same size as the type it replaces (on supported implementations). Changing the resolution to nanoseconds thus does not impact the size of data structures where it is used.
Note about signedness of the new clock type:
I considered making the gr_clock_ns_t unsigned, but that would require additional considerations in code where it is used, e.g. for calculating age, to prevent wraparound in race conditions. E.g.:
If the current thread reads the clock using gr_clock_ns(), and another thread races to set fdb->last_seen afterwards, the result of the subtraction is negative. The division by GR_NS_PER_S makes the age zero. If gr_clock_ns_t was unsigned, the negative result of the subtraction would be a very large unsigned number. Dividing this very large number by NS_PER_S would be a large unsigned number, not zero.
Obviously, this could be fixed by type casting gr_clock_ns_t values to signed int64_t everywhere they are used with subtraction. But such a requirement increases the risk of bugs. So I decided to make it signed, like time_t.
API clock type migration to nanosecond resolution
Core API changes
#defineGR_NS_PER_S (gr_clock_ns_t)1000000000LLType and field migrations (clock_t → gr_clock_ns_t)
Timing computation pattern
Rationale documented in commit metadata (kept literal from code/commit)
Other notes