Skip to content

Trigger CI #8764

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: bpf-next_base
Choose a base branch
from

Conversation

jrife
Copy link
Contributor

@jrife jrife commented Apr 9, 2025

WIP. Trigger CI.

@kernel-patches-daemon-bpf kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch 5 times, most recently from 0bb6482 to fc760de Compare April 10, 2025 03:15
@jrife jrife force-pushed the jrife/socket-iterators-udp branch from 7c1ba0b to 7488891 Compare April 11, 2025 15:38
@jrife jrife closed this Apr 11, 2025
@jrife jrife reopened this Apr 11, 2025
@jrife jrife closed this Apr 11, 2025
@jrife jrife force-pushed the jrife/socket-iterators-udp branch from 7488891 to 7683167 Compare April 11, 2025 16:21
@jrife jrife reopened this Apr 11, 2025
@kernel-patches-daemon-bpf kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch 3 times, most recently from 305c236 to 40a39da Compare April 15, 2025 18:40
Prepare for the next two patches which need to be able to choose either
GFP_USER or GFP_ATOMIC for calls to bpf_iter_udp_realloc_batch by making
memory flags configurable.

Signed-off-by: Jordan Rife <[email protected]>
@kernel-patches-daemon-bpf kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch 2 times, most recently from 14a0235 to 00d9d99 Compare April 15, 2025 22:29
@jrife jrife force-pushed the jrife/socket-iterators-udp branch 2 times, most recently from 91398f8 to b192eb3 Compare April 16, 2025 00:56
jrife added 6 commits April 15, 2025 18:03
Prepare for the next commit by moving iter->end_sk > 0 logic inside the
main loop of bpf_iter_udp_batch. This makes it easier to see where
spin_unlock_bh(&hslot2->lock) needs to happen and avoids changing the
scope of hslot2.

Signed-off-by: Jordan Rife <[email protected]>
Require that iter->batch always contains a full bucket snapshot. This
invariant is important to avoid skipping or repeating sockets during
iteration when combined with the next few patches. Before, there were
two cases where a call to bpf_iter_udp_batch may only capture part of a
bucket:

1. When bpf_iter_udp_realloc_batch() returns -ENOMEM [1].
2. When more sockets are added to the bucket while calling
   bpf_iter_udp_realloc_batch(), making the updated batch size
   insufficient [2].

In cases where the batch size only covers part of a bucket, it is
possible to forget which sockets were already visited, especially if we
have to process a bucket in more than two batches. This forces us to
choose between repeating or skipping sockets, so don't allow this:

1. Stop iteration and propagate -ENOMEM up to userspace if reallocation
   fails instead of continuing with a partial batch.
2. Retry bpf_iter_udp_realloc_batch() up to two times if we fail to
   capture the full bucket. On the third attempt, hold onto the bucket
   lock (hslot2->lock) through bpf_iter_udp_realloc_batch() with
   GFP_ATOMIC to guarantee that the bucket size doesn't change before
   our next attempt. Try with GFP_USER first to improve the chances
   that memory allocation succeeds; only use GFP_ATOMIC as a last
   resort.

Testing all scenarios directly is a bit difficult, but I did some manual
testing to exercise the code paths where GFP_ATOMIC is used and where
where ERR_PTR(err) is returned to make sure there are no deadlocks. I
used the realloc test case included later in this series to trigger a
scenario where a realloc happens inside bpf_iter_udp_realloc_batch and
made a small code tweak to force the first two realloc attempts to
allocate a too-small buffer, thus requiring another attempt until the
GFP_ATOMIC case is hit. Some printks showed three reallocs with the
tests passing:

Apr 16 00:08:32 crow kernel: go again (mem_flags=GFP_USER)
Apr 16 00:08:32 crow kernel: go again (mem_flags=GFP_USER)
Apr 16 00:08:32 crow kernel: go again (mem_flags=GFP_ATOMIC)

Similarly, I forced bpf_iter_udp_realloc_batch to return -ENOMEM to
ensure that iteration ends and that the read() in userspace fails.

[1]: https://lore.kernel.org/bpf/CABi4-ogUtMrH8-NVB6W8Xg_F_KDLq=yy-yu-tKr2udXE2Mu1Lg@mail.gmail.com/
[2]: https://lore.kernel.org/bpf/[email protected]/

Signed-off-by: Jordan Rife <[email protected]>
Suggested-by: Martin KaFai Lau <[email protected]>
Prepare for the next patch that tracks cookies between iterations by
converting struct sock **batch to union bpf_udp_iter_batch_item *batch
inside struct bpf_udp_iter_state.

Signed-off-by: Jordan Rife <[email protected]>
Reviewed-by: Kuniyuki Iwashima <[email protected]>
Replace the offset-based approach for tracking progress through a bucket
in the UDP table with one based on socket cookies. Remember the cookies
of unprocessed sockets from the last batch and use this list to
pick up where we left off or, in the case that the next socket
disappears between reads, find the first socket after that point that
still exists in the bucket and resume from there.

In order to make the control flow a bit easier to follow inside
bpf_iter_udp_batch, introduce the udp_portaddr_for_each_entry_from macro
and use this to split bucket processing into two stages: finding the
starting point and adding items to the next batch. Originally, I
implemented this patch inside a single udp_portaddr_for_each_entry loop,
as it was before, but I found the resulting logic a bit messy. Overall,
this version seems more readable.

Signed-off-by: Jordan Rife <[email protected]>
Extend the iter_udp_soreuse and iter_tcp_soreuse programs to write the
cookie of the current socket, so that we can track the identity of the
sockets that the iterator has seen so far. Update the existing do_test
function to account for this change to the iterator program output. At
the same time, teach both programs to work with AF_INET as well.

Signed-off-by: Jordan Rife <[email protected]>
Introduce a set of tests that exercise various bucket resume scenarios:

* remove_seen resumes iteration after removing a socket from the bucket
  that we've already processed. Before, with the offset-based approach,
  this test would have skipped an unseen socket after resuming
  iteration. With the cookie-based approach, we now see all sockets
  exactly once.
* remove_unseen exercises the condition where the next socket that we
  would have seen is removed from the bucket before we resume iteration.
  This tests the scenario where we need to scan past the first cookie in
  our remembered cookies list to find the socket from which to resume
  iteration.
* remove_all exercises the condition where all sockets we remembered
  were removed from the bucket to make sure iteration terminates and
  returns no more results.
* add_some exercises the condition where a few, but not enough to
  trigger a realloc, sockets are added to the head of the current bucket
  between reads. Before, with the offset-based approach, this test would
  have repeated sockets we've already seen. With the cookie-based
  approach, we now see all sockets exactly once.
* force_realloc exercises the condition that we need to realloc the
  batch on a subsequent read, since more sockets than can be held in the
  current batch array were added to the current bucket. This exercies
  the logic inside bpf_iter_udp_realloc_batch that copies cookies into
  the new batch to make sure nothing is skipped or repeated.

Signed-off-by: Jordan Rife <[email protected]>
@jrife jrife force-pushed the jrife/socket-iterators-udp branch from b192eb3 to bdd57e4 Compare April 16, 2025 01:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant