-
Notifications
You must be signed in to change notification settings - Fork 126
Trigger CI #8764
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
jrife
wants to merge
7
commits into
kernel-patches:bpf-next_base
Choose a base branch
from
jrife:jrife/socket-iterators-udp
base: bpf-next_base
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Trigger CI #8764
jrife
wants to merge
7
commits into
kernel-patches:bpf-next_base
from
jrife:jrife/socket-iterators-udp
+583
−66
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
0bb6482
to
fc760de
Compare
7c1ba0b
to
7488891
Compare
7488891
to
7683167
Compare
305c236
to
40a39da
Compare
Prepare for the next two patches which need to be able to choose either GFP_USER or GFP_ATOMIC for calls to bpf_iter_udp_realloc_batch by making memory flags configurable. Signed-off-by: Jordan Rife <[email protected]>
14a0235
to
00d9d99
Compare
91398f8
to
b192eb3
Compare
Prepare for the next commit by moving iter->end_sk > 0 logic inside the main loop of bpf_iter_udp_batch. This makes it easier to see where spin_unlock_bh(&hslot2->lock) needs to happen and avoids changing the scope of hslot2. Signed-off-by: Jordan Rife <[email protected]>
Require that iter->batch always contains a full bucket snapshot. This invariant is important to avoid skipping or repeating sockets during iteration when combined with the next few patches. Before, there were two cases where a call to bpf_iter_udp_batch may only capture part of a bucket: 1. When bpf_iter_udp_realloc_batch() returns -ENOMEM [1]. 2. When more sockets are added to the bucket while calling bpf_iter_udp_realloc_batch(), making the updated batch size insufficient [2]. In cases where the batch size only covers part of a bucket, it is possible to forget which sockets were already visited, especially if we have to process a bucket in more than two batches. This forces us to choose between repeating or skipping sockets, so don't allow this: 1. Stop iteration and propagate -ENOMEM up to userspace if reallocation fails instead of continuing with a partial batch. 2. Retry bpf_iter_udp_realloc_batch() up to two times if we fail to capture the full bucket. On the third attempt, hold onto the bucket lock (hslot2->lock) through bpf_iter_udp_realloc_batch() with GFP_ATOMIC to guarantee that the bucket size doesn't change before our next attempt. Try with GFP_USER first to improve the chances that memory allocation succeeds; only use GFP_ATOMIC as a last resort. Testing all scenarios directly is a bit difficult, but I did some manual testing to exercise the code paths where GFP_ATOMIC is used and where where ERR_PTR(err) is returned to make sure there are no deadlocks. I used the realloc test case included later in this series to trigger a scenario where a realloc happens inside bpf_iter_udp_realloc_batch and made a small code tweak to force the first two realloc attempts to allocate a too-small buffer, thus requiring another attempt until the GFP_ATOMIC case is hit. Some printks showed three reallocs with the tests passing: Apr 16 00:08:32 crow kernel: go again (mem_flags=GFP_USER) Apr 16 00:08:32 crow kernel: go again (mem_flags=GFP_USER) Apr 16 00:08:32 crow kernel: go again (mem_flags=GFP_ATOMIC) Similarly, I forced bpf_iter_udp_realloc_batch to return -ENOMEM to ensure that iteration ends and that the read() in userspace fails. [1]: https://lore.kernel.org/bpf/CABi4-ogUtMrH8-NVB6W8Xg_F_KDLq=yy-yu-tKr2udXE2Mu1Lg@mail.gmail.com/ [2]: https://lore.kernel.org/bpf/[email protected]/ Signed-off-by: Jordan Rife <[email protected]> Suggested-by: Martin KaFai Lau <[email protected]>
Prepare for the next patch that tracks cookies between iterations by converting struct sock **batch to union bpf_udp_iter_batch_item *batch inside struct bpf_udp_iter_state. Signed-off-by: Jordan Rife <[email protected]> Reviewed-by: Kuniyuki Iwashima <[email protected]>
Replace the offset-based approach for tracking progress through a bucket in the UDP table with one based on socket cookies. Remember the cookies of unprocessed sockets from the last batch and use this list to pick up where we left off or, in the case that the next socket disappears between reads, find the first socket after that point that still exists in the bucket and resume from there. In order to make the control flow a bit easier to follow inside bpf_iter_udp_batch, introduce the udp_portaddr_for_each_entry_from macro and use this to split bucket processing into two stages: finding the starting point and adding items to the next batch. Originally, I implemented this patch inside a single udp_portaddr_for_each_entry loop, as it was before, but I found the resulting logic a bit messy. Overall, this version seems more readable. Signed-off-by: Jordan Rife <[email protected]>
Extend the iter_udp_soreuse and iter_tcp_soreuse programs to write the cookie of the current socket, so that we can track the identity of the sockets that the iterator has seen so far. Update the existing do_test function to account for this change to the iterator program output. At the same time, teach both programs to work with AF_INET as well. Signed-off-by: Jordan Rife <[email protected]>
Introduce a set of tests that exercise various bucket resume scenarios: * remove_seen resumes iteration after removing a socket from the bucket that we've already processed. Before, with the offset-based approach, this test would have skipped an unseen socket after resuming iteration. With the cookie-based approach, we now see all sockets exactly once. * remove_unseen exercises the condition where the next socket that we would have seen is removed from the bucket before we resume iteration. This tests the scenario where we need to scan past the first cookie in our remembered cookies list to find the socket from which to resume iteration. * remove_all exercises the condition where all sockets we remembered were removed from the bucket to make sure iteration terminates and returns no more results. * add_some exercises the condition where a few, but not enough to trigger a realloc, sockets are added to the head of the current bucket between reads. Before, with the offset-based approach, this test would have repeated sockets we've already seen. With the cookie-based approach, we now see all sockets exactly once. * force_realloc exercises the condition that we need to realloc the batch on a subsequent read, since more sockets than can be held in the current batch array were added to the current bucket. This exercies the logic inside bpf_iter_udp_realloc_batch that copies cookies into the new batch to make sure nothing is skipped or repeated. Signed-off-by: Jordan Rife <[email protected]>
b192eb3
to
bdd57e4
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
WIP. Trigger CI.