-
Notifications
You must be signed in to change notification settings - Fork 19.5k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
kernel: backport fix for a page pool related race condition
Signed-off-by: Felix Fietkau <[email protected]>
- Loading branch information
Showing
2 changed files
with
170 additions
and
0 deletions.
There are no files selected for viewing
85 changes: 85 additions & 0 deletions
85
...generic/backport-5.10/633-v6.3-skbuff-Fix-a-race-between-coalescing-and-releasing-S.patch
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,85 @@ | ||
From: Liang Chen <[email protected]> | ||
Date: Thu, 13 Apr 2023 17:03:53 +0800 | ||
Subject: [PATCH] skbuff: Fix a race between coalescing and releasing SKBs | ||
|
||
Commit 1effe8ca4e34 ("skbuff: fix coalescing for page_pool fragment | ||
recycling") allowed coalescing to proceed with non page pool page and page | ||
pool page when @from is cloned, i.e. | ||
|
||
to->pp_recycle --> false | ||
from->pp_recycle --> true | ||
skb_cloned(from) --> true | ||
|
||
However, it actually requires skb_cloned(@from) to hold true until | ||
coalescing finishes in this situation. If the other cloned SKB is | ||
released while the merging is in process, from_shinfo->nr_frags will be | ||
set to 0 toward the end of the function, causing the increment of frag | ||
page _refcount to be unexpectedly skipped resulting in inconsistent | ||
reference counts. Later when SKB(@to) is released, it frees the page | ||
directly even though the page pool page is still in use, leading to | ||
use-after-free or double-free errors. So it should be prohibited. | ||
|
||
The double-free error message below prompted us to investigate: | ||
BUG: Bad page state in process swapper/1 pfn:0e0d1 | ||
page:00000000c6548b28 refcount:-1 mapcount:0 mapping:0000000000000000 | ||
index:0x2 pfn:0xe0d1 | ||
flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff) | ||
raw: 000fffffc0000000 0000000000000000 ffffffff00000101 0000000000000000 | ||
raw: 0000000000000002 0000000000000000 ffffffffffffffff 0000000000000000 | ||
page dumped because: nonzero _refcount | ||
|
||
CPU: 1 PID: 0 Comm: swapper/1 Tainted: G E 6.2.0+ | ||
Call Trace: | ||
<IRQ> | ||
dump_stack_lvl+0x32/0x50 | ||
bad_page+0x69/0xf0 | ||
free_pcp_prepare+0x260/0x2f0 | ||
free_unref_page+0x20/0x1c0 | ||
skb_release_data+0x10b/0x1a0 | ||
napi_consume_skb+0x56/0x150 | ||
net_rx_action+0xf0/0x350 | ||
? __napi_schedule+0x79/0x90 | ||
__do_softirq+0xc8/0x2b1 | ||
__irq_exit_rcu+0xb9/0xf0 | ||
common_interrupt+0x82/0xa0 | ||
</IRQ> | ||
<TASK> | ||
asm_common_interrupt+0x22/0x40 | ||
RIP: 0010:default_idle+0xb/0x20 | ||
|
||
Fixes: 53e0961da1c7 ("page_pool: add frag page recycling support in page pool") | ||
Signed-off-by: Liang Chen <[email protected]> | ||
Reviewed-by: Eric Dumazet <[email protected]> | ||
Link: https://lore.kernel.org/r/[email protected] | ||
Signed-off-by: Jakub Kicinski <[email protected]> | ||
--- | ||
|
||
--- a/net/core/skbuff.c | ||
+++ b/net/core/skbuff.c | ||
@@ -5208,18 +5208,18 @@ bool skb_try_coalesce(struct sk_buff *to | ||
if (skb_cloned(to)) | ||
return false; | ||
|
||
- /* In general, avoid mixing slab allocated and page_pool allocated | ||
- * pages within the same SKB. However when @to is not pp_recycle and | ||
- * @from is cloned, we can transition frag pages from page_pool to | ||
- * reference counted. | ||
- * | ||
- * On the other hand, don't allow coalescing two pp_recycle SKBs if | ||
- * @from is cloned, in case the SKB is using page_pool fragment | ||
+ /* In general, avoid mixing page_pool and non-page_pool allocated | ||
+ * pages within the same SKB. Additionally avoid dealing with clones | ||
+ * with page_pool pages, in case the SKB is using page_pool fragment | ||
* references (PP_FLAG_PAGE_FRAG). Since we only take full page | ||
* references for cloned SKBs at the moment that would result in | ||
* inconsistent reference counts. | ||
+ * In theory we could take full references if @from is cloned and | ||
+ * !@to->pp_recycle but its tricky (due to potential race with | ||
+ * the clone disappearing) and rare, so not worth dealing with. | ||
*/ | ||
- if (to->pp_recycle != (from->pp_recycle && !skb_cloned(from))) | ||
+ if (to->pp_recycle != from->pp_recycle || | ||
+ (from->pp_recycle && skb_cloned(from))) | ||
return false; | ||
|
||
if (len <= skb_tailroom(to)) { |
85 changes: 85 additions & 0 deletions
85
...generic/backport-5.15/612-v6.3-skbuff-Fix-a-race-between-coalescing-and-releasing-S.patch
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,85 @@ | ||
From: Liang Chen <[email protected]> | ||
Date: Thu, 13 Apr 2023 17:03:53 +0800 | ||
Subject: [PATCH] skbuff: Fix a race between coalescing and releasing SKBs | ||
|
||
Commit 1effe8ca4e34 ("skbuff: fix coalescing for page_pool fragment | ||
recycling") allowed coalescing to proceed with non page pool page and page | ||
pool page when @from is cloned, i.e. | ||
|
||
to->pp_recycle --> false | ||
from->pp_recycle --> true | ||
skb_cloned(from) --> true | ||
|
||
However, it actually requires skb_cloned(@from) to hold true until | ||
coalescing finishes in this situation. If the other cloned SKB is | ||
released while the merging is in process, from_shinfo->nr_frags will be | ||
set to 0 toward the end of the function, causing the increment of frag | ||
page _refcount to be unexpectedly skipped resulting in inconsistent | ||
reference counts. Later when SKB(@to) is released, it frees the page | ||
directly even though the page pool page is still in use, leading to | ||
use-after-free or double-free errors. So it should be prohibited. | ||
|
||
The double-free error message below prompted us to investigate: | ||
BUG: Bad page state in process swapper/1 pfn:0e0d1 | ||
page:00000000c6548b28 refcount:-1 mapcount:0 mapping:0000000000000000 | ||
index:0x2 pfn:0xe0d1 | ||
flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff) | ||
raw: 000fffffc0000000 0000000000000000 ffffffff00000101 0000000000000000 | ||
raw: 0000000000000002 0000000000000000 ffffffffffffffff 0000000000000000 | ||
page dumped because: nonzero _refcount | ||
|
||
CPU: 1 PID: 0 Comm: swapper/1 Tainted: G E 6.2.0+ | ||
Call Trace: | ||
<IRQ> | ||
dump_stack_lvl+0x32/0x50 | ||
bad_page+0x69/0xf0 | ||
free_pcp_prepare+0x260/0x2f0 | ||
free_unref_page+0x20/0x1c0 | ||
skb_release_data+0x10b/0x1a0 | ||
napi_consume_skb+0x56/0x150 | ||
net_rx_action+0xf0/0x350 | ||
? __napi_schedule+0x79/0x90 | ||
__do_softirq+0xc8/0x2b1 | ||
__irq_exit_rcu+0xb9/0xf0 | ||
common_interrupt+0x82/0xa0 | ||
</IRQ> | ||
<TASK> | ||
asm_common_interrupt+0x22/0x40 | ||
RIP: 0010:default_idle+0xb/0x20 | ||
|
||
Fixes: 53e0961da1c7 ("page_pool: add frag page recycling support in page pool") | ||
Signed-off-by: Liang Chen <[email protected]> | ||
Reviewed-by: Eric Dumazet <[email protected]> | ||
Link: https://lore.kernel.org/r/[email protected] | ||
Signed-off-by: Jakub Kicinski <[email protected]> | ||
--- | ||
|
||
--- a/net/core/skbuff.c | ||
+++ b/net/core/skbuff.c | ||
@@ -5397,18 +5397,18 @@ bool skb_try_coalesce(struct sk_buff *to | ||
if (skb_cloned(to)) | ||
return false; | ||
|
||
- /* In general, avoid mixing slab allocated and page_pool allocated | ||
- * pages within the same SKB. However when @to is not pp_recycle and | ||
- * @from is cloned, we can transition frag pages from page_pool to | ||
- * reference counted. | ||
- * | ||
- * On the other hand, don't allow coalescing two pp_recycle SKBs if | ||
- * @from is cloned, in case the SKB is using page_pool fragment | ||
+ /* In general, avoid mixing page_pool and non-page_pool allocated | ||
+ * pages within the same SKB. Additionally avoid dealing with clones | ||
+ * with page_pool pages, in case the SKB is using page_pool fragment | ||
* references (PP_FLAG_PAGE_FRAG). Since we only take full page | ||
* references for cloned SKBs at the moment that would result in | ||
* inconsistent reference counts. | ||
+ * In theory we could take full references if @from is cloned and | ||
+ * !@to->pp_recycle but its tricky (due to potential race with | ||
+ * the clone disappearing) and rare, so not worth dealing with. | ||
*/ | ||
- if (to->pp_recycle != (from->pp_recycle && !skb_cloned(from))) | ||
+ if (to->pp_recycle != from->pp_recycle || | ||
+ (from->pp_recycle && skb_cloned(from))) | ||
return false; | ||
|
||
if (len <= skb_tailroom(to)) { |