Skip to content

Conversation

@cvinayak
Copy link
Contributor

nRF Connect SDK v2.6.4 NCSIDB-1718 cherry-picks.

Tronil and others added 5 commits October 21, 2025 10:20
Fix l2cap error handling generally not properly disposing of tx buffers for
enhanced channels; Any callbacks have to be called and the
l2cap_tx_meta_data has to be freed

Signed-off-by: Troels Nilsson <[email protected]>
(cherry picked from commit f0032a3)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
It seems like a nice idea at first, but leads to hard-to-debug
situations for the application.

The previous behavior can be implemented by the app by defining
`alloc_seg` and allocating from the same pool as `buf`.

Signed-off-by: Jonathan Rico <[email protected]>
(cherry picked from commit 75c2aeb)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
…annel

We could start executing the work item after the channel has been
disconnected or destroyed, due to a race condition.

Double-check we are connected before attempting to send data.

Signed-off-by: Jonathan Rico <[email protected]>
(cherry picked from commit e436441)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
net_buf_alloc(K_FOREVER) can now fail (if run from the syswq). Propagate to
the caller instead of asserting.

Signed-off-by: Jonathan Rico <[email protected]>
(cherry picked from commit aa30d07)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
If bt_conn_set_state(conn, BT_CONN_DISCONNECTED) is called
while the connection is already disconnected, this triggers
a warning. This is likely to happen when bt_conn_cleanup_all
is called as part of bt_disable.

Added the state check to avoid unnecessary warnings in the log.

Signed-off-by: Emil Gydesen <[email protected]>
(cherry picked from commit 4a97746)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
@cvinayak cvinayak force-pushed the github_v3_5_99_ncs1_branch_ncsidb_1718 branch from b258142 to 545f89f Compare October 21, 2025 08:24
jori-nordic and others added 12 commits October 21, 2025 12:03
This function call frees the buffer kept by the host for reassembling L2CAP
PDUs into.

Without this call, the current buffer will eventually be
leaked, leading to a non-functional host due to lack of RX buffers.

The effect is worse when host flow control is not enabled, as the RX
buffer pool is shared with events, which means communication with the
controller is essentially dead.

Signed-off-by: Jonathan Rico <[email protected]>
(cherry picked from commit 8a2fe27)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
When disconnected only the first empty slot in the disconnected_handles
array should be updated.

Signed-off-by: Jens Rehhoff Thomsen <[email protected]>
(cherry picked from commit b478ffe)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
Use actual user_data size not default by 8.

Signed-off-by: Lingao Meng <[email protected]>
(cherry picked from commit 0ddb6aa)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
`bt_buf_get_cmd_complete` is broken due to
zephyrproject-rtos/zephyr#64158, and fixing it
would require changing its signature and put even more complexity into
the HCI drivers, as it would require the drivers to perform an even
deeper peek into the event in order to observe the opcode.

Instead of the above, this patch removes the use of
`bt_buf_get_cmd_complete` and adds logic to allow the host to accept
command complete events in normal event buffers.

The above means performing a copy into the destination buffer, which is
the original command buffer. This is a small inefficiency for now, but
we should strive to redesign the host into a streaming architecture as
much as possible and handle events immediately instead of retaining
buffers.

This fixes zephyrproject-rtos/zephyr#64158:
Like all command completed events, the completion event for
`BT_HCI_OP_HOST_NUM_COMPLETED_PACKETS` is now placed in normal event
buffers. The the logic where the host discards this event is already
present. Since it's discarded, it will not interfere with the logic
around `bt_dev.cmd_send`.

Signed-off-by: Aleksander Wasaznik <[email protected]>
(cherry picked from commit 1cb83a8)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
After the previous commit, this function no longer has any users.

Signed-off-by: Aleksander Wasaznik <[email protected]>
(cherry picked from commit b6a1051)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
This is purely a syntactical refactor.

Signed-off-by: Aleksander Wasaznik <[email protected]>
(cherry picked from commit 9426309)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
Add 2 new Kconfig promptless options that are shorthand
for whether the ISO configuration can support RX and TX.

This also applies these new options as guards for existing
and missing code pieces.

Signed-off-by: Emil Gydesen <[email protected]>
(cherry picked from commit 9553347)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
Refactor only. The surrounding ifdefs are intentionally not changed in
this patch. They will be in the near future.

Rename the pool and generalize the documentation to allow using this
pool for other events that fit the same criteria. This pool can be used
for any buffer that is processed synchronously, without negatively
affecting 'num complete' messages. E.g. 'cmd complete/status' can be put
in this pool already.

We will be working towards making the host process all event buffers
synchronously. This is because events have no dedicated flow control,
and discarding events in the driver without informing the host creates
problems. Discarding should instead happen in the host higher layers
when unavoidable.

Signed-off-by: Aleksander Wasaznik <[email protected]>
(cherry picked from commit 2a7adae)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
…_pool

`struct acl_data` is used even when Host flow control is not enabled.
It is written to through the `acl(buf)` accessor in `conn.c:hci_acl()`.

Hopefully no netbufs were harmed by that :/

Signed-off-by: Jonathan Rico <[email protected]>
(cherry picked from commit 792ae68)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
…f pool

Why is it ok to use the sync pool?

Because command complete/status is processed in prio: that means on the
same stack as the `bt_recv()` call from the driver.

Why does it fix the issue?

Because the complete/status event goes into a pool that is guaranteed to
have one free buffer any time `bt_recv()` is not executing.

Since the driver is the one calling bt_recv(), it (hopefully) will
finish one `bt_recv()` before starting another one.

Fixes #78223

Co-authored-by: Aleksander Wasaznik <[email protected]>
Signed-off-by: Aleksander Wasaznik <[email protected]>
Signed-off-by: Jonathan Rico <[email protected]>
(cherry picked from commit 6d5cce6)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
Because the number of ACL RX buffers must be at least the number of
maximum connections plus one, increasing `CONFIG_BT_MAX_CONN` could
inadvertently lead to a build failure if the number of ACL RX buffers is
not also increased. This dependency may not be obvious to users.

To address this issue, this commit deprecates the
`CONFIG_BT_BUF_RX_COUNT` Kconfig symbol and computes the value in
`buf.h` using the new `BT_BUF_RX_COUNT` define. Note that the default
value and the minimum range value have been changed to 0 to "disable"
the option.

Additionally, to allow users to increase the number of ACL RX buffers,
this commit introduces the new `CONFIG_BT_BUF_RX_COUNT_EXTRA` Kconfig
symbol. The value of this symbol will be added to the computed value of
`BT_BUF_RX_COUNT`.

The configurations of tests and samples have been updated to reflect
these changes.

Signed-off-by: Théo Battrel <[email protected]>
(cherry picked from commit 66ff97e)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
…on failure

Fix HCI command buffer allocation failure, that can cause
loss of Host Number of Completed Packets command.

Fail by rejecting the HCI Host Buffer Size command if the
required number of HCI command buffers are not allocated in
the Controller implementation.

When Controller to Host data flow control is supported in
the Controller only build, ensure that BT_BUF_CMD_TX_COUNT
is greater than or equal to (BT_BUF_RX_COUNT + Ncmd),
where Ncmd is supported maximum Num_HCI_Command_Packets in
the Controller implementation.

Relates to commit 8161430 ("Bluetooth: Add workaround
for no command buffer available")'.

Relates to commit 297f4f4 ("Bluetooth: Split HCI
command & event buffers to two pools").

Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
(cherry picked from commit d382fca)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
@cvinayak cvinayak force-pushed the github_v3_5_99_ncs1_branch_ncsidb_1718 branch from 545f89f to fd7884b Compare October 21, 2025 10:07
Fix use of HAS_BT_CTLR with BT_CTLR due to previous cherry pick
commit.

Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
PavelVPV and others added 3 commits October 21, 2025 12:39
…pool

The Bluetooth data buffer API currently lacks a mechanism to notify when
a buffer is freed in the RX pool. This limitation forces HCI drivers to
adopt inefficient workarounds to manage buffer allocation.

HCI drivers face two suboptimal options:

- Blocking calls: Use bt_buf_get_rx with K_FOREVER, which blocks the
  execution context until a buffer becomes available.
- Polling: Repeatedly call bt_buf_get_rx with K_NO_WAIT, which increases
  CPU load and reduces efficiency.

This commit introduces a callback mechanism that is triggered each time
a buffer is freed in the RX pool. With this feature, HCI drivers can:

- Call bt_buf_get_rx with K_NO_WAIT.
- Wait for the callback notification if a NULL buffer is returned,
  avoiding unnecessary polling.

The new callback improves efficiency by enabling event-driven behavior
for buffer management, reducing CPU overhead while maintaining
responsiveness.

Signed-off-by: Pavel Vasilyev <[email protected]>
(cherry picked from commit c2488fd)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
…em workqueue

The buffer allocation in conn.c will trigger warnings if we try to use
anything else than K_NO_WAIT for the timeout when called from within the
system workqueue.

The calls in l2cap.c and att.c which may pass non-zero timeouts already
have proper handling for failed allocations, so make sure we use K_NO_WAIT
to avoid unnecessary warnings from conn.c.

Signed-off-by: Johan Hedberg <[email protected]>
(cherry picked from commit 05b16b9)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
… BT RX thread

This commit alignes the timeout value for allocating buffers within att
on the BT RX thread, making it consistent within att.c, see
bt_att_req_alloc.

We are inferring in many bt_gatt_* functions that if called from a BT RX
thread (which is inherently the case if called from a callback when
running a Bluetooth application), we don't block and instead return
-ENOMEM when the ATT request queue is full, avoiding a deadlock.
This promise is fulfilled within bt_att_req_alloc, where the timeout for
allocation of the request slab is set to K_NO_WAIT if we are on the BT
RX thread. Unfortunately, we break this promise in
bt_att_chan_create_pdu, where the timeout for allocation of the att pool
is still K_FOREVER and deadlocks can (and do) occur when too many
requests are sent yet the pool is depleted.

Note: Both req_slab and att_pool sizes are defined by
CONFIG_BT_ATT_TX_COUNT. If applications start getting -ENOMEM with this
change, they were at risk of such a deadlock, and may increase
CONFIG_BT_ATT_TX_COUNT to allocate the att pool for their requests.

Note: This possible deadlock has been flying under the radar, as
att_pools are freed when the HCI driver has sent it to the controller
(instead of when receiving the response, as it happens with req_slabs)
and due to the att_pool and the req_slab being both sized by
CONFIG_BT_ATT_TX_COUNT, and req_slab being allocated before and
returning -ENOMEM already if there is no space, it takes a more specific
situation to deplete the att_pool but not the req_slab pool at this
point.

Note: Ideally, we don't want functions to behave differently depending
on which thread they are running, and while this commit makes it more
consistent, it should be considered a workaround solution.

Signed-off-by: Kyra Lengfeld <[email protected]>
(cherry picked from commit 6464ffa)
Signed-off-by: Vinayak Kariappa Chettimada <[email protected]>
@cvinayak cvinayak force-pushed the github_v3_5_99_ncs1_branch_ncsidb_1718 branch from fd7884b to cc62268 Compare October 21, 2025 10:40
@alwa-nordic
Copy link
Contributor

Instructions for Copilot reviewer: These are cherry-picks. Do NOT suggest edits. Process mandates that the cherry-picks auto merge with no manual resolutions or edits. DO evaluate whether the automatic merge is semantically correct, and if it's not, which parent commits from the source need to be added to avoid an error.

@alwa-nordic alwa-nordic requested a review from Copilot October 22, 2025 08:15
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request implements NCSIDB-1718 cherry-picks for nRF Connect SDK v2.6.4, primarily focused on improving HCI buffer management and command flow control in the Bluetooth subsystem. The changes address buffer allocation issues, particularly around the CONFIG_BT_BUF_CMD_TX_COUNT workaround that was previously needed for Host Number of Completed Packets commands.

Key changes include:

  • Introduction of dynamic HCI command buffer calculation based on flow control configuration
  • Removal of deprecated bt_buf_get_cmd_complete() API and associated workarounds
  • Addition of buffer freed callbacks for better resource management
  • Refactoring of ISO TX/RX configuration with new CONFIG_BT_ISO_TX and CONFIG_BT_ISO_RX options

Reviewed Changes

Copilot reviewed 48 out of 48 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
subsys/bluetooth/common/hci_common_internal.h New header defining HCI buffer count calculations and flow control logic
subsys/bluetooth/common/Kconfig Updated buffer configuration options with new BT_BUF_ACL_RX_COUNT_EXTRA and improved BT_BUF_CMD_TX_COUNT documentation
subsys/bluetooth/host/buf.c Refactored buffer allocation with freed callbacks and removed bt_buf_get_cmd_complete()
subsys/bluetooth/host/hci_core.c Updated command completion handling to use original command buffers for responses
subsys/bluetooth/host/l2cap.c Added l2cap_tx_buf_destroy() helper and improved error handling with callback invocation
subsys/bluetooth/host/iso.c Separated ISO TX/RX functionality with new config guards and buffer freed callbacks
subsys/bluetooth/controller/hci/hci.c Enhanced Host Buffer Size command validation and flow control checks
Multiple sample/test configs Removed CONFIG_BT_BUF_CMD_TX_COUNT=10 workaround entries

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@alwa-nordic
Copy link
Contributor

Doesn't look like copilot reads PR comments.

@carlescufi carlescufi merged commit c1c2532 into nrfconnect:v3.5.99-ncs1-branch Oct 23, 2025
11 of 12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.