[Nagara][PDX224] Vendor Bluetooth service crash due to stack corruption in PowerManager::GetAndLogPwrRsrcStates #797

Open
faenil opened this issue Apr 5, 2023 · 3 comments

faenil commented Apr 5, 2023

Platform: Nagara
Device: pdx224 (Xperia 5 IV)
Kernel version: 5.10 from android-13.0.0_r30 branch (commit: a4a72efafecae951d7faa8e2946bdbc4f2f826bb)
Android version: android-13.0.0_r30, repo-update'd
Software binaries version: 64.0.H.4.18 + SW_binaries_for_Xperia_Android_13_5.10_v1b_nagara.zip

The target is the -eng variant.

Previously working on
n/a

Description
Bluetooth service fails to initialise and crashes in a loop. Log excerpt below:

02-24 00:22:20.265  2153  2224 E [email protected]_handler: InitTimeOut: SoC Initialization stuck detected
02-24 00:22:20.265  2153  2224 I [email protected]: BtPrimaryCrashReason:Init failed
02-24 00:22:20.265  2153  2224 I [email protected]: BtSecondaryCrashReason:SetBaudRateStuck
02-24 00:22:20.265  2153  2224 I [email protected]: TS for SoC Crash:Fri Feb 24 00:22:20 2023
02-24 00:22:20.265  2153  2224 E [email protected]_controller: LogPwrSrcsUartFlowCtrl: Captured UART CTS: 1
02-24 00:22:20.265  2153  2224 I [email protected]_manager: GetAndLogPwrRsrcStates
...
02-24 00:22:20.394  2616  2616 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
02-24 00:22:20.395  2616  2616 F DEBUG   : Build fingerprint: 'Sony/aosp_xqcq54/pdx224:13/TQ1A.230205.002/eng.faenil.20230224.001727:eng/test-keys'
02-24 00:22:20.395  2616  2616 F DEBUG   : Revision: '0'
02-24 00:22:20.395  2616  2616 F DEBUG   : ABI: 'arm64'
02-24 00:22:20.395  2616  2616 F DEBUG   : Timestamp: 2023-02-24 00:22:20.359547911+0000
02-24 00:22:20.395  2616  2616 F DEBUG   : Process uptime: 0s
02-24 00:22:20.395  2616  2616 F DEBUG   : Cmdline: /vendor/bin/hw/[email protected]
02-24 00:22:20.395  2616  2616 F DEBUG   : pid: 2153, tid: 2224, name: POSIX timer 1  >>> /vendor/bin/hw/[email protected] <<<
02-24 00:22:20.395  2616  2616 F DEBUG   : uid: 1002
02-24 00:22:20.395  2616  2616 F DEBUG   : tagged_addr_ctrl: 0000000000000001 (PR_TAGGED_ADDR_ENABLE)
02-24 00:22:20.395  2616  2616 F DEBUG   : pac_enabled_keys: 000000000000000f (PR_PAC_APIAKEY, PR_PAC_APIBKEY, PR_PAC_APDAKEY, PR_PAC_APDBKEY)
02-24 00:22:20.395  2616  2616 F DEBUG   : signal 6 (SIGABRT), code -1 (SI_QUEUE), fault addr --------
02-24 00:22:20.395  2616  2616 F DEBUG   : Abort message: 'stack corruption detected (-fstack-protector)'
02-24 00:22:20.395  2616  2616 F DEBUG   :     x0  0000000000000000  x1  00000000000008b0  x2  0000000000000006  x3  000000721656c4e0
02-24 00:22:20.395  2616  2616 F DEBUG   :     x4  0000000000808080  x5  0000000000808080  x6  0000000000808080  x7  8080808080808080
02-24 00:22:20.395  2616  2616 F DEBUG   :     x8  00000000000000f0  x9  00000074abe489e0  x10 0000000000000001  x11 00000074abe86b14
02-24 00:22:20.395  2616  2616 F DEBUG   :     x12 0101010101010101  x13 000000007fffffff  x14 6f63206b63617473  x15 0000000000000030
02-24 00:22:20.395  2616  2616 F DEBUG   :     x16 00000074abeead50  x17 00000074abec89a0  x18 0000007214c92000  x19 0000000000000869
02-24 00:22:20.395  2616  2616 F DEBUG   :     x20 00000000000008b0  x21 00000000ffffffff  x22 0000000000000197  x23 b4000073abc9adf8
02-24 00:22:20.395  2616  2616 F DEBUG   :     x24 000000721656d000  x25 0000000000000400  x26 000000721656cff8  x27 00000000000fc000
02-24 00:22:20.395  2616  2616 F DEBUG   :     x28 00000000000fe000  x29 000000721656c560
02-24 00:22:20.395  2616  2616 F DEBUG   :     lr  00000074abe78718  sp  000000721656c4c0  pc  00000074abe78744  pst 0000000000001000
02-24 00:22:20.395  2616  2616 F DEBUG   : backtrace:
02-24 00:22:20.395  2616  2616 F DEBUG   :       #00 pc 0000000000051744  /apex/com.android.runtime/lib64/bionic/libc.so (abort+164) (BuildId: a233a8c1a23ee63f36f4b138880288f3)
02-24 00:22:20.395  2616  2616 F DEBUG   :       #01 pc 0000000000066318  /apex/com.android.runtime/lib64/bionic/libc.so (__stack_chk_fail+20) (BuildId: a233a8c1a23ee63f36f4b138880288f3)
02-24 00:22:20.395  2616  2616 F DEBUG   :       #02 pc 0000000000052d2c  /odm/lib64/hw/[email protected] (android::hardware::bluetooth::V1_0::implementation::PowerManager::GetAndLogPwrRsrcStates(char*)+1340) (BuildId: 4b8421211099f3c8719052db14b5da94)
02-24 00:22:20.395  2616  2616 F DEBUG   :       #03 pc 000000000003a1c8  /odm/lib64/hw/[email protected] (android::hardware::bluetooth::V1_0::implementation::UartController::LogPwrSrcsUartFlowCtrl()+256) (BuildId: 4b8421211099f3c8719052db14b5da94)
02-24 00:22:20.395  2616  2616 F DEBUG   :       #04 pc 000000000003a08c  /odm/lib64/hw/[email protected] (android::hardware::bluetooth::V1_0::implementation::UartController::SsrCleanup(android::hardware::bluetooth::V1_0::implementation::PrimaryReasonCode)+696) (BuildId: 4b8421211099f3c8719052db14b5da94)
02-24 00:22:20.395  2616  2616 F DEBUG   :       #05 pc 0000000000037310  /odm/lib64/hw/[email protected] (android::hardware::bluetooth::V1_0::implementation::DataHandler::InitTimeOut(sigval)+152) (BuildId: 4b8421211099f3c8719052db14b5da94)
02-24 00:22:20.395  2616  2616 F DEBUG   :       #06 pc 000000000005e230  /apex/com.android.runtime/lib64/bionic/libc.so (__timer_thread_start(void*)+136) (BuildId: a233a8c1a23ee63f36f4b138880288f3)
02-24 00:22:20.395  2616  2616 F DEBUG   :       #07 pc 00000000000b6090  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+204) (BuildId: a233a8c1a23ee63f36f4b138880288f3)
02-24 00:22:20.395  2616  2616 F DEBUG   :       #08 pc 0000000000052e68  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64) (BuildId: a233a8c1a23ee63f36f4b138880288f3)
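
When triaging a tombstone like the one above, it can help to pull the frames out of the logcat text and find the first frame that is not inside libc, since `abort` and `__stack_chk_fail` only report the problem. Below is a minimal Python sketch, assuming the frame layout shown in the excerpt. `parse_backtrace` and `first_frame_outside` are hypothetical helper names, not part of any Android tool, and the sample lines are copied verbatim from the log above.

```python
import re

# Matches debuggerd backtrace frames in the layout seen in the excerpt:
#   "#02 pc 0000000000052d2c  <module path> (<symbol>+<offset>) (BuildId: <hex>)"
# The module is matched lazily so that paths containing spaces still parse.
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+pc\s+(?P<pc>[0-9a-f]+)\s+(?P<module>.+?)"
    r"(?:\s+\((?P<symbol>.+)\+(?P<offset>\d+)\))?"
    r"(?:\s+\(BuildId: (?P<build_id>[0-9a-f]+)\))?\s*$"
)


def parse_backtrace(log_text):
    """Return one dict per backtrace frame found in log_text, in order."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append(match.groupdict())
    return frames


def first_frame_outside(frames, module_suffix="libc.so"):
    """Return the first frame whose module does not end with module_suffix."""
    for frame in frames:
        if not frame["module"].endswith(module_suffix):
            return frame
    return None


# Sample input: the first four frames from the log excerpt, unchanged.
SAMPLE = """\
02-24 00:22:20.395  2616  2616 F DEBUG   :       #00 pc 0000000000051744  /apex/com.android.runtime/lib64/bionic/libc.so (abort+164) (BuildId: a233a8c1a23ee63f36f4b138880288f3)
02-24 00:22:20.395  2616  2616 F DEBUG   :       #01 pc 0000000000066318  /apex/com.android.runtime/lib64/bionic/libc.so (__stack_chk_fail+20) (BuildId: a233a8c1a23ee63f36f4b138880288f3)
02-24 00:22:20.395  2616  2616 F DEBUG   :       #02 pc 0000000000052d2c  /odm/lib64/hw/[email protected] (android::hardware::bluetooth::V1_0::implementation::PowerManager::GetAndLogPwrRsrcStates(char*)+1340) (BuildId: 4b8421211099f3c8719052db14b5da94)
02-24 00:22:20.395  2616  2616 F DEBUG   :       #03 pc 000000000003a1c8  /odm/lib64/hw/[email protected] (android::hardware::bluetooth::V1_0::implementation::UartController::LogPwrSrcsUartFlowCtrl()+256) (BuildId: 4b8421211099f3c8719052db14b5da94)
"""

if __name__ == "__main__":
    culprit = first_frame_outside(parse_backtrace(SAMPLE))
    if culprit is not None:
        print(culprit["index"], culprit["pc"], culprit["symbol"])
```

On the excerpt, the first non-libc frame is the `PowerManager::GetAndLogPwrRsrcStates(char*)` frame, which matches the function named in the issue title. The `pc` and `BuildId` fields are the values one would typically feed to a symbolizer if an unstripped copy of the library were available.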

Symptoms
The device overheats while it keeps trying to start the Bluetooth HAL (and other crashing services).
Bluetooth is likely non-functional.

How to reproduce
Build the kernel via the build-clang-shared.sh script (prebuilt kernel for this device is not yet released)
Build and flash the OS image.
Boot the device.
Wait for a few minutes and notice the temperature increase.

Additional context
first_logcat_5Apr.zip

faenil added the bug label Apr 5, 2023
Contributor

MarijnS95 commented Apr 5, 2023

Shouldn't the latest sync of repo_update undo this? sonyxperiadev/repo_update@42b57c3.

Same for the other two bullet points: they have all been merged, so they should implicitly be there after a sync.

Author

faenil commented Apr 5, 2023

Thanks, I played it safe under the assumption that the _r30 manifest would be using a fixed list of commits. It turns out it just references the branch (with the pros and cons of that solution), so the latest commits will already be part of a fresh sync, as you mentioned.

Updated.

Contributor

MarijnS95 commented Apr 7, 2023

Cool! Yeah, in my case I'm shifting my personal branches around (since the aforementioned changes and more have been merged), so they don't give a consistent view of the state. To err on the side of caution, reference commits/ranges by hash.

EDIT: On the note of pros and cons, we discussed branching strategy and tagging many times, but never got anything out of it.
