I've been trying to optimize pyOCD flash write speed with my debug probe. After enabling trace-level debug logging (thanks Chris!), I noticed that less than half of the time is spent doing block writes. Most of it seems to be spent generating the AP read and write commands in between block transfers. Here's an excerpt from the logs:
0002482:DEBUG:ap:_write_block32:001014 }
0002482:DEBUG:ap:read_mem:001018 (ap=0x0; addr=0xe000edf0, size=32) {
0002482:DEBUG:ap:write_ap:001019 cached (ap=0x0; addr=0x00000000) = 0x23000012
0002482:DEBUG:dap:write_ap:001020 (addr=0x00000004) = 0xe000edf0
0002482:DEBUG:dap_access_cmsis_dap:get_request_space(1, 05:w)[wc=7, rc=0, ba=1->0] -> (sz=1, free=5, delta=-247)
0002482:DEBUG:dap_access_cmsis_dap:add(1, 05:w) -> [wc=8, rc=0, ba=0]
0002482:DEBUG:dap_access_cmsis_dap:get_request_space(1, 0f:r)[wc=8, rc=0, ba=0->0] -> (sz=1, free=15, delta=-246)
0002482:DEBUG:dap_access_cmsis_dap:add(1, 0f:r) -> [wc=8, rc=1, ba=0]
0002483:DEBUG:dap:read_ap:001021 (addr=0x0000000c) -> ...
0002483:DEBUG:dap_access_cmsis_dap:New _Command
0002485:DEBUG:dap:read_ap:001021 ...(addr=0x0000000c) -> 0x01030003
0002485:DEBUG:ap:read_mem:001018 (ap=0x0; addr=0xe000edf0, size=32) -> 0x01030003 }
0002485:DEBUG:ap:read_mem:001022 (ap=0x0; addr=0xe000edf0, size=32) {
0002485:DEBUG:ap:write_ap:001023 cached (ap=0x0; addr=0x00000000) = 0x23000012
0002485:DEBUG:dap:write_ap:001024 (addr=0x00000004) = 0xe000edf0
0002485:DEBUG:dap_access_cmsis_dap:get_request_space(1, 05:w)[wc=0, rc=0, ba=1->1] -> (sz=1, free=14)
0002485:DEBUG:dap_access_cmsis_dap:add(1, 05:w) -> [wc=1, rc=0, ba=1]
0002485:DEBUG:dap_access_cmsis_dap:get_request_space(1, 0f:r)[wc=1, rc=0, ba=1->0] -> (sz=1, free=15, delta=-253)
0002485:DEBUG:dap_access_cmsis_dap:add(1, 0f:r) -> [wc=1, rc=1, ba=0]
0002485:DEBUG:dap:read_ap:001025 (addr=0x0000000c) -> ...
0002486:DEBUG:dap_access_cmsis_dap:New _Command
0002488:DEBUG:dap:read_ap:001025 ...(addr=0x0000000c) -> 0x00030003
0002488:DEBUG:ap:read_mem:001022 (ap=0x0; addr=0xe000edf0, size=32) -> 0x00030003 }
0002488:DEBUG:ap:write_mem:001026 (ap=0x0; addr=0xe000edf4, size=32) = 0x00000000 {
0002488:DEBUG:ap:write_ap:001027 cached (ap=0x0; addr=0x00000000) = 0x23000012
0002488:DEBUG:dap:write_ap:001028 (addr=0x00000004) = 0xe000edf4
0002488:DEBUG:dap_access_cmsis_dap:get_request_space(1, 05:w)[wc=0, rc=0, ba=1->1] -> (sz=1, free=14)
It takes until the 2504 ms timestamp to get a full packet to write. In total, the gap from the end of one write_block32 to the start of the next is 45 ms. pyOCD seems to be doing some sort of caching of the parsed flash algorithm, since the time between the 1st and 2nd block writes is about twice as long (~100 ms). Is there an easy way to speed this up? The host CPU is a 64-bit 3.2 GHz Intel Core.
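In case it helps anyone reproduce the measurement, here is a rough sketch of how the gaps between block writes could be pulled out of a trace log like the one above. It is not part of pyOCD; the function names are my own, and it assumes that the line opening a _write_block32 call ends with "{" the same way the read_mem and write_mem lines do (only the closing "}" line appears in my excerpt). It also counts the "New _Command" lines in each gap as a proxy for how many packets were started.

```python
import re

# Line layout seen in the excerpt: <timestamp ms>:<level>:<logger>:<message>
LINE_RE = re.compile(r"^(\d+):(\w+):(\w+):(.*)$")


def parse(lines):
    """Yield (timestamp_ms, message) for every line that matches the layout."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            yield int(m.group(1)), m.group(4).rstrip()


def block_write_gaps(lines):
    """Return a list of (gap_ms, packets) between consecutive block writes.

    gap_ms is measured from the closing "}" of one _write_block32 to the
    opening "{" of the next one (the "{" form is an assumption, see above).
    packets is the number of "New _Command" lines logged inside that gap.
    """
    gaps = []
    last_end = None
    packets = 0
    for ts, msg in parse(lines):
        if msg.startswith("New _Command"):
            packets += 1
        elif msg.startswith("_write_block32:"):
            if msg.endswith("}"):
                # End of a block write: start timing the gap.
                last_end, packets = ts, 0
            elif msg.endswith("{") and last_end is not None:
                # Start of the next block write: close the gap.
                gaps.append((ts - last_end, packets))
                last_end = None
    return gaps
```

Calling `block_write_gaps(open("trace.log"))` (the file name is just a placeholder) returns one tuple per gap, so a steady gap with many packets would point at round trips to the probe rather than host-side CPU time.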