RFC: Precalculated swap log for supporting devices with bigger write block sizes #2122
Comments
To clarify.
Actually, that is point 3 in the unresolved questions. I guess we can do it in the following ways, probably configurable:
I don't see how it could be done without having the hash table on the flash. If a device reboots while a page is being swapped, how are you going to know which part goes with which partial image? One sector will be corrupted. If you don't have a checksum/hash of the sectors for each image, you can't know where that is, nor can you regenerate an "in RAM" hash list, because then you have a huge number of operations to try to put a bunch of hashes together to come up with the "sha of the sha table".
No, we do intend to have it on flash; the question is where we get it from when we start the swap (and then where we place it).
Hello all, I believe it is a very good idea to implement a more efficient algorithm for image swapping. For optimal operation the whole swapping process may need to be altered, as the current swapping methods have significant limitations and/or drawbacks. Let me share our analysis and a description of the design we have been thinking about.
Generic thoughts:
Swapping using the scratch area is not ideal due to FLASH wear of the scratch area and the unnecessary FLASH operations performed.
Swapping with the move-by-one approach scatters the wear across a larger area of FLASH, but there are still ~twice as many FLASH operations as required.
It would be fairly easy to handle an offset of the image in the secondary/staging slot, because there is no strong (technical) requirement for the placement of the image (the code never gets executed directly from there).
Having metadata (trailer) at the end of the slots is not really a practical solution, as it cannot be used to store the image itself anyway.
Overhead calculation/estimations:
It is quite clear that for fail-safe image swapping it is necessary to have at least a single spare erase sector.
Example: let's assume each slot consists of 1024 erase sectors. We divide each slot into 128 chunks to be swapped (8 erase sectors each).
But do we really need a hash table for both images for successful recovery? Actually not. Swapping chunks are not scattered randomly; we know the swapping algorithm. We just need to find the crossing point where the swapping was interrupted.
Going further, the image typically would not take up the whole slot. The application designer likely leaves a margin of at least 10% or more.
Where to store the hashes:
Rather not pre-calculated in the image itself, for several reasons:
Hashes in the FLASH:
In RAM:
Summary:
Where we are
At this point MCUboot, while moving sectors around, logs the status of operations with three flags per sector. On devices with a write block size of 4 bytes this amounts to 12 bytes per moved sector. This becomes problematic when you want to make MCUboot work with devices that have bigger write block sizes: for devices that require 8 bytes it is already 24 bytes, and it becomes 48 bytes for devices with a 16-byte write block size. When you get to devices with write blocks of, say, 512 bytes, this becomes hard to handle.
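The arithmetic above can be written down as a small helper. This is only an illustrative sketch: the names `FLAGS_PER_SECTOR`, `swap_log_bytes_per_sector` and `swap_log_bytes` are made up for this example and are not MCUboot APIs, and it assumes each of the three flags occupies exactly one write block.

```c
#include <assert.h>
#include <stddef.h>

/* Three status flags are logged per moved sector (from the text above). */
#define FLAGS_PER_SECTOR 3u

/* Hypothetical helper: bytes of swap log needed for one moved sector,
 * assuming every flag takes one full write block. */
size_t swap_log_bytes_per_sector(size_t write_block_size)
{
    assert(write_block_size > 0);
    return FLAGS_PER_SECTOR * write_block_size;
}

/* Hypothetical helper: bytes of swap log for `sectors` logical sectors. */
size_t swap_log_bytes(size_t write_block_size, size_t sectors)
{
    return swap_log_bytes_per_sector(write_block_size) * sectors;
}
```

Under that assumption a 512-byte write block costs 3 * 512 = 1536 bytes of log per moved sector, which is where the approach stops scaling.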
Note: by "sector" here I mean a logical sector of N * erase-block-size of the device with the biggest erase-block-size.
What can we do?
The proposal here is to replace the swap log with an information block, a layout record, that states the desired state of a slot; the swap algorithm would then just try to restore that desired state from what it has on flash.
The layout record could consist of an array of sha values of sectors, in the order they should appear in the slot. Note that sha256 is 32 bytes long, so it is already shorter than a log entry for a device with a 16-byte write block size.
For example
Where sha is whatever sha we select (could be configurable, sha256 seems fine), and sectorX is a logical-sector-sized block on the device, within the slot.
Each slot will have its own expected layout record, so there would be two such blocks. The layout record sha is just to make sure that the record is complete.
Note that, if we put aside the header size, the log size is now just N * sizeof(sha) and is aligned to write-block-size as a whole, not per log entry (actually it will be aligned to the erase block size, or logical sector).
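To make the size claim concrete, here is a rough sketch of how the record size could be computed. It assumes sha256 (32 bytes), treats the record's own sha as the only header, and pads the record once as a whole; `SHA_SIZE`, `align_up` and `layout_record_bytes` are names invented for this example, not existing MCUboot code.

```c
#include <assert.h>
#include <stddef.h>

#define SHA_SIZE 32u /* sha256 digest length in bytes */

/* Round `n` up to the next multiple of `align`. */
size_t align_up(size_t n, size_t align)
{
    assert(align > 0);
    return (n + align - 1) / align * align;
}

/* Hypothetical layout record size for `sectors` logical sectors:
 * one sha over the record itself (assumed to be the whole header)
 * plus one sha per sector, padded once to `align` bytes. */
size_t layout_record_bytes(size_t sectors, size_t align)
{
    return align_up((sectors + 1) * SHA_SIZE, align);
}
```

The point of the design is that the alignment padding is paid once per record, while the current log pays the write block size three times per sector.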
Unresolved questions:
Probably storing logs at the end of the slot is not a good idea, and we should have reserved space outside of the slots, with the entire slot dedicated to the application image. The problem here is with devices that have big erase sizes: the advantage of a small log record would be killed by using an entire large, independent erase block just to store it, so for such devices it is better to have that record inside the slot. This could probably be configurable depending on device layout.
Generally, if we just assume that MCUboot always attempts to restore the layout according to the pre-defined log, then the mere presence of the log would mean that the swap should happen, and confirmation could be just removal of the log. This is again a problem for devices with big erase blocks, so they would have to mark somehow that they want to stay with the tested layout anyway.
Probably, specifically when signed. Also, calculating the record on the device is risky, but should be evaluated.
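As a rough illustration of the "restore the desired state from what is on flash" idea from the proposal, the sketch below compares the digests listed in a layout record with digests of the sectors currently in the slot and reports where restoration would have to start. Everything here is an assumption made for the example: the function name `first_mismatched_sector` is hypothetical, digests are stored back to back in flat byte arrays, and the digests of the on-flash sectors are taken to be computed elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SHA_SIZE 32u /* sha256 digest length in bytes */

/* `expected` and `actual` each hold `sectors` digests of SHA_SIZE bytes,
 * stored back to back. Returns the index of the first sector whose digest
 * differs from the layout record, or `sectors` if the slot already matches
 * the desired layout. */
size_t first_mismatched_sector(const unsigned char *expected,
                               const unsigned char *actual,
                               size_t sectors)
{
    assert(expected != NULL && actual != NULL);
    for (size_t i = 0; i < sectors; i++) {
        if (memcmp(expected + i * SHA_SIZE,
                   actual + i * SHA_SIZE, SHA_SIZE) != 0) {
            return i;
        }
    }
    return sectors;
}
```

A return value equal to `sectors` would mean nothing needs to be moved; any smaller value marks the first sector that is not yet in its desired place.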
@nvlsianpu @d3zd3z @nordicjm @butok @erwango