OpenSBI Bug #1264

rosethompson · 2025-02-06T19:54:37Z

I'm opening a bug to track progress on an issue my FPGA/ImperasDV lockstep emulator discovered.

The bug becomes visible when checking the crc32 of the device tree in the Linux kernel. The length of the crc32 is too long, 0xda5 bytes rather than the correct 0x9a5. Interesting, this passes simulation because our RAMs model un-initialized memory to read 0 and so does ImperasDV. Thus ImperasDV sees agreement between the two and the crc32 computes 0 when the input is 0 so the computation passes. However the FPGA is real hardware and reads a random value past the length of the device tree so ImperasDV mismatches.

I've repeated this on both the old and new (Kernel 6.12.8) versions of buildroot.

OpenSBI copies the length 0x9a5 twice to the same location in memory (PA 0x822004). The first time is the correct value 0x9a5, but the second time is the wrong value 0xda5.

rosethompson · 2025-02-06T21:07:50Z

I found what I think the root cause is. OpenSBI makes two calls to fdt_open_into with the length of the device tree + 32 and then again with the length of the device tree + 1024. This function modifies and moves the device tree to another location but appears to incorrectly either move the wrong number of bytes or set the size of the device tree incorrectly.

The file in question is buildroot/output/build/opensbi-1.6/lib/utils/libfdt/fdt_rw.c. The third argument to the function fdt_open_into is int bufsize. The memory move however is called with a different length newsize which is smaller than bufsize.

memmove(buf, tmp, newsize);

Later in the code the length is set with

fdt_set_totalsize(buf, bufsize);

One of two changes should be made to either set the length to match newsize or have memmove copy the how bufsize. I am testing the first option.

jordancarlin · 2025-02-06T22:32:11Z

That’s interesting and sounds like a bug in the source code for openSBI itself. If so, I wonder why no one else has run into this.

rosethompson · 2025-02-06T22:50:17Z

I wouldn't be surprised if Linux just ignores the crc32 error.

rosethompson · 2025-02-08T17:42:14Z

This definitely fixed the issue and lets the Linux boot continue. I found another issue immediately after. The MCAUSE, MEPC, and several other CSRs are continuously enabled while StallW is asserted during a Trap this causes a problem for the hardware tracer as the state will update immediately and be associated with the previous instruction. I assume this will also happen with wallyTracer.sv. This isn't really a hardware bug in Wally, but would have a small power cost. However given the tracer implication I think we want these changes. Once I've confirmed this will impact wallyTracer I will make a separate pull request.

Linux got to 2.2B instructions. The current problem is the interrupt timer is too fast and boot isn't making any progress. I'll have to adjust the device tree.

davidharrishmc · 2025-02-08T17:44:54Z

Whoo!

jordancarlin · 2025-02-08T17:48:33Z

Great! Should this be opened as an issue against OpenSBI? And until the fix is upstreamed, do we want/need to carry that patch for our build of OpenSBI? Buildroot has strong support for custom patches if so.

rosethompson · 2025-02-08T18:41:35Z

Yes we should do open an issue with OpenSBI. I'm going to be busy for the next few hours, but when I get back I'll separate out the code for OpenSBI and make a separate patch. It's possible there is a better solution than mine as I might not fully understand the intention of their code.

rosethompson · 2025-02-12T16:17:52Z

My solution is much uglier than I like and is probably not the correct solution. ftd_open_into(const void *fdt, void *buf, int bufsize) does not initialize buf and has multiple path ways through the code to set the new size of the device tree. I am just initializing buf with all 0's at the beginning and this corrects the ImperasDV mismatch. However, I doubt this is the solution the OpenSBI maintainers will want.

However, lockstep is now working for billions of instructions! Unfortunately the FPGA kernel panics but lockstep passes.

rosethompson mentioned this issue Feb 12, 2025

Device tree uninitizlied after resized riscv-software-src/opensbi#388

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

OpenSBI Bug #1264

OpenSBI Bug #1264

rosethompson commented Feb 6, 2025

rosethompson commented Feb 6, 2025

jordancarlin commented Feb 6, 2025

rosethompson commented Feb 6, 2025

rosethompson commented Feb 8, 2025

davidharrishmc commented Feb 8, 2025

jordancarlin commented Feb 8, 2025

rosethompson commented Feb 8, 2025

rosethompson commented Feb 12, 2025

OpenSBI Bug #1264

OpenSBI Bug #1264

Comments

rosethompson commented Feb 6, 2025

rosethompson commented Feb 6, 2025

jordancarlin commented Feb 6, 2025

rosethompson commented Feb 6, 2025

rosethompson commented Feb 8, 2025

davidharrishmc commented Feb 8, 2025

jordancarlin commented Feb 8, 2025

rosethompson commented Feb 8, 2025

rosethompson commented Feb 12, 2025