Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG - Not working in release mode #180

Closed
wants to merge 2 commits into from

Conversation

usbalbin
Copy link
Member

@usbalbin usbalbin commented Jan 29, 2025

This short example does not work in release mode:

#[entry]
fn main() -> ! {
    let dp = stm32::Peripherals::take().expect("cannot take peripherals");
    let mut rcc = dp.RCC.constrain();

    let gpioa = dp.GPIOA.split(&mut rcc);
    let _pin: stm32g4xx_hal::gpio::gpioa::PA8<stm32g4xx_hal::gpio::Alternate<AF6>> =
        gpioa.pa8.into_alternate();
    let mut pin2 = gpioa.pa9.into_pull_up_input();
    defmt::println!("hej");
    pin2.is_high().unwrap(); // <--- Booom

    loop {}
}
DEFMT_LOG=trace cargo r --example broken --features stm32g474,defmt,log-rtt -- --chip stm32g474RETx
warning: unused import: `utils::logger::println`
 --> examples/broken.rs:6:5
  |
6 | use utils::logger::println;
  |     ^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: `#[warn(unused_imports)]` on by default

warning: `stm32g4xx-hal` (example "broken") generated 1 warning (run `cargo fix --example "broken"` to apply 1 suggestion)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.37s
     Running `probe-rs run --connect-under-reset --chip=stm32g474RETx target/thumbv7em-none-eabihf/debug/examples/broken`
      Erasing ✔ 100% [####################]  20.00 KiB @  58.10 KiB/s (took 0s)
  Programming ✔ 100% [####################]  19.00 KiB @  42.07 KiB/s (took 0s)                                                                                      Finished in 0.80s
hej
DEFMT_LOG=trace cargo r --example broken --features stm32g474,defmt,log-rtt --release  -- --chip stm32g474RETx
warning: unused import: `utils::logger::println`
 --> examples/broken.rs:6:5
  |
6 | use utils::logger::println;
  |     ^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: `#[warn(unused_imports)]` on by default

warning: `stm32g4xx-hal` (example "broken") generated 1 warning (run `cargo fix --example "broken"` to apply 1 suggestion)
    Finished `release` profile [optimized] target(s) in 0.21s
     Running `probe-rs run --connect-under-reset --chip=stm32g474RETx target/thumbv7em-none-eabihf/release/examples/broken`
 WARN probe_rs::util::rtt: Insufficient DWARF info; compile your program with `debug = 2` to enable location info.
      Erasing ✔ 100% [####################]   8.00 KiB @  51.94 KiB/s (took 0s)
  Programming ✔ 100% [####################]   7.00 KiB @  37.89 KiB/s (took 0s)                                                                                      Finished in 0.34s
 WARN probe_rs::probe::stlink: send_jtag_command 242 failed: SwdDpError
 WARN probe_rs::probe::stlink: send_jtag_command 242 failed: SwdDpError
 WARN probe_rs::session: Could not clear all hardware breakpoints: An ARM specific error occurred.

Caused by:
    0: The debug probe encountered an error.
    1: An error which is specific to the debug probe in use occurred.
    2: Command failed with status SwdDpError.
 WARN probe_rs::probe::stlink: send_jtag_command 242 failed: SwdDpError
 WARN probe_rs::session: Failed to deconfigure device during shutdown: Arm(Probe(ProbeSpecific(BoxedProbeError(CommandFailed(SwdDpError)))))
Error: An ARM specific error occurred.

Caused by:
    0: The debug probe encountered an error.
    1: An error which is specific to the debug probe in use occurred.
    2: Command failed with status SwdDpError.

Tested on a Nucleo-G474RE

$ rustc --version
rustc 1.83.0 (90b35a623 2024-11-26)

@usbalbin
Copy link
Member Author

Removing any one of the three statements

  • let _pin: stm32g4xx_hal::gpio::gpioa::PA8<stm32g4xx_hal::gpio::Alternate<AF6>> = gpioa.pa8.into_alternate();
  • defmt::println!("hej");
  • pin2.is_high().unwrap(); // <--- Booom

and there is no crash

@usbalbin
Copy link
Member Author

usbalbin commented Jan 29, 2025

Same thing when based on main

Also, same thing when based on main and using rustc 1.84

@usbalbin
Copy link
Member Author

@burrbull @AdinAck do you have any ideas or recommendations?

@burrbull
Copy link
Member

Some optimization bug maybe? I don't know. Try opt-level = "s".

Or some issue with defmt.

What happen if pin2.is_high().unwrap() replace with pin2.is_high().ok() ?

@AdinAck
Copy link
Contributor

AdinAck commented Jan 30, 2025

I just ran it resulting in the following output:

    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.17s
     Running `probe-rs run --connect-under-reset target/thumbv7em-none-eabihf/debug/examples/broken --chip stm32g474retx`
      Erasing ✔ 100% [####################]  20.00 KiB @  56.14 KiB/s (took 0s)
  Programming ✔ 100% [####################]  19.00 KiB @  23.67 KiB/s (took 1s)                                                                                                                                                                                                                                                 Finished in 0.81s
hej

I'm on Rust 1.84.0.

(I have the exact same nucleo)

Why is my binary so much bigger than yours?

Edit, same behavior as you in release mode :(

@usbalbin
Copy link
Member Author

usbalbin commented Jan 30, 2025

@burrbull

Thanks.

Same problem with ok

- pin2.is_high().unwrap(); // <--- Booom
+ pin2.is_high().ok(); // <--- Booom

also same thing with opt-level = "s" under [profile.release] when running in release mode

@usbalbin
Copy link
Member Author

I just ran it resulting in the following output:

    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.17s
     Running `probe-rs run --connect-under-reset target/thumbv7em-none-eabihf/debug/examples/broken --chip stm32g474retx`
      Erasing ✔ 100% [####################]  20.00 KiB @  56.14 KiB/s (took 0s)
  Programming ✔ 100% [####################]  19.00 KiB @  23.67 KiB/s (took 1s)                                                                                                                                                                                                                                                 Finished in 0.81s
hej

I'm on Rust 1.84.0.

(I have the exact same nucleo)

Why is my binary so much bigger than yours?

Have you tried in release mode? For me this works fine in non-release mode

@AdinAck
Copy link
Contributor

AdinAck commented Jan 30, 2025

Yeah my bad, I'm going to try to compare the ASM but that might be a futile endeavor.

@usbalbin
Copy link
Member Author

Yeah my bad, I'm going to try to compare the ASM but that might be a futile endeavor.

No worries. That would be awesome! I'll try to replace some of the hal calls with raw pointer and see if I can see any changes. This feels a bit scary

@usbalbin
Copy link
Member Author

Just found that this

#[entry]
fn main() -> ! {
    let dp = stm32::Peripherals::take().expect("cannot take peripherals");
    let mut rcc = dp.RCC.constrain();

    let _gpioa = dp.GPIOA.split(&mut rcc);

    unsafe {
        let afrh_ptr = 0x4800_0024 as *mut u32;
        let value = afrh_ptr.read_volatile();
        defmt::println!("{}", value);
        afrh_ptr.write_volatile(value);
    }

    defmt::println!("hej");
    unsafe {
        (0x4800_0010 as *const u32).read_volatile();
    }
    loop {}
}

works

but not this

          let value = afrh_ptr.read_volatile();
-         defmt::println!("{}", value);
          afrh_ptr.write_volatile(value);

@usbalbin
Copy link
Member Author

And then again this works

-         let value = afrh_ptr.read_volatile();
-         afrh_ptr.write_volatile(value);
+         let _value = afrh_ptr.read_volatile(); // <- dont use read value(which sould be 0)
+         afrh_ptr.write_volatile(0); // Write 0 which according to RM0440 is the reset value(so no change in theory)

@burrbull
Copy link
Member

burrbull commented Jan 30, 2025

Add peripheral reset.

rcc.rb.ahb2enr.modify(|_, w| w.$iopxenr().set_bit());

https://github.com/stm32-rs/stm32f4xx-hal/blob/0321245c9c34932d9ab0645b7da3540febb7bd87/src/gpio.rs#L585-L590

@AdinAck
Copy link
Contributor

AdinAck commented Jan 30, 2025

works

Hm, this does not work for me. Same issue.

@usbalbin
Copy link
Member Author

usbalbin commented Jan 30, 2025

Further trimmed down, still the problem

#[entry]
fn main() -> ! {
    unsafe {
        (*stm32::RCC::ptr())
            .ahb2enr()
            .modify(|_, w| w.gpioaen().set_bit());

        let afrh_ptr = 0x4800_0024 as *mut u32;
        let value = afrh_ptr.read_volatile();
        afrh_ptr.write_volatile(value);

        defmt::println!("hej");

        (0x4800_0010 as *const u32).read_volatile(); // IDR input data register
    }
    loop {}
}

This works for me

#[entry]
fn main() -> ! {
    unsafe {
        (*stm32::RCC::ptr())
            .ahb2enr()
            .modify(|_, w| w.gpioaen().set_bit());

        let afrh_ptr = 0x4800_0024 as *mut u32;
        let value = afrh_ptr.read_volatile();
        afrh_ptr.write_volatile(0); // <------------------------------------- This is changed to 0

        defmt::println!("hej");

        (0x4800_0010 as *const u32).read_volatile();
    }
    loop {}
}

@usbalbin
Copy link
Member Author

using cargo asm:

$ DEFMT_LOG=trace cargo asm --example broken --features stm32g474,defmt,log-rtt --release

[...]
12 "_defmt_write" [174]
13 "broken::__cortex_m_rt_main" [86]
14 "core::fmt::Formatter::pad_integral::write_prefix" [61]
[...]

then running DEFMT_LOG=trace cargo asm --example broken --features stm32g474,defmt,log-rtt --release 13

with value and 0

$ diff value.asm zero.asm
24a25
> 	movs r0, #0
26,27c27,28
< 	movw r0, :lower16:"{\"package\":\"stm32g4xx-hal\",\"tag\":\"defmt_println\",\"data\":\"hej\",\"disambiguator\":\"9093241664256461226\",\"crate_name\":\"broken\"}"
< 	movt r0, :upper16:"{\"package\":\"stm32g4xx-hal\",\"tag\":\"defmt_println\",\"data\":\"hej\",\"disambiguator\":\"9093241664256461226\",\"crate_name\":\"broken\"}"
---
> 	movw r0, :lower16:"{\"package\":\"stm32g4xx-hal\",\"tag\":\"defmt_println\",\"data\":\"hej\",\"disambiguator\":\"7588874015957968919\",\"crate_name\":\"broken\"}"
> 	movt r0, :upper16:"{\"package\":\"stm32g4xx-hal\",\"tag\":\"defmt_println\",\"data\":\"hej\",\"disambiguator\":\"7588874015957968919\",\"crate_name\":\"broken\"}"

@burrbull
Copy link
Member

#[inline(always)]
fn reset(rcc: &RccRB) {
unsafe {
bb::set(Self::Bus::rstr(rcc), $bit);
bb::clear(Self::Bus::rstr(rcc), $bit);
}
}
}

Set bit and then reset.

@AdinAck
Copy link
Contributor

AdinAck commented Jan 30, 2025

I can confirm the fault occurs upon writing to afrh:

Process 1 stopped
* thread #1, stop reason = step over
    frame #0: 0x080002b8 broken`broken::__cortex_m_rt_main::hd2b052cc162dd424 at broken.rs:27:18
   24  	        let afrh_ptr = 0x4800_0024 as *mut u32;
   25  	        let value = afrh_ptr.read_volatile();
   26  	        defmt::println!("{}", value);
-> 27  	        afrh_ptr.write_volatile(value);
   28  	    }
   29  	
   30  	    defmt::println!("hej");
Target 0: (broken) stopped.
(lldb) n
Process 1 exited with status = -1 (0xffffffff) lost connection

@AdinAck
Copy link
Contributor

AdinAck commented Jan 30, 2025

Wait... SWDIO and SWDCLK are AF0. Are you somehow accidentally changing that when you write a non-zero value to afrh?

@AdinAck
Copy link
Contributor

AdinAck commented Jan 30, 2025

Ah, yes, you are.

In release mode, you need to wait to read the GPIO register to allow the AHB bus to propagate the RCC change enabling GPIOA. Notice the value read is actually the MODER register because that's what the peripheral does when it is not enabled. (10101011111111111111111111111111).

Then by the time you write it to AFRH, you override the SWDIO and SWDCLK alternate functions, killing RTT.

@AdinAck
Copy link
Contributor

AdinAck commented Jan 30, 2025

You can avoid this with a cortex_m::asm::delay(1) because the AHB bus takes one clock cycle to propagate changes.

@AdinAck
Copy link
Contributor

AdinAck commented Jan 30, 2025

Proof:

#[entry]
fn main() -> ! {
    let dp = stm32::Peripherals::take().expect("cannot take peripherals");
    let mut rcc = dp.RCC.constrain();

    let gpioa = dp.GPIOA.split(&mut rcc);

    cortex_m::asm::delay(1);

    let _pin: stm32g4xx_hal::gpio::gpioa::PA8<stm32g4xx_hal::gpio::Alternate<AF6>> =
        gpioa.pa8.into_alternate();
    let mut pin2 = gpioa.pa9.into_pull_up_input();
    defmt::println!("hej");
    pin2.is_high().unwrap(); // <--- Booom

    loop {}
}

@usbalbin
Copy link
Member Author

usbalbin commented Jan 30, 2025

#[inline(always)]
fn reset(rcc: &RccRB) {
unsafe {
bb::set(Self::Bus::rstr(rcc), $bit);
bb::clear(Self::Bus::rstr(rcc), $bit);
}
}
}

Set bit and then reset.

Like this

#[entry]
fn main() -> ! {
    unsafe {
        (*stm32::RCC::ptr())
            .ahb2enr()
            .modify(|_, w| w.gpioaen().set_bit());// Enable
        (*stm32::RCC::ptr()).ahb2rstr().modify(|_, w| w.gpioarst().set_bit());
        (*stm32::RCC::ptr()).ahb2rstr().modify(|_, w| w.gpioarst().clear_bit());

        let afrh_ptr = 0x4800_0024 as *mut u32;
        let value = afrh_ptr.read_volatile();
        afrh_ptr.write_volatile(value);

        defmt::println!("hej");

        (0x4800_0010 as *const u32).read_volatile();
    }
    loop {}
}

this appears to work!

@AdinAck
Copy link
Contributor

AdinAck commented Jan 30, 2025

Do those register writes lock the core until the AHB bus flushes or do they happen to take at least one clock cycle?

@usbalbin
Copy link
Member Author

@burrbull is this how it should be ed2504b ? Seems to be that way for the other peripherals I have checked so far

$TIMX::reset(rcc_ptr);
$USARTX::reset(rcc_ptr);
$SPIX::reset(rcc_ptr);
DAC::reset(rcc_ptr);

@usbalbin
Copy link
Member Author

With ed2504b the original example seems to work again!

@usbalbin
Copy link
Member Author

From the RM

7.2.17

Peripheral clock enable register
(RCC_AHBxENR, RCC_APBxENRy)
Each peripheral clock can be enabled by the xxxxEN bit of the RCC_AHBxENR,
RCC_APBxENRy registers.
When the peripheral clock is not active, the peripheral registers read or write accesses are
not supported.
The enable bit has a synchronization mechanism to create a glitch free clock for the
peripheral. After the enable bit is set, there is a 2 clock cycles delay before the clock be
active.
Caution: Just after enabling the clock for a peripheral, software must wait for a delay before
accessing the peripheral registers.

@AdinAck
Copy link
Contributor

AdinAck commented Jan 30, 2025

Huh, I always thought it was one cycle, I guess I've been getting lucky with 1 cycle delay.

@usbalbin
Copy link
Member Author

I can not find any information on the need for the rcc reset after the enable. Is the reset needed for the reset itself or for the delay it happens to provide?

@AdinAck
Copy link
Contributor

AdinAck commented Jan 30, 2025

I'm not sure if you saw what I did here:

Proof:

#[entry]
fn main() -> ! {
    let dp = stm32::Peripherals::take().expect("cannot take peripherals");
    let mut rcc = dp.RCC.constrain();

    let gpioa = dp.GPIOA.split(&mut rcc);

    cortex_m::asm::delay(1);

    let _pin: stm32g4xx_hal::gpio::gpioa::PA8<stm32g4xx_hal::gpio::Alternate<AF6>> =
        gpioa.pa8.into_alternate();
    let mut pin2 = gpioa.pa9.into_pull_up_input();
    defmt::println!("hej");
    pin2.is_high().unwrap(); // <--- Booom

    loop {}
}

This is what I do in proto-hal and it works fine, I suspect the reset is simply a way to do the delay. But maybe I don't know something.

@usbalbin
Copy link
Member Author

@AdinAck Sorry did not see #180 (comment) interesting.

Thanks a lot for the help both of you @AdinAck and @burrbull !

@usbalbin usbalbin closed this Jan 31, 2025
@burrbull
Copy link
Member

burrbull commented Feb 1, 2025

Set bit and then reset.

I've opened G4 RM and now I'm not sure. Looks like it is different from how F4 works.

@AdinAck
Copy link
Contributor

AdinAck commented Feb 1, 2025

Just some more evidence. I promised an ASM analysis, so here you go:

It was pure coincidence that commenting out some of the defmt lines caused a subtle rearrangement of operations such that the gpioa::afrh register read occurred earlier.

Erroneous ASM (from original code in this PR):

movw r0, #4172    ; r0 is becoming the address of rcc::ahb2enr
movs r4, #4       ; r4 is becoming the address of gpioa::afrh
movt r0, #16386   ; (16386 << 16) | 4172 = 0x4002_104c (rcc base address is 0x4002_1000 + ahb2enr offset of 0x4c, gpioaen is bit 0)
movt r4, #18432   ; (18432 << 16) |    4 = 0x4800_0004 (gpioa base address is 0x4800_0000 i have no idea why it added 4)
ldr r1, [r0]      ; read rcc::ahb2enr
movs r2, #2
orr r1, r1, #1    ; set gpioaen bit
str r1, [r0]      ; write rcc::ahb2enr
ldr r0, [r4, #32] ; read gpioa::afrh (one instruction later!) (also this offset of 32 ends up at the correct address for gpioa::afrh)

Functional ASM (original code with println commented out):

movw r0, #4172
movs r2, #6
movt r0, #16386
movs r3, #2
ldr r1, [r0]
orr r1, r1, #1
str r1, [r0]      ; write rcc::ahb2enr
movs r0, #4
movt r0, #18432
ldr r1, [r0, #32] ; read gpioa::afrh

Now there are two instructions in between the peripheral enable and peripheral read, sufficient time. The compiler just deferred the construction of the gpioa::afrh address to just before the read, rather than before the write to rcc::ahb2enr.

@AdinAck
Copy link
Contributor

AdinAck commented Feb 1, 2025

We know cortex_m::asm::delay(2) works, but it may be a bit slower than necessary, it also does a whole function call. I experimented with asm!("nop") and it seems to result in the compiler moving related instructions to be closer together.

This is almost certainly not a deterministic compiler feature, and just so happens to be how it behaves currently. But is still interesting nonetheless. It would be cool if the compiler was able to reason about the number of cycles between operations, but I imagine pipelining and caching and such would make this infeasible.

usbalbin added a commit that referenced this pull request Feb 13, 2025
Rcc enable *and* reset gpio in split method.

Calling `reset` as well ensures that enough time has passed between `enable` and the first register access to the peripheral. Failing to do this might lead to very hard to debug issues, see #180.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants