Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for FDCAN peripheral as found on newer STM32 micros. #2475

Merged
merged 6 commits into from
Jan 30, 2024

Conversation

cschuhen
Copy link

Discussion tracked in:
#2447

This has been a group effort.

@cschuhen cschuhen force-pushed the feature/fdcan_r2 branch 3 times, most recently from 1c1831e to 8d50ae2 Compare January 24, 2024 21:19
@blaa
Copy link

blaa commented Jan 24, 2024

Commit reorganization I'd consider:

  • merging stm32h5 and h7 examples to a one commit.
  • Merge changes from formatting to documentation or original commits.
  • Merge the revert example chip back to Example working on real 250K bus with STM32G431CB. so there's no back-and-forth changes.

@blaa
Copy link

blaa commented Jan 24, 2024

Question I was asking on #Embassy was whether that should be behind a unstable-fdcan feature or if the embassy is unstable enough to not bother with it. I believe the API might change, but it's not particularly extensive.

@cschuhen
Copy link
Author

Commit reorganization I'd consider:

  • merging stm32h5 and h7 examples to a one commit.
  • Merge changes from formatting to documentation or original commits.
  • Merge the revert example chip back to Example working on real 250K bus with STM32G431CB. so there's no back-and-forth changes.

Yes, I was thinking the same. I just ran out of time before work and needed to read up on how to split commits in rebase. I usually use mercurial rather than git.

@cschuhen cschuhen force-pushed the feature/fdcan_r2 branch 2 times, most recently from f9138a3 to 71e3959 Compare January 25, 2024 09:22
@cschuhen cschuhen force-pushed the feature/fdcan_r2 branch 3 times, most recently from 81b5aa5 to af44a5c Compare January 26, 2024 06:58
@Dirbaio
Copy link
Member

Dirbaio commented Jan 27, 2024

bender run

@Dirbaio
Copy link
Member

Dirbaio commented Jan 27, 2024

Looks great overall. and HIL test is passing! :D

It'd be great to extend the HIL test to other chips. Ideally all chips in the HIL test farm that have FDCAN. I can give teleprobe access to anyone interested, so you can run the tests locally without having to wait for CI. Please ping me on Matrix :)


Also, This is more of a general observation than a review comment but: I wonder if long-term it would be better to maintain the bxcan+fdcan drivers in-crate. There's a few negative consequences of using the bxcan/fdcan crates:

  • The fact that fdcan needs different Cargo features per chip. It's not nice to not be able to use the stm32-metapac metadata versions to automatically wire it. Also, we should try to keep Cargo.toml small, because every feature dependency bloats the crates.io index, which recently turned out to cause scaling problems, prompting the crates.io team to set a limit of 300 features. We have ~1500 features, we've been granted an exception but i'd like not to abuse it.
  • The OperatingMode typestates go against the embassy-stm32 api design principles (reduce generics as much as possible), but we can't avoid them without modifying fdcan.
  • The tx/rx halves require 2 lifetime params, while if we rolled our own we could have just one.
  • The bxcan crate seems unmaintained (last commit 2y ago, and jonas-schievink has very publicly exited the Rust scene)

I'm just wondering long term, i'm not saying we should do in-crate CAN drivers now, it'd be a lot of work, i'm OK with these limitations for now, and fdcan gets us working fdcan support now, which is nice.

@cschuhen
Copy link
Author

@Dirbaio Thanks! I appreciate your time reviewing.

Interesting comments on use of fdcan. This does sound like a decent bit of work. My mind is wondering about forking fdcan driver by copy-pasting it into embassy-stm32 and modify from there. Still might mean a fair bit of rework here now to get any reasonable use but do you think that is a reasonable approach? Or fork fdcan crate to make one that works better with stm32-metapac.

@cschuhen
Copy link
Author

I certainly followed your issues with the feature limit. Really, itas a NOOB looking in my first thought is the feature system is broken(or limited). Shouldn't it allow chip selection to use just one feature CHIP=VALUE? Why do features have to be boolean.

I do wonder about the whole split thing... Longer term, I really think we need a buffered fdcan. Especially for read, dropping RX packets is really bad for most use of CAN so we really need the interrupt routine to putting things into a buffer. Often, this is a requirement for TX too.

Could we have internal RX(backed by Channel or PubSubChannel) and TX(backed by Channel) buffers and just hand out senders for TX and receivers for RX? Would this do away with the need for split at all? My biggest issue is specifying the size of buffers, this would either add to features(the way arena size does) or add generic arguments.

@cschuhen cschuhen force-pushed the feature/fdcan_r2 branch 4 times, most recently from 1f52809 to 0890049 Compare January 27, 2024 06:23
@cschuhen
Copy link
Author

cschuhen commented Jan 27, 2024

Okay, so I have pushed updates to the PR:

  1. Remove RefCell
  2. Change Cargo.toml as requested
  3. Includes these that I held off on including yesterday:
    a) Include Embassy timestamps
    b) Read from fifo1 also
    I have left all changes to fdcan.rs as separate commits so you can follow changes after reviewing once. I can still squash them down when you want.

@cschuhen cschuhen force-pushed the feature/fdcan_r2 branch 2 times, most recently from 021da9a to 2bf1d2e Compare January 27, 2024 08:37
@kevswims
Copy link
Contributor

@Dirbaio

Also, This is more of a general observation than a review comment but: I wonder if long-term it would be better to maintain the bxcan+fdcan drivers in-crate. There's a few negative consequences of using the bxcan/fdcan crates:

Agreed that this isn't a problem to solve today but I really like this idea. The drivers for the CAN peripherals are not really that complex and this would make it easier to have a consistent API across the different hardware.

@cschuhen
Copy link
Author

@Dirbaio

Also, This is more of a general observation than a review comment but: I wonder if long-term it would be better to maintain the bxcan+fdcan drivers in-crate. There's a few negative consequences of using the bxcan/fdcan crates:

Agreed that this isn't a problem to solve today but I really like this idea. The drivers for the CAN peripherals are not really that complex and this would make it easier to have a consistent API across the different hardware.

Not to mention that the fdcan crate does not use what it can(excuse pun) from embedded hal like the ID definitions.

@blaa
Copy link

blaa commented Jan 29, 2024

I tested the code during the weekend and in general - it works. I'd not block the merge, but plan for some changes shortly. I'd assume the API not yet stable. ;) Unsure if it's worth a feature-gate.

Mostly:

  1. Get API parity with bxcan (try_write, flush_any, try_read, wait_not_empty)
  2. Add API for adding CAN message filters.
  3. Probably add some error counters that are available through the API. Buffer Overrun for example could be counted - even with additional dedicated buffer overrun can happen. This is simple enough and uses not much additional RAM.
  4. Maybe splitting TX and RX between two separate interrupts?
  5. I found a bug where write().await can block indefinitely: With high contention on BUS with the same CAN ID messages - so a lot of arbitration error. The HW can quit the normal mode and enter BusOff state. Then no further interrupts will push the transmit forward and it will just hang. Below is a patch that could fix that, but we can add it later.
@@ -154,6 +157,21 @@ impl<T: Instance> interrupt::typelevel::Handler<T::IT0Interrupt> for IT0Interrup
 
         let ir = regs.ir().read();
 
+        if ir.bo() {
+            /* We entered/left busoff state. This would block the awaited TX forever. Force retry. */
+            regs.ir().write(|w| w.set_bo(true));
+
+            let cccr = regs.cccr().read();
+            if cccr.init() {
+                // BusOff - so many TX errors that interface got detached.
+                regs.cccr().write(|w| w.set_init(false));
+            }
+            //return;
+        }

(...)

@@ -323,11 +343,15 @@ impl<'d, T: Instance> Fdcan<'d, T, fdcan::ConfigMode> {
         can.enable_interrupt(fdcan::interrupt::Interrupt::RxFifo0NewMsg);
         can.enable_interrupt(fdcan::interrupt::Interrupt::RxFifo1NewMsg);
         can.enable_interrupt(fdcan::interrupt::Interrupt::TxComplete);
+        can.enable_interrupt(fdcan::interrupt::Interrupt::BusOff);

This will cause the HW to wait for 129 Bus Idle states and then retry transmissions. RM0440 entry:

The Bus_Off recovery sequence (see CAN Specification Rev. 2.0 or ISO11898-1) cannot be shortened by setting or resetting CCCR[INIT]. If the device goes Bus_Off, it sets CCCR.INIT of its own, stopping all bus activities. Once CCCR[INIT] has been cleared by the CPU, the device then waits for 129 occurrences of Bus Idle (129 × 11 consecutive recessive bits) before resuming normal operation. At the end of the Bus_Off recovery sequence, the Error Management Counters are reset. During the waiting time after the reset of CCCR[INIT], each time a sequence of 11 recessive bits has been monitored, a Bit0 Error code is written to PSR[LEC], enabling the CPU to readily check up whether the CAN bus is stuck at dominant or continuously disturbed and to monitor the Bus_Off recovery sequence. ECR[REC] is used to count these sequences.

@cschuhen cschuhen force-pushed the feature/fdcan_r2 branch 6 times, most recently from c8c162c to b276b94 Compare January 30, 2024 12:25
Tomasz bla Fortuna and others added 5 commits January 31, 2024 05:40
Author: Torin Cooper-Bennun <[email protected]>

Change from review.
Author: Adam Morgan <[email protected]>

Break definitions out of bxcan that can be used innm fdcan.

Typo
Original author:
    Torin Cooper-Bennun <[email protected]>

Cleanup and documentaion by:
    Tomasz bla Fortuna <[email protected]>
    Corey Schuhen <[email protected]>

Use new PAC method now that the names are common.

Use broken out definitions that can be shared with bxcan

Populate Rx struct with an embassy timestamp.

Remove use of RefCell.

As per review comment. - THis will probably get squashed down.

Fix
Fix examples

Fix examples

Fix examples.
Internal loopback.

fdcan: use common.rs for HIL test.

Fix tests.

Fix tests.

Fix tests

Add HIL tests for H7 even though they are a bit crippled.

CI fixes

Bah

Test

bah
@cschuhen
Copy link
Author

I tested the code during the weekend and in general - it works. I'd not block the merge, but plan for some changes shortly. I'd assume the API not yet stable. ;) Unsure if it's worth a feature-gate.

Mostly:

  1. Get API parity with bxcan (try_write, flush_any, try_read, wait_not_empty)
  2. Add API for adding CAN message filters.
  3. Probably add some error counters that are available through the API. Buffer Overrun for example could be counted - even with additional dedicated buffer overrun can happen. This is simple enough and uses not much additional RAM.
  4. Maybe splitting TX and RX between two separate interrupts?
  5. I found a bug where write().await can block indefinitely: With high contention on BUS with the same CAN ID messages - so a lot of arbitration error. The HW can quit the normal mode and enter BusOff state. Then no further interrupts will push the transmit forward and it will just hang. Below is a patch that could fix that, but we can add it later.

These are my thoughts:

  1. Yes, they can be added later because they shouldn't be 'breaking changes' we are just adding additional API. Users could probably emulate something similar by creating small reader and writer tasks that are backed by Channel.
  2. Can't this already be done using the fdcan create API directly? I kind of touched on some of this in the HIL test. I think to take this a step further we need to analise if we can have a consistent API between fdcan and bxcan.
  3. Yes, it would be good to find a way to notify of this too. We could add it to BusError and return one of those when detected. I'm open to other ideas.
  4. Yes, no issues with that. It should be an internal change.
  5. Yes, we can keep investigating this further as a bug. I think we need to return an error when we get into BusOff state for both readers and writers. If we do a 'restart' like your patch, I think it should be a config option that users can turn off.

@cschuhen
Copy link
Author

Hi @Dirbaio. @blaa and I have discussed and we believe the patch is ready to merge from our point of view.

Copy link
Member

@Dirbaio Dirbaio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's merge then. Thank you everyone who helped make this happen, you're awesome ❤️ 🚀

@Dirbaio Dirbaio added this pull request to the merge queue Jan 30, 2024
Merged via the queue into embassy-rs:main with commit fed8635 Jan 30, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants