USB methods disable interrupts for too long #397

ianrrees · 2021-02-01T23:24:40Z

I'm working on a SAMD21 based USB device, which does the USB work in the USB ISR, along the lines of the usb_echo example. The USB interrupt is set to a lower priority (higher number) than another interrupt which needs to be serviced fairly quickly, but sometimes that doesn't happen quickly enough. The issue seems to be that UsbBus's methods do all their work in interrupt free contexts. For example, these.

Might it make sense for UsbBus to have an "in use" flag in the mutex rather than the RefCell, so that the interrupts only need to be disabled for setting/checking that flag?

The text was updated successfully, but these errors were encountered:

twitchyliquid64 · 2021-02-01T23:41:32Z

Can you elaborate on 'too long'? We tried to do as little as possible in the UsbBus inner implementation.

The USB interrupt is set to a lower priority (higher number) than another interrupt which needs to be serviced fairly quickly

Depending on how many priority bits your chip has, could you boop the USB to have less priority than your other CSR?

Might it make sense for UsbBus to have an "in use" flag in the mutex rather than the RefCell, so that the interrupts only need to be disabled for setting/checking that flag?

I'm not sure i follow, can you give me an example?

ianrrees · 2021-02-02T02:48:25Z

Can you elaborate on 'too long'?

Maybe more importantly than the specific timing constraint, I was surprised to find that the entirety of read() and write() is in an interrupt free block since they call copy_to_nonoverlapping(). Isochronous endpoints can have up to 1023B per read/write, so there would be a pretty wide range of time that interrupts will be disabled.

Specifically, I've got about a 16us (ed: at 48MHz) window to respond to a DMA interrupt, and I believe that response usually takes about 3us so I should be able to tolerate about 12us of interrupts being disabled. Admittedly I haven't put much effort in to profiling - an early test showed a case where the DMA interrupt wasn't handled when the processor was in the USB ISR, so I went digging to see how that could happen...

could you boop the USB to have less priority than your other CSR

Yep, I'm pretty sure that is the current arrangement, but perhaps got something wrong. Will investigate further.

I'm not sure i follow, can you give me an example?

Sorry, what I wrote was a bit unclear, and thinking about it more I may be misunderstanding the point of the mutex. Will try to catch you on matrix to discuss, but for completeness I was imagining adding an AtomicBool field to UsbBus to indicate that work is being done on the RefCell contents. It might look something like below, where swap is along these lines:

fn write(&self, ep: EndpointAddress, buf: &[u8]) -> UsbResult<usize> {
    if swap(self.busy_flag, true) {
        // self.inner was already in use
        Err(ComputerSaysNo)

    } else {
        // We have atomically set busy_flag, so are the only user of self.inner

        // write() proceeds with interrupts enabled
        let result = self.inner.borrow().write(ep, buf);

        // release the flag
        swap(self.busy_flag, false);

        result
    }
}

twitchyliquid64 · 2021-02-02T04:54:16Z

Ooooh I have an idea. Instead of turning off interrupts globally, what if we use mask and unmask to just turn off the IRQ lines for USB (USB_OTHER, USB_TRCPT0, and USB_TRCPT1on samd5x, just one on d21 iirc).

That way we can be happily preempted by more urgent interrupts while servicing, as long as they dont take too long that we miss a frame.

twitchyliquid64 · 2021-02-02T05:09:14Z

Something like this maybe:

struct UsbCS {
    other_active: bool, // USB_OTHER irq
    trcpt0_active: bool, // USB_TRCPT0 irq
    trcpt1_active: bool, // USB_TRCPT1 irq
}

impl core::ops::Drop for UsbCS {
    fn drop(&mut self) {
        use cortex_m::peripheral::NVIC;
        use crate::target_device::interrupt;
        unsafe {
            if self.other_active {
                NVIC::unmask(interrupt::USB_OTHER);
            }
            if self.trcpt0_active {
                NVIC::unmask(interrupt::USB_TRCPT0);
            }
            if self.trcpt1_active {
                NVIC::unmask(interrupt::USB_TRCPT1);
            }
        }
    }
}

impl UsbCS {
    fn mutex_cs(&self) -> cortex_m::interrupt::CriticalSection {
        unsafe { cortex_m::interrupt::CriticalSection::new() }
    }
}

/// Disables IRQ lines related to USB, returning a token that resets the
/// masking state when dropped.
fn usb_critical_section() -> UsbCS {
    use cortex_m::peripheral::NVIC;
    use crate::target_device::interrupt;

    let cs = UsbCS {
        other_active: NVIC::is_enabled(interrupt::USB_OTHER),
        trcpt0_active: NVIC::is_enabled(interrupt::USB_TRCPT0),
        trcpt1_active: NVIC::is_enabled(interrupt::USB_TRCPT1),
    };

    NVIC::mask(interrupt::USB_OTHER);
    NVIC::mask(interrupt::USB_TRCPT0);
    NVIC::mask(interrupt::USB_TRCPT1);
    cs
}

ianrrees · 2021-02-02T22:05:15Z

That way we can be happily preempted by more urgent interrupts while servicing, as long as they dont take too long that we miss a frame.

Is this the reason we're currently disabling interrupts? If so, it probably hasn't been working well in uses like this

#[interrupt]
fn USB() {
    usb_bus.poll(&mut [a_usb_device]);

    // this is a window for other interrupts

    a_usb_device.do_some_reads_and_writes();  // likely contains windows for other interrupts
}

I guess we'll want a method along the lines of interrupt_free() that uses a usb_critical_section, maybe that should be in UsbBus?

#[interrupt]
fn USB() {
    usb_bus.usb_interrupt_free(|cs| {
        // Would poll(), read(), write(), etc need a reference to cs, the UsbCS?
        usb_bus.poll(&mut [a_usb_device]);
        a_usb_device.do_some_reads_and_writes();
    });
}

twitchyliquid64 · 2021-02-02T22:30:41Z

Is this the reason we're currently disabling interrupts?

I think it was just to prevent us being interrupted by a different USB interrupt while servicing a USB interrupt.
I don't think there is anything wrong with being preempted by a higher priority interrupt as long as its not USB-related, but theres only one way to find out xD

ianrrees · 2021-02-04T20:57:06Z

That way we can be happily preempted by more urgent interrupts while servicing, as long as they dont take too long that we miss a frame.

Thinking about this more - I don't know if it's really what is needed. Imagine there's a GPIO interrupt that triggers a USB write - we wouldn't want it to stomp on an in-progress USB write...

I've just yanked out the interrupt::free(), Mutex, and RefCell - it seems to work this state on SAMD21 (which only has one USB interrupt) without doing any masking, so gets around my immediate problem. This is obviously sketchy though, there still needs to be a mechanism to only allow one "thread" to access the UsbBus at a time.

Since the methods where we had interrupt::free() are implementing usb-device, I guess we can't make them require a CriticalSection. In case of contention, should the second call to a UsbBus method block, or return an error - maybe InvalidState?

twitchyliquid64 · 2021-02-04T22:07:34Z

I've just yanked out the interrupt::free(), Mutex, and RefCell - it seems to work this state on SAMD21 (which only has one USB interrupt) without doing any masking, so gets around my immediate problem

Can you share the diff somewhere? Im curious how you did this

ianrrees · 2021-02-04T22:12:47Z

Sure, ianrrees@b129876 and its parent. It's not pretty :).

(edit: updated link)

ianrrees · 2021-02-05T02:37:33Z

seems to work ... without doing any masking

This was premature, it definitely needs a mechanism to prevent USB interrupts while already processing one.

ianrrees · 2021-02-08T21:29:31Z

I think there's a fundamental issue here, but feel a bit out of my depth.

Imagine that we have an implementation of mutex that allows work on the inner struct without disabling interrupts for the duration of that work. When a method tries to access the inner struct through that mutex, the mutex will either have to block or return an error. Since we don't have an OS, the blocking option doesn't make sense (the interrupted code that holds the lock won't make progress until the interrupting code is done), so the locking attempt must return an error. But, we don't have a way to get that error out of several methods required by usb_device that need exclusive access to the inner struct, for instance reset().

A secondary question is how to handle an interrupt that fails to get the lock.

ianrrees · 2021-03-11T03:32:42Z

Sorry for my lack of follow-through on this, the above hack is working well enough in the meantime ;).

This was premature, it definitely needs a mechanism to prevent USB interrupts while already processing one.

Can't remember what prompted me to write this, but think it might be incorrect; AFAICT any particular ISR on SAMD21 won't ever interrupt itself.

I'm becoming convinced that although the mutex in the UsbBus implementation may be necessary in the current implementation of usb-device, with some hopefully-straightforward changes to usb-device (see rust-embedded-community/usb-device#9) it won't be. Currently, a typical use-case involves multiple structs that refer to the same UsbBus, but if we instead had one struct that owned all the USB stuff, then Rust's ownership guarantees should be sufficient to ensure that only one execution context can access it at a time.

There might still be a mutex required to mediate access to the USB stuff, but it would be easier to implement one that only blocks interrupts for a few instructions, for example.

ianrrees · 2021-03-23T21:37:50Z

dirbaio on Matrix suggested that the stm32-usbd doesn't have this particular issue, perhaps it's possible to make UsbBus and/or Inner not by Sync, so that the user either picks their own mutex or only accesses the USB from one particular context.

bradleyharden · 2021-06-14T23:35:15Z

@ianrrees, did you settle on a solution here? Does this need a change to the HAL or a change upstream?

ianrrees · 2021-06-15T01:17:14Z

Currently I'm using a branch that removes the mutex/critical section (CS), it works but isn't ideal. Some possible ways we could resolve this, all involve changes to the HAL:

usb-device changes along the lines of point 2 at Figure out a better way to keep track of the list of classes rust-embedded-community/usb-device#9 , which I think would then mean the HAL implementation wouldn't need the CS to begin with.
It might be possible to push the CS down closer to the hardware, so that the external API stays the same but there are only a couple instructions at a time in a CS. From memory, this would be a fairly involved refactoring.
Make the mutex implementation configurable via feature flag, and provide one that either takes the mutex or panics. I've found it's really not difficult to only access the USB methods from a USB interrupt context, and would expect that a bad access pattern with a lock/panic mutex would result in a panic pretty quickly. So, this isn't really the cleanest solution, but it might be a good enough "pressure relief valve" for situations where people need to use USB along with lower-latency interrupts.

TDHolmes · 2021-06-15T02:39:21Z

for something along the lines of 3, you can check if you're in interrupt context with this cortex-m function and panic if you're in thread context

Sympatron · 2021-06-15T14:30:44Z

I think option 3 is the best at the moment. Should be fairly simple and that way you either access USB methods from different context safely or get maximum performance at the cost of panics if it's used from multiple contexts (which I think wouldn't be too difficult to avoid). Maybe mutexes should be opt-out to make the safer option the default.

Another option would be to let the user provide a mutex with the current one as the default.

TDHolmes · 2022-04-01T03:28:15Z

@ianrrees is there any update needed to this issue as a result of the recent USB PRs? Or is it the same story

ianrrees · 2022-04-01T03:33:41Z

Recent changes are unrelated, I was thinking about picking this back up though...

Any opposition to approach 3 from a few comments back?

TDHolmes · 2022-04-01T03:58:45Z

Nope, I think it makes good sense

This was referenced Jun 17, 2021

Make Send+Sync impls optional for compatibility with AVR rust-embedded-community/usb-device#70

Closed

Add methods for the USB Start Of Frame interrupt #456

Merged

ianrrees changed the title ~~USB methods disable interrupts for too long(?)~~ USB methods disable interrupts for too long Apr 1, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

USB methods disable interrupts for too long #397

USB methods disable interrupts for too long #397

ianrrees commented Feb 1, 2021

twitchyliquid64 commented Feb 1, 2021

ianrrees commented Feb 2, 2021 •

edited

Loading

twitchyliquid64 commented Feb 2, 2021

twitchyliquid64 commented Feb 2, 2021

ianrrees commented Feb 2, 2021 •

edited

Loading

twitchyliquid64 commented Feb 2, 2021

ianrrees commented Feb 4, 2021 •

edited

Loading

twitchyliquid64 commented Feb 4, 2021

ianrrees commented Feb 4, 2021 •

edited

Loading

ianrrees commented Feb 5, 2021

ianrrees commented Feb 8, 2021

ianrrees commented Mar 11, 2021

ianrrees commented Mar 23, 2021

bradleyharden commented Jun 14, 2021

ianrrees commented Jun 15, 2021

TDHolmes commented Jun 15, 2021 •

edited

Loading

Sympatron commented Jun 15, 2021

TDHolmes commented Apr 1, 2022

ianrrees commented Apr 1, 2022 •

edited

Loading

TDHolmes commented Apr 1, 2022

USB methods disable interrupts for too long #397

USB methods disable interrupts for too long #397

Comments

ianrrees commented Feb 1, 2021

twitchyliquid64 commented Feb 1, 2021

ianrrees commented Feb 2, 2021 • edited Loading

twitchyliquid64 commented Feb 2, 2021

twitchyliquid64 commented Feb 2, 2021

ianrrees commented Feb 2, 2021 • edited Loading

twitchyliquid64 commented Feb 2, 2021

ianrrees commented Feb 4, 2021 • edited Loading

twitchyliquid64 commented Feb 4, 2021

ianrrees commented Feb 4, 2021 • edited Loading

ianrrees commented Feb 5, 2021

ianrrees commented Feb 8, 2021

ianrrees commented Mar 11, 2021

ianrrees commented Mar 23, 2021

bradleyharden commented Jun 14, 2021

ianrrees commented Jun 15, 2021

TDHolmes commented Jun 15, 2021 • edited Loading

Sympatron commented Jun 15, 2021

TDHolmes commented Apr 1, 2022

ianrrees commented Apr 1, 2022 • edited Loading

TDHolmes commented Apr 1, 2022

ianrrees commented Feb 2, 2021 •

edited

Loading

ianrrees commented Feb 2, 2021 •

edited

Loading

ianrrees commented Feb 4, 2021 •

edited

Loading

ianrrees commented Feb 4, 2021 •

edited

Loading

TDHolmes commented Jun 15, 2021 •

edited

Loading

ianrrees commented Apr 1, 2022 •

edited

Loading