Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix missing SETUP DataOut packets #94

Closed

Conversation

haata
Copy link
Contributor

@haata haata commented Jun 4, 2022

  • Missing return ignores less common SETUP packets such as SET_REPORT
    for HID keyboards
  • Simplfied returns
  • Replaced commented out sprintf with defmt version (disabled by
    default)

- Missing return ignores less common SETUP packets such as SET_REPORT
  for HID keyboards
- Simplfied returns
- Replaced commented out sprintf with defmt version (disabled by
  default)
Copy link
Member

@ryan-summers ryan-summers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some questions as to why we may not have originally returned the request.

Also, can you add a CHANGELOG entry?

}

None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on these changes, there's actually no code path that returns None, so there's not much point in using an Option, is there? Why not just return a Result<Request, Error> instead?

Was there originally intent for returning None from this function for valid inbound Requests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll dig through the previous commits to see what the author was intending. But yeah, I'd prefer to return a result too.

From what I've seen so far it's mostly been to mask errors. In general you don't want errors to block USB communication if possible (and just keep going). However, there are situations (like this one) where an error is basically fatal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing to add, this situation isn't very common. Most USB devices don't use the Control endpoint for data communication. In the USB HID spec (I believe) if an OUTPUT endpoint isn't defined for the interface the USB HID lock leds are transmitted through the Control endpoint.
In my experience with USB devices so far, this only applies to keyboards.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going through the commits, I don't think this case has ever been handled correctly. This part of the code has been refactored a lot and I suspect has never been tested. After enough digging, even the original commit has the same issue.
Generally for USB you can ignore something (i.e. STALL) if you don't support it. However, you can't do this for parts of the spec that are required.

989c10d <- Refactor
503ecf7 <- More refactoring
963cc6b <- More refactoring
5db5a89 <- Initial commit

@mvirkkunen
Copy link
Collaborator

mvirkkunen commented Jun 5, 2022

handle_setup returns None for control OUT transfers because the entire transfer hasn't been received yet and it's waiting for data packets, the SETUP request is stored in self.state and will eventually be returned by handle_out. The built-in test class tests for control OUT transfers as well and they do generally work.

Do you have a testcase that fails without these changes and works with them?

@haata
Copy link
Contributor Author

haata commented Jun 5, 2022

handle_setup returns None for control OUT transfers because the entire transfer hasn't been received yet and it's waiting for data packets, the SETUP request is stored in self.state and will eventually be returned by handle_out. The built-in test class tests for control OUT transfers as well and they do generally work.

Do you have a testcase that fails without these changes and works with them?

The problem I have here is during enumeration on Linux (I'm testing with a real device).

The process goes as such (I can provide hardware traces later if that helps)

  1. USB device is enumerated
  2. USB interface is enumerated (a HID keyboard is found)
  3. The Linux HID stack sends SET_REPORT over the control interface for the enumerated interface
  4. handle_setup is called with a 1 byte buffer that indicates the current state of the lock LEDs (e.g. NumLock + CapsLock + ScrollLock)
  • This is the important part. The USB stacks I've worked with so far will optimize this to auto-reply for zero length control data packets.
  • Since this is a 1 byte buffer, we need to give an actual response. With the current API the only way to check again is to check poll, but this won't work in my case as poll is only triggered on USB interrupts (and there are no pending interrupts as we've already read the ControlOut packet and the host is waiting for the reply).
  • https://github.com/twitchyliquid64/usbd-hid/blob/master/src/hid_class.rs#L651 is where the ControlOut packet is handled in my case.

Basically, if there is a real error in response to the ControlOut data packet (e.g. buffer tool small) we should return an error and not hide it and assume all packets are good.
We're also required to respond to Setup packets within a certain amount of time (https://www.beyondlogic.org/usbnutshell/usb6.shtml#SetupPacket) within 500 ms (the host won't send anything while it's waiting).

Looking more carefully at the Option usage. It's ok, but there isn't a good way to indicate buffer overflows (and other internal errors). You just have to know that's why transmissions stop working. I spent a few hours tracing code to figure this (#95) was my issue (tracing live code isn't possible without defmt due to the timing requirements and how slow RTT is for normal messages). But yeah, that's mostly an annoyance.

@stappersg
Copy link
Contributor

Do you have a testcase that fails without these changes and works with them?

The problem I have here is during enumeration on Linux (I'm testing with a real device).

The process goes as such (I can provide hardware traces later if that helps)

....

Please provide a testcase that fails without these changes and works with them.

@mvirkkunen
Copy link
Collaborator

mvirkkunen commented Jun 5, 2022

What does "handle_setup is called with a 1 byte buffer" mean exactly? The SETUP data packet is always 8 bytes long. Possible data will come later, and will be handled by handle_out. Do you have a trace log where handle_setup is reading a packet with a length that's not 8? And the status stage of the control transfer (assuming that what you mean by "we need to give an actual response") is the same whether there is a data stage or not.

Which driver are you using? For what it's worth, this could be due to packets being read in the wrong order because the way it works currently was designed badly (by me).

@haata
Copy link
Contributor Author

haata commented Jun 6, 2022

I'm sorry for the noise, I think something else must be going on. It was 100% reproducible yesterday. I have been working on remote wakeup and dealing with the control buffer size (so something else may have shuffled around).

Will close this for now and come back with a lot more info and traces if I can repro it.

@haata haata closed this Jun 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants