Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upstream this driver #2

Closed
pfink-christ opened this issue Sep 24, 2021 · 78 comments
Closed

Upstream this driver #2

pfink-christ opened this issue Sep 24, 2021 · 78 comments

Comments

@pfink-christ
Copy link

pfink-christ commented Sep 24, 2021

We would like to get this driver into mainline linux kernel. Has anybody already done work in that direction? @ericevenchick or somebody else?
I think that it won't be accepted as a new driver as it is right now, because it's largely a copy of what's already there in gs_usb.c.

In my opinion the best way would be to adapt the upstream gs_usb.c in a way that makes it work with the original gs_usb devices and autodetect the newer fd devices. Opinions, thoughts?

I also saw @marckleinebudde and @hartkopp around here having likely the most experience with mainline patch submitting....what do you think? Would you keep it a separate driver?

We also have some improvements ready like @moechr shared the errata bug at linklayer/cantact#19

This would solve #1, too.

@brainstorm
Copy link

We also have some improvements ready like @moechr shared the errata bug at linklayer/cantact#19

Those "improvements" are workarounds, the right place to fix that bug is in https://github.com/linklayer/cantact-pro-fw instead, not on gs_usb, IMO. It'd be nice to see FD support upstreamed, that's for sure.

@moechr
Copy link

moechr commented Sep 24, 2021

the right place to fix that bug is in https://github.com/linklayer/cantact-pro-fw instead

Well, NXPs solution for this errata is to send NAK. If the driver is still sending messages in the format of the length 4 + N*16 and there is no change in the firmware the problem would still exist.
We made the decision as a reason of performance to add 1 byte to the host frame instead of sending a NAK transfer without any informational data.

There are other things to even worry more about in the firmware. Take a closer look at the can.c RAM assignment. One RX FIFO element uses 64 bytes data + 8 bytes of additional RAM per element = 72 bytes
72 bytes * 16 elements = 0x480 bytes
The difference between CAN0_TX_BUFFER_OFS = 0x20 and CAN1_STD_FILTER_OFS = 0x100 is only 0x80 ...

@brainstorm
Copy link

brainstorm commented Sep 25, 2021

Ah, I see, thanks for explaining @moechr, much appreciated!

I've been quite disappointed by the overall CANtact "Pro" quality lately. I guess what upsets me the most is @ericevenchick not showing up for support after the @crowdsupply campaign (/cc @storborg @lifton?)... despite multiple bugs piling up for userspace tooling, documentation and firmware sides for the affected cantact repositories.

At this point I wouldn't recommend this dongle for any production use until the firmware, tooling and docs don't see thorough testing, bug squashing and benchmarking so that folks know the actual limits of this thing... it's definitely not worth 140eur at this point in time, IMO.

@moechr
Copy link

moechr commented Sep 25, 2021

@brainstorm I totally agree with you. Simple "loop-tests" as a verification of the performance of a CAN coupler are way to less.
The big ones like PEAK Systems perform a whole bunch of tests on their CAN products.

There is a lot to to on the Cantact Pro project. As @ericevenchick mentioned - his CAN products are for hacking - maybe the hacking starts on the product itself ;)

@BennyEvans
Copy link

@pfink-christ as discussed, I've moved our discussion here (responding to your comment in #3 )

The candlelight firmware support is already there so why not add further support for a candlelight version with FD. I don't see a problem with that. The alternative would be huge code duplication or splitting the driver in a gs_usb_common.c and two separate versions of gs_usb.c /gs_usb_fd.c where only the differences are handled. That's another possible solution I have to give some additional thought, because the current status is not perfect, still hacky and not quite mainline quality yet (also comfirmed by hartkopp

Yep, this makes sense, I agree. As for the hack, I assume you're talking about u8 errata_dummy; added to the gs_host_frame struct to fix the NXP errata issue. Bit of a funny story - when first starting to develop my CAN bus hardware I had chosen the LPC54628. After a month or so I got to the point of stress testing the USB and quickly found this issue. I spent a few weeks trying to fix it but after that issue and a few others in the errata I ended up changing to an STM32H7. Sucks to see people creating similar devices running into the same issues. I'll go back over my notes and see where I got to with that issue as it'd be much nicer to solve the issue in the device firmware rather than the gs_usb driver.

What's this candlelight group you both talked about? I don't think there is actually a group in the sense that they sit together and discuss what to do next. I guess there is a maintainer, a small number of contributers and a bunch of users. But let's find out and ask. I guess that might be the main repository? -> https://github.com/candle-usb/candleLight_fw So Hubert Denkmair seems to be the initial author and fenugrec seems to currently maintain the repository.
We could add a comment here that bit 8 is used as FD feature flag or something like that, then it would be documented and nobody could complain -> https://github.com/candle-usb/candleLight_fw/blob/e6b972441b767b8952d9cd8c78cd74a43be091a9/src/usbd_gs_can.c#L267

Yeah spot on, I believe https://github.com/candle-usb/candleLight_fw is the main repository. I'll try to contact them over the next week or so in an attempt to get them involved in this. I think that as long as they're aware of it, we'll be fine. As far as discussion goes, can we open up a discussion board on git (https://docs.github.com/en/discussions/quickstart) even if that's in your repo? Or would you prefer to keep discussion here?

Above all testing is always welcome. But I think first priority should be to find another way of implementing the "switch" between classic can and FD. Because that is the one thing currently keeping me from sending the patches to ML for review and my time is rather limited at the moment.

I've got the same problem with time at the moment but will try to take a look at this over the weekend. I'll fork your repo, add my VID/PID, tidy a couple of things up and will submit a PR. Then we can discuss further on how to proceed.

Cheers

@marckleinebudde
Copy link

Yeah spot on, I believe https://github.com/candle-usb/candleLight_fw is the main repository. I'll try to contact them over the next week or so in an attempt to get them involved in this.

The easiest would be to create a PR where you add the various bits for CAN_FD support.

@pfink-christ
Copy link
Author

Yep, this makes sense, I agree. As for the hack, I assume you're talking about u8 errata_dummy; added to the gs_host_frame struct to fix the NXP errata issue.

No, I'm talking about the length of the transmitted gs_host_frame. Because the data part has to have

  • 8 data bytes in case of classic can device
  • 64 data bytes in case of can fd device
  • 65 (64 data bytes + 1 byte to circumvent data corruption on usb) in case of fd with the affected nxp mcu(s).

In my previous implementation the struct had the maximum size and only the necessary length was transmitted via usb. The struct was cut off so to speak. That was the hacky part where I wanted to find a better solution.

it'd be much nicer to solve the issue in the device firmware rather than the gs_usb driver.

USB high-speed device in endpoint TX data corruption on NXP LPC546xx controllers (Errata sheet LPC546xx / USB.15)
This corruption occurs when the following conditions are met:
-> A TX (IN) transfer happens after a RX (OUT) transfer.
-> The RX (OUT) transfer length is 4 + N*16 (N>=0) bytes.

IMHO not easily possible. The suggested workaround by NXP would require to add some usb rx/tx control and detection when to send a NAKed TX transfer. I don't think that's useful. Sending one byte more in this case isn't too hard on the performance, whereas NAKing a transfer and causing complete dummy transfers would sum up in a bigger total performance dip. At least that is how I see it in theory. As a benefit it would keep the complexity on the firmware and kernel driver side as it is.

I reworked the code and found a - in my eyes - quite good solution without changing too much.

@marckleinebudde Could you help me out on this macro here? Do I really need to supply data[0] to sizeof_field, which looks a bit weird here or is there another clever kernel macro that I missed to calculate the length of such a flex array? The compiler doesn't complain anymore, but I have no access to the hardware right now. I will test my reworked code on Friday, then I'll be able to share more.

struct gs_host_frame {
	u32 echo_id;
	__le32 can_id;

	u8 can_dlc;
	u8 channel;
	u8 flags;
	u8 reserved;

	DECLARE_FLEX_ARRAY(u8, data);
} __packed;

#define GS_HOSTFRAME_SIZE(data_len) (sizeof(struct gs_host_frame) + sizeof_field(struct gs_host_frame, data[0]) * data_len)

The easiest would be to create a PR where you add the various bits for CAN_FD support.

That was also my line of thinking. Should we also list/define the feature bits in the kernel driver even if not used? 6 and 7 are not used at the moment but are defined in the candlelight firmware. Currently I had them removed, but I guess it wouldn't hurt adding them back.

@pfink-christ
Copy link
Author

pfink-christ commented Nov 24, 2021

after that issue and a few others in the errata

@BennyEvans are there any other errata relevant for us (NXP LPC... users) to additionally keep in mind or were they only relevant on your custom hardware? Because I don't think we are aware of any others.
And right now would be a good time to workaround all existing and known bugs and kinks.

@marckleinebudde
Copy link

That was also my line of thinking. Should we also list/define the feature bits in the kernel driver even if not used? 6 and 7 are not used at the moment but are defined in the candlelight firmware. Currently I had them removed, but I guess it wouldn't hurt adding them back.

I have already some patches that add these bits to the driver. Are you subscribed to the linux-can mailing list? I'll send these patches around.

@marckleinebudde
Copy link

@marckleinebudde Could you help me out on this macro here? Do I really need to supply data[0] to sizeof_field, which looks a bit weird here or is there another clever kernel macro that I missed to calculate the length of such a flex array? The compiler doesn't complain anymore, but I have no access to the hardware right now. I will test my reworked code on Friday, then I'll be able to share more.

Have a look at struct_size https://elixir.bootlin.com/linux/v5.15/source/include/linux/overflow.h#L192

@pfink-christ
Copy link
Author

pfink-christ commented Nov 25, 2021

I have already some patches that add these bits to the driver. Are you subscribed to the linux-can mailing list? I'll send these patches around.

Yes, I am subscribed.

Have a look at struct_size https://elixir.bootlin.com/linux/v5.15/source/include/linux/overflow.h#L192

Thanks. That works for the instances where I have a pointer to gs_host_frame available, but not in gs_can_open, where gs_host_frame is not used, but only its size is relevant e.g. for allocation of usb rx buffer with usb_alloc_coherent. I could add a pointer to gs_host_frame, but it would only be used to calculate the size with struct_size.

old: #define GS_HOSTFRAME_SIZE(data_len) (sizeof(struct gs_host_frame) + sizeof_field(struct gs_host_frame, data[0]) * data_len)
new: #define GS_HOSTFRAME_SIZE(p, data_len) struct_size(p, data, data_len)
Apart from that the macro utilizing struct_size needs two parameters. I think I will rather discard GS_HOSTFRAME_SIZE macro altogether and use struct_size directly in every occurrence (7 in total), because it's saving only one parameter and not really improving readability.

@marckleinebudde
Copy link

Thanks. That works for the instances where I have a pointer to gs_host_frame available, but not in gs_can_open, where gs_host_frame is not used, but only its size is relevant e.g. for allocation of usb rx buffer with usb_alloc_coherent. I could add a pointer to gs_host_frame, but it would only be used to calculate the size with struct_size.

Make it so.

old: #define GS_HOSTFRAME_SIZE(data_len) (sizeof(struct gs_host_frame) + sizeof_field(struct gs_host_frame, data[0]) * data_len)
new: #define GS_HOSTFRAME_SIZE(p, data_len) struct_size(p, data, data_len)
Apart from that the macro utilizing struct_size needs two parameters. I think I will rather discard GS_HOSTFRAME_SIZE macro altogether and use struct_size directly in every occurrence (7 in total), because it's saving only one parameter and not really improving readability.

Sounds good.

@BennyEvans
Copy link

after that issue and a few others in the errata

@BennyEvans are there any other errata relevant for us (NXP LPC... users) to additionally keep in mind or were they only relevant on your custom hardware? Because I don't think we are aware of any others. And right now would be a good time to workaround all existing and known bugs and kinks.

I've just gone over my notes. Errata "3.22 USB.15: USB high-speed device in endpoint TX data corruption" is the only relevant issue that I'm aware of.

@BennyEvans
Copy link

@pfink-christ may you please provide a link to the repository you're working out of? I submitted a PR to the branch and repository I thought you were working out of but you then deleted the branch about 10 minutes after which closed the PR. Just want to all be on the same page here.

@BennyEvans
Copy link

BennyEvans commented Nov 26, 2021

Hi all,

I've created a fork of this repo and implemented compatibility in the driver for both existing classic CAN devices and newer FD devices. I'd appreciate if you could have a look over it and test the functionality. Unfortunately, I don't currently have access to an FD device so I have only confirmed that's it's working with a Canable device running the candlelight firmware.

Link to the repo: https://github.com/BennyEvans/gs_usb_fd

I took @pfink-christ version when it was available and have updated the gs_device_bt_const and gs_host_frame structs to be more compatible with both Classic and FD CAN simultaneously. The bit timing request now initially request an original classic can structure (FD devices will just truncate and send a classic structure) and only after checking for FD capability after the initial timing request will it request the full FD bit timing structure. I've also removed the VID gates that were in @pfink-christ revision and now if a device reports FD capability, the driver will treat it as an FD device.

@marckleinebudde as you've had some experience modifying this driver, I'd appreciate any feedback.

@pfink-christ Apologies if you were working out of another repository. If you were and can share it, perhaps we can analyse each other's versions and move forward towards something we can submit to mainline. I've kept both our devices VID/PID in my version.

Edit: Feel free to submit any PRs.

Cheers

@marckleinebudde
Copy link

marckleinebudde commented Nov 26, 2021

Hey @BennyEvans,

@marckleinebudde as you've had some experience modifying this driver, I'd appreciate any feedback.

Make sure to have a clean patch stack. Only change one thing at a time. Feel free to use https://github.com/marckleinebudde/linux/commits/gs_usb_fd as a base. It already adds the missing bit definitions to the driver. Speaking of clean patches, I myself have to add proper patch descriptions to all patches.

@BennyEvans
Copy link

Thanks @marckleinebudde

I've not pushed code into mainline before so it's all a bit of a learning curve. I've forked the repository you've linked and will commit changes to that in smaller chunks tomorrow. Is this what's expected?

However, this being said, I'll still require someone to test my version of the driver with a CAN FD device as I currently don't have access to one.

Cheers

@marckleinebudde
Copy link

I've not pushed code into mainline before so it's all a bit of a learning curve. I've forked the repository you've linked and will commit changes to that in smaller chunks tomorrow. Is this what's expected?

ACK

However, this being said, I'll still require someone to test my version of the driver with a CAN FD device as I currently don't have access to one.

I have a board here.

@pfink-christ
Copy link
Author

pfink-christ commented Nov 26, 2021

@BennyEvans my current working repository is not online because I want to test my code properly before I publish something. That's also the reason I deleted my old working branch, so that nobody uses or works on this not anymore up to date codebase.
I don't quite understand why you want to do the same work again that I already did. Either you do it or me. Not both please. I suggest I will upload my code later today, then you can try to improve or add things.

Uff... @marckleinebudde I rebased my changes on your repository, but I realized that your commit marckleinebudde/linux@86f7bf6, which adds the timestamp after the data field in the struct, kinda killed my whole idea of the flex array. What do you think about the idea of including the timestamp in the data array or is something like the following better? The size calculation would have to be adapted as well (decrease the flex array length by the size of the struct classic_can_data). This would assume there is no timestap data after the 64 data bytes in the CAN FD case or it would get ugly again :D

struct gs_host_frame {
        u32 echo_id;
        __le32 can_id;

        u8 can_dlc;
        u8 channel;
        u8 flags;
        u8 reserved;

        union {
                struct classic_can_data {
                        u8 data_c8[8];
                        u32 timestamp_us;
                } __packed;
                DECLARE_FLEX_ARRAY(u8, data);
        } __packed;
} __packed;

@marckleinebudde
Copy link

marckleinebudde commented Nov 26, 2021

Although time stamps are not supported in the Linux driver yet, we should keep this in mind.

What do you think about the idea of including the timestamp in the data array or is something like the following better?

I'd like to have the timestamp as a separate u32. This looks good, given the compiler likes this :)

struct gs_host_frame {
        u32 echo_id;
        __le32 can_id;

        u8 can_dlc;
        u8 channel;
        u8 flags;
        u8 reserved;

        union {
                struct {

I think you can use an anonymous struct.

                        u8 data_cc[8];

Let's call it data_cc for Classical CAN.

                        u32 timestamp_us;
                } __packed;
                DECLARE_FLEX_ARRAY(u8, data);
        } __packed;
} __packed;

@BennyEvans
Copy link

@BennyEvans my current working repository is not online because I want to test my code properly before I publish something. That's also the reason I deleted my old working branch, so that nobody uses or works on this not anymore up to date codebase. I don't quite understand why you want to do the same work again that I already did. Either you do it or me. Not both please. I suggest I will upload my code later today, then you can try to improve or add things.

@pfink-christ no problem, don't want to step on any toes, just trying to help. I'll wait until you've made your code public before proceeding any further. Thanks for your work on all this. Please keep me updated.

@brainstorm
Copy link

brainstorm commented Nov 26, 2021

Hi everybody, I hope this doesn't come across as offtopic, but after relatively long intensive runs of gs_usb_fd with CANtact Pro, I'm getting those OOM messages fairly consistently (fails at around the 4-5th day):

[632928.354876] gs_usb_fd 1-1.3:1.0: Configuring for 2 interfaces
[632928.394207] usbcore: registered new interface driver gs_usb_fd
[632929.568473] cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
[632929.568538] gs_usb_fd 1-1.3:1.0 can0: No memory left for USB buffer
[632940.440915] usbcore: deregistering interface driver gs_usb_fd
[632940.804206] gs_usb_fd 1-1.3:1.0: Configuring for 2 interfaces
[632940.829411] usbcore: registered new interface driver gs_usb_fd
[632942.007201] cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
[632942.007273] gs_usb_fd 1-1.3:1.0 can0: No memory left for USB buffer

Possible driver memory leak?

$ free
              total        used        free      shared  buff/cache   available
Mem:         440412      103808       29892       22432      306712      260840
Swap:        102396         256      102140

$ cat /proc/cpuinfo
processor	: 0
model name	: ARMv6-compatible processor rev 7 (v6l)
BogoMIPS	: 697.95
Features	: half thumb fastmult vfp edsp java tls
CPU implementer	: 0x41
CPU architecture: 7
CPU variant	: 0x0
CPU part	: 0xb76
CPU revision	: 7

Hardware	: BCM2835
Revision	: 000e
Serial		: 000000007c696cc1
Model		: Raspberry Pi Model B Rev 2

$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/root        15G  6.7G  7.1G  49% /
devtmpfs        183M     0  183M   0% /dev
tmpfs           216M     0  216M   0% /dev/shm
tmpfs           216M   22M  194M  11% /run
tmpfs           5.0M  4.0K  5.0M   1% /run/lock
tmpfs           216M     0  216M   0% /sys/fs/cgroup
/dev/mmcblk0p1  253M   49M  204M  20% /boot
tmpfs            44M     0   44M   0% /run/user/1000

$ uname -a
Linux raspberrypi 5.10.17+ #1421 Thu May 27 13:58:02 BST 2021 armv6l GNU/Linux

How could I debug this further?

Restarting the interface is ineffective, it needs a full reboot to be back into working shape.

@marckleinebudde
Copy link

@brainstorm
Copy link

brainstorm commented Nov 26, 2021

you can try to increase the CMA size: https://elixir.bootlin.com/linux/v5.15/source/Documentation/admin-guide/kernel-parameters.txt#L638

Thanks! What'd you put as a reasonable value given the hardware outlined?

@marckleinebudde
Copy link

Look at your kernel log and search for CMA, double that size:

dmesg | grep -i cma

@brainstorm
Copy link

Thanks! That explains things, before rebooting:

$ dmesg | grep -i cma
[633678.406776] cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
[633690.968344] cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
[633703.442953] cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
[633715.949471] cma: cma_alloc: alloc failed, req-size: 1 pages, ret: -12
(...)

After reboot:

$ dmesg | grep cma
[    0.000000] OF: reserved mem: initialized node linux,cma, compatible id shared-dma-pool
[    0.000000] Memory: 374456K/458752K available (8638K kernel code, 1324K rwdata, 2816K rodata, 420K init, 837K bss, 18760K reserved, 65536K cma-reserved)
[   19.635959] vc_sm_cma: module is from the staging directory, the quality is unknown, you have been warned.
[   19.657850] bcm2835_vc_sm_cma_probe: Videocore shared memory driver

So I bumped it to cma=128M via /boot/cmdline.txt:

[    0.000000] Kernel command line: coherent_pool=1M snd_bcm2835.enable_compat_alsa=0 snd_bcm2835.enable_hdmi=1 bcm2708_fb.fbwidth=640 bcm2708_fb.fbheight=480 bcm2708_fb.fbswap=1 vc_mem.mem_base=0x1ec00000 vc_mem.mem_size=0x20000000  console=ttyAMA0,115200 console=tty1 root=PARTUUID=4c9fbae7-02 rootfstype=ext4 elevator=deadline fsck.repair=yes cma=128M rootwait
[    0.000000] Memory: 308792K/458752K available (8638K kernel code, 1324K rwdata, 2816K rodata, 420K init, 837K bss, 18888K reserved, 131072K cma-reserved)

I'll report back if it fails again (not entirely sure why gs_usb_fd would require more than 64MB contiguous alloc though?), thanks for the suggestion in any case ;)

@marckleinebudde
Copy link

I think the memory gets fragmented over time....

@pfink-christ
Copy link
Author

I'd like to have the timestamp as a separate u32. This looks good, given the compiler likes this :)

My compiler is currently fine with it, but the setup from my colleague had some issues with DECLARE_FLEX_ARRAY on Friday. I think some additional tests would be good once the last issue below is cleared.

I think you can use an anonymous struct.

The idea was to be able to do something like this:

struct gs_host_frame {
        u32 echo_id;
        __le32 can_id;

        u8 can_dlc;
        u8 channel;
        u8 flags;
        u8 reserved;

        union {
                struct classic_can {
                        u8 data_cc[8];
                        u32 timestamp_us;
                } __packed;
                DECLARE_FLEX_ARRAY(u8, data);
        } __packed;
} __packed;

...

gs_hf_size = struct_size(hf, data, hf_data_len) - sizeof(struct classic_can);

Because now struct_size reports size = 88 when hf_data_len = 64. If my math is correct it's exactly sizeof(struct classic can) = 12 too long. Using a macro again would be a practical solution or reducing hf_data_len by 12, which I don't like for obvious reasons.

@marckleinebudde
Copy link

Use pahole /path/to/gs_usb.o to find out how the compiler placed the struct in memory.

@pfink-christ
Copy link
Author

Thanks for your work. I've done some changes

You are welcome. =) ...quite some more changes 😄 Awesome!

We tried to reproduce your setup, where you are losing frames, but not even CANtact Pro lost frames with these commands in our setup (I was told that still the original CANtact Pro firmware is on our device):

ip link set can0 type can bitrate 1000000 dbitrate 4000000 fd on
cangen can0 -Di -L1 -I2 -p10 -g0

(Receiving system with peak usb can fd adapter):
cansequence -rv can0

What bitrate did you use? When omitting -p10 we could exhaust the buffer and cangen aborts (write: No buffer space available), but apart from that no lost frames were reported.

I missed that yesterday evening at first and it seems you pulled it faster than I was able to force push my reword (or you changed it back? - I looked it up, because I wasn't sure myself) -> marckleinebudde/linux@f2c7dab "an union" -> "a union"

as you've added hardware for your employer's hardware, what about adding you to the MAINTAINERS file in the kernel. This means when a patch targets this driver, you'll get an email to review and test the changes.

I have no objections. You can add me or I can do it myself in a follow up commit.

  • rebased to latest net-next/master + my commits with enhanced patch descriptions
  • removed struct classic_can_ts as it's not implemented yet
  • renamed GS_CAN_FEATURE_REQ_USB_QUIRK to GS_CAN_FEATURE_REQ_USB_QUIRK_LPC546XX

Works for me.

  • don't use data[65] in struct canfd_quirk but an extra u8 dummy

Makes sense. I think this survived from a former implementation before the union and I was too focused on changing as little as possible.

  • renamed struct gs_can::gs_hf_size to gs_can::hf_size

ACK

  • calculate gs_can::hf_size based of whether CAN-FD is requested by user or not

Wasn't completely aware of classic_can being used if the device supports CAN-FD, but you are of course right.

  • added a struct classic_can_quirk

I thought about it and because we didn't see any corruption there I later forgot about that possibility. Or maybe we were just focused too much on testing FD, that we did't notice it. Thanks for adding it.

  • don't use union gs_device_bt_const but separate structs

Again remains from the previous implementation cycle where this was necessary. Sorry for not cleaning up properly.

  • make request/setting of GS_CAN_FEATURE_BT_CONST_EXT dependent on GS_CAN_FEATURE_FD

ACK

  • assign missing name to data_bt_const

ACK

  • only assign data_bittiming_const and do_set_data_bittiming if GS_CAN_FEATURE_FD

ACK

Tested and Signed-off-by me :)

Seems like @ericevenchick has the last word then.

@marckleinebudde
Copy link

I missed that yesterday evening at first and it seems you pulled it faster than I was able to force push my reword (or you changed it back? - I looked it up, because I wasn't sure myself) -> marckleinebudde/linux@f2c7dab "an union" -> "a union"

Yes I "fixed" that, but just found out, that "a union" is correct. Will fix.

Seems like @ericevenchick has the last word then.

Let's give him some time. He has modified the original GPL driver, so the https://github.com/linklayer/gs_usb_fd driver is GPL'ed as well. He even posted a link to the driver on the linux-can ML https://lore.kernel.org/all/[email protected], but didn't include a S-o-b. 😞

We tried to reproduce your setup, where you are losing frames, but not even CANtact Pro lost frames with these commands in our setup (I was told that still the original CANtact Pro firmware is on our device):

ip link set can0 type can bitrate 1000000 dbitrate 4000000 fd on
cangen can0 -Di -L1 -I2 -p10 -g0

(Receiving system with peak usb can fd adapter):

cansequence -rv can0

What bitrate did you use?

500 kbit/s on a imx6 single core

When omitting -p10 we could exhaust the buffer and cangen aborts (write: No buffer space available), but apart from that no lost frames were reported.

The no buffer space available error is expected to happen....I'll test on other SoCs with other USB host controllers.

@BennyEvans
Copy link

Thanks for all of your work on this @pfink-christ and @marckleinebudde

@pfink-christ
Copy link
Author

500 kbit/s on a imx6 single core
The no buffer space available error is expected to happen....I'll test on other SoCs with other USB host controllers.

We actually tested on a dual core i.MX6DL where we didn't see any lost packages, but maybe we missed something else in our setup. 🤷

@ericevenchick
Copy link
Contributor

Hi all. Thanks for all the work on this. I've been away from this project for some personal reasons but it's great to see the work done here!

Patch is signed-off-by me too.

marckleinebudde added a commit to marckleinebudde/candleLight_fw that referenced this issue Dec 30, 2021
The CANtact FD and other devices implement a gs-usb compatible USB
protocol. The protocol has been extended to support CAN-FD and the
changes on the Linux gs_usb driver will be mainlined soon.

This patch adds the new GS_CAN_MODE_*, GS_CAN_FEATURE_* and
GS_CAN_FLAG_* bits as well as struct gs_host_frame_canfd to the
candlelight driver.

This is mainly for documentation purpose, as the STM32F042 and
STM32F072 don't support CAN-FD. But there are some ports to CAN-FD
capable STM32 µC that can make use of these definitions.

[1] linklayer/gs_usb_fd#2
@marckleinebudde
Copy link

marckleinebudde commented Dec 30, 2021

Hello @BennyEvans,

Yeah spot on, I believe https://github.com/candle-usb/candleLight_fw is the main repository. I'll try to contact them over the next week or so in an attempt to get them involved in this. I think that as long as they're aware of it, we'll be fine. As far as discussion goes, can we open up a discussion board on git (https://docs.github.com/en/discussions/quickstart) even if that's in your repo? Or would you prefer to keep discussion here?

I've send a PR that adds the CAN-FD related bits to the candlelight fw: candle-usb/candleLight_fw#79

@fenugrec
Copy link

Hi , just dropping in to introduce myself - I'm indeed the de-facto maintainer of candleligh_fw (Hubert "gave me the keys" and has not been involved recently). This seems like a low-impact change for our firmware (the stm32f042 / f072 we target do not even support CAN-FD) and have no objection to cooperating. The less quirks and special cases in the mainline kernel driver, the better. We've had some users doing some isolated work on FD (e.g. candle-usb/candleLight_fw#21 ) but not sure what came of it.

@marckleinebudde
Copy link

Hey,

I just found an overlap in the USB request ids:

enum gs_usb_breq {
	GS_USB_BREQ_HOST_FORMAT = 0,
	GS_USB_BREQ_BITTIMING,
	GS_USB_BREQ_MODE,
	GS_USB_BREQ_BERR,
	GS_USB_BREQ_BT_CONST,
	GS_USB_BREQ_DEVICE_CONFIG,
	GS_USB_BREQ_TIMESTAMP,
	GS_USB_BREQ_IDENTIFY,
	GS_USB_BREQ_DATA_BITTIMING, // == GS_USB_BREQ_GET_USER_ID
	GS_USB_BREQ_BT_CONST_EXT,   // == GS_USB_BREQ_SET_USER_ID
};

As the GS_USB_BREQ_{GET,SET}_USER_ID are older, we have to move the GS_USB_BREQ_DATA_BITTIMING and GS_USB_BREQ_BT_CONST_EXT to a new value.

As the CANtact Pro is out there I'll add a workaround for the GS_USB_BREQ_DATA_BITTIMING. I'll decide on the manufacturer and product strings.

Do we need workarounds for other devices and/or GS_USB_BREQ_BT_CONST_EXT, too? How should we detect if we need to enable these workarounds?

regards,
Marc

marckleinebudde added a commit to marckleinebudde/candleLight_fw that referenced this issue Dec 30, 2021
The CANtact FD and other devices implement a gs-usb compatible USB
protocol. The protocol has been extended to support CAN-FD and the
changes on the Linux gs_usb driver will be mainlined soon.

This patch adds the new GS_CAN_MODE_*, GS_CAN_FEATURE_*,
GS_USB_BREQ_DATA_BITTIMING, and GS_CAN_FLAG_* bits as well as struct
gs_host_frame_canfd to the candlelight driver.

This is mainly for documentation purpose, as the STM32F042 and
STM32F072 don't support CAN-FD. But there are some ports to CAN-FD
capable STM32 µC that can make use of these definitions.

[1] linklayer/gs_usb_fd#2
@BennyEvans
Copy link

Hello @BennyEvans,

Yeah spot on, I believe https://github.com/candle-usb/candleLight_fw is the main repository. I'll try to contact them over the next week or so in an attempt to get them involved in this. I think that as long as they're aware of it, we'll be fine. As far as discussion goes, can we open up a discussion board on git (https://docs.github.com/en/discussions/quickstart) even if that's in your repo? Or would you prefer to keep discussion here?

I've send a PR that adds the CAN-FD related bits to the candlelight fw: candle-usb/candleLight_fw#79

Apologies @marckleinebudde, I had completely forgotten about doing this. Thanks for taking the lead on it.

As the CANtact Pro is out there I'll add a workaround for the GS_USB_BREQ_DATA_BITTIMING. I'll decide on the manufacturer and product strings.

Do we need workarounds for other devices and/or GS_USB_BREQ_BT_CONST_EXT, too? How should we detect if we need to enable these workarounds?

I believe the CANtact Pro is currently the only production device using this driver for CAN FD and thus, no other production device should be using the GS_USB_BREQ_DATA_BITTIMING feature. I think it would be safe to add a workaround for just that device. The GS_USB_BREQ_BT_CONST_EXT feature is something we've introduced so would only be an issue moving it if @pfink-christ has an objection with his hardware. Might be best to wait for his input.

Cheers,
Ben

@marckleinebudde
Copy link

marckleinebudde commented Dec 30, 2021

The GS_USB_BREQ_BT_CONST_EXT feature is something we've introduced so would only be an issue moving it if @pfink-christ has an objection with his hardware. Might be best to wait for his input.

Makes sense.

@pfink-christ
Copy link
Author

First of all - happy new year to all 🎉🧨💥 sorry for being a bit late....didn't check my business mails any earlier.

As the GS_USB_BREQ_{GET,SET}_USER_ID are older, we have to move the GS_USB_BREQ_DATA_BITTIMING and GS_USB_BREQ_BT_CONST_EXT to a new value.

Thanks @marckleinebudde for your scrutiny! No objections from my side, as we can still move the bits around in our firmware to match the driver. We currently only have a low number of test devices around, where updating the firmware will be no problem at all and become necessary anyway when mainline support is there.

fenugrec pushed a commit to fenugrec/candleLight_fw that referenced this issue Jan 12, 2022
The CANtact FD and other devices implement a gs-usb compatible USB
protocol. The protocol has been extended to support CAN-FD and the
changes on the Linux gs_usb driver will be mainlined soon.

This patch adds the new GS_CAN_MODE_*, GS_CAN_FEATURE_*,
GS_USB_BREQ_DATA_BITTIMING, and GS_CAN_FLAG_* bits as well as struct
gs_host_frame_canfd to the candlelight driver.

This is mainly for documentation purpose, as the STM32F042 and
STM32F072 don't support CAN-FD. But there are some ports to CAN-FD
capable STM32 µC that can make use of these definitions.

[1] linklayer/gs_usb_fd#2
@pfink-christ
Copy link
Author

@marckleinebudde your latest version is currently not online somewhere, is it?

I guess you are going to send the patches upstream when you are ready and when you have time for it? Or are there still open issues that I could help with? Or maybe some more testing,...?

@marckleinebudde
Copy link

marckleinebudde commented Jan 26, 2022

Hey @pfink-christ,

I just pushed my latest work-in-progress to: https://github.com/marckleinebudde/linux/tree/gs-usb-fd

EDIT: the driver is now on it's way to upstream.

I had to split the hf_size into hf_size_rx and hf_size_tx, as the TX buffers are used for all interfaces. This makes problems if you start one interface in CAN-FD mode and the other in classical CAN mode. And I had to implement the GS_USB_BREQ_QUIRK_OVERLAP_DATA_BITTIMING for the CANtact Pro: marckleinebudde/linux@2c19410
I still have to clean up the URB error handling.

So testing would be good.

marckleinebudde pushed a commit to marckleinebudde/linux that referenced this issue Feb 3, 2022
CANtact Pro from Linklayer is the first gs_usb compatible device
supporting CAN-FD with a different HW and re-written candlelight FW.

Support for CAN-FD is indicated by the device setting the
GS_CAN_FEATURE_FD flag. CAN-FD support is requested by the driver with
the GS_CAN_MODE_FD flag. The CAN-FD specific data bit timing
parameters are set with the GS_USB_BREQ_DATA_BITTIMING control
message.

This patch is based on the Eric Evenchick's gs_usb_fd driver (which
itself is a fork of gs_usb). The gs_usb_fd code base was reintegrated
into the gs_usb driver, and reworked to not break the existing
classical-CAN only hardware.

Link: linklayer/gs_usb_fd#2
Co-developed-by: Eric Evenchick <[email protected]>
Signed-off-by: Eric Evenchick <[email protected]>
Signed-off-by: Peter Fink <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
marckleinebudde pushed a commit to marckleinebudde/linux that referenced this issue Feb 4, 2022
CANtact Pro from Linklayer is the first gs_usb compatible device
supporting CAN-FD with a different HW and re-written candlelight FW.

Support for CAN-FD is indicated by the device setting the
GS_CAN_FEATURE_FD flag. CAN-FD support is requested by the driver with
the GS_CAN_MODE_FD flag. The CAN-FD specific data bit timing
parameters are set with the GS_USB_BREQ_DATA_BITTIMING control
message.

This patch is based on the Eric Evenchick's gs_usb_fd driver (which
itself is a fork of gs_usb). The gs_usb_fd code base was reintegrated
into the gs_usb driver, and reworked to not break the existing
classical-CAN only hardware.

Link: linklayer/gs_usb_fd#2
Co-developed-by: Eric Evenchick <[email protected]>
Signed-off-by: Eric Evenchick <[email protected]>
Signed-off-by: Peter Fink <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
marckleinebudde pushed a commit to marckleinebudde/linux that referenced this issue Feb 6, 2022
CANtact Pro from Linklayer is the first gs_usb compatible device
supporting CAN-FD with a different HW and re-written candlelight FW.

Support for CAN-FD is indicated by the device setting the
GS_CAN_FEATURE_FD flag. CAN-FD support is requested by the driver with
the GS_CAN_MODE_FD flag. The CAN-FD specific data bit timing
parameters are set with the GS_USB_BREQ_DATA_BITTIMING control
message.

This patch is based on the Eric Evenchick's gs_usb_fd driver (which
itself is a fork of gs_usb). The gs_usb_fd code base was reintegrated
into the gs_usb driver, and reworked to not break the existing
classical-CAN only hardware.

Link: linklayer/gs_usb_fd#2
Co-developed-by: Eric Evenchick <[email protected]>
Signed-off-by: Eric Evenchick <[email protected]>
Signed-off-by: Peter Fink <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
marckleinebudde pushed a commit to marckleinebudde/linux that referenced this issue Feb 15, 2022
CANtact Pro from Linklayer is the first gs_usb compatible device
supporting CAN-FD with a different HW and re-written candlelight FW.

Support for CAN-FD is indicated by the device setting the
GS_CAN_FEATURE_FD flag. CAN-FD support is requested by the driver with
the GS_CAN_MODE_FD flag. The CAN-FD specific data bit timing
parameters are set with the GS_USB_BREQ_DATA_BITTIMING control
message.

This patch is based on the Eric Evenchick's gs_usb_fd driver (which
itself is a fork of gs_usb). The gs_usb_fd code base was reintegrated
into the gs_usb driver, and reworked to not break the existing
classical-CAN only hardware.

Link: linklayer/gs_usb_fd#2
Co-developed-by: Eric Evenchick <[email protected]>
Signed-off-by: Eric Evenchick <[email protected]>
Signed-off-by: Peter Fink <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
@BennyEvans
Copy link

Thanks again for all your work on this @marckleinebudde @pfink-christ. I've been away for the past few weeks but will will jump in over the next few days and perform some further testing.

@marckleinebudde
Copy link

FYI: I've send a pull request to net-next/master with CAN-FD support https://lore.kernel.org/all/[email protected]/

fengguang pushed a commit to 0day-ci/linux that referenced this issue Mar 10, 2022
CANtact Pro from Linklayer is the first gs_usb compatible device
supporting CAN-FD with a different HW and re-written candlelight FW.

Support for CAN-FD is indicated by the device setting the
GS_CAN_FEATURE_FD flag. CAN-FD support is requested by the driver with
the GS_CAN_MODE_FD flag. The CAN-FD specific data bit timing
parameters are set with the GS_USB_BREQ_DATA_BITTIMING control
message.

This patch is based on the Eric Evenchick's gs_usb_fd driver (which
itself is a fork of gs_usb). The gs_usb_fd code base was reintegrated
into the gs_usb driver, and reworked to not break the existing
classical-CAN only hardware.

Link: https://lore.kernel.org/all/[email protected]
Link: linklayer/gs_usb_fd#2
Co-developed-by: Eric Evenchick <[email protected]>
Signed-off-by: Eric Evenchick <[email protected]>
Signed-off-by: Peter Fink <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
@BennyEvans
Copy link

Awesome! Thanks for all of your work on this @marckleinebudde @pfink-christ. Appreciate it.

@KhazAkar
Copy link

So... What's the current status of upstreaming this driver?

@marckleinebudde
Copy link

It's mainline since v5.18.

@KhazAkar
Copy link

@marckleinebudde thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants