-
-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rockchip64-edge: add rkvdec2 for rk356x #6804
Conversation
I wonder why such big CMA windows is needed for rkvdec2; I mean, rkvdec v1 AFAIK does not require CMA because the device has its own MMU and can freely access kernel memory. I don't know if something changed for rkvdec2, but such a regression would surely would disappoint me. I can decode 4K content on 32 bit rk3228 with 1Gb devices and just 16Mb CMA window. CMA turns as "reserved" memory to the linux kernel and therefore it cannot be used by userland processes. Since the rockchip64 kernel also applies to smaller rk3328 devices, reserving 384M for those would hamper them. As said, everything could be changed for rkvdec2, but please verify if it is really needed. |
This 384M CMA size is copied from rk3588's kernel config. At first I did not touch it but when playing 4K video with chromium there was cma not enough error from dmesg, and chromium failed back to software decoding, so I just make CMA size same as rk3588's kernel. Maybe 384M CMA size is not the minimal requirement, I can check it. But if I recall it correctly the upcoming rk3588's hdmirx driver also need large CMA size: https://github.com/armbian/linux-rockchip/blob/rk-5.10-rkr6/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts#L210, I need 384M for 4K hdmi in capture with downstream kernel. If we have a plan to merge rk3588 family to rockchip64 eventually, we have to increase the CMA size. |
Yeah this is what I was wondering as well. I don't know enough about this to know if this will "steal" 384MB of RAM to reserve for the VPU or GPU. But if that's the case, that would be a big deal. I mean, just look at the Raspberry Pi kernel disabling an otherwise good/positive kernel config option to fix an issue that has already been fixed in mainline, just to save 1MB of RAM: raspberrypi/linux#5975 (comment) If there are Rockchip64 devices with only like 2GB of RAM, 384MB is a huge chunk. Maybe this issue should be reported to mainline or the people developing rkvdec2. If they are already aware, maybe we should wait until it's fixed? |
From what I've read the CMA pool can be set in the device tree, as a kernel boot parameter or in the kernel config (and the order of presidince is in that order). So it seems that there are some options on how and where this could be implemented so that it can vary by board, so low memory devices could go with a lower value. |
From my test 192M cma is enough for rkvdec2, and 184M is not enough.
I tried to add 192M reserved cma to devicetree with this overlay:
But it seems that this just add an additional 192M cma memory:
184 MiB cma is set in kernel config, and total cma memory size 385024K(376M) is 192M + 184M. |
@amazingfate did you try to reduce the size of the cma configured in the kernel to a small number (say 16 or 32 megabytes) or even disable it and declare a large cma (384M) in the device tree to see what happens? The intent of the cma is to be linearly addressable as well, if you have more smaller chunks it could be that the client can't allocate a large single chunk. |
While setting cma size with kernel cmdline works. And cma size 256M is okay for both gstreamer and chromium.
After setting cma size to 0 in kernel config, gstreamer will not get cma even there is 256M cma configured in device tree. It seems that v4l2 driver will always get cma set from kernel config or cmdline. |
I wonder how mainline does/will handle this CMA size 😄 |
I'm thinking of splitting bootenv for rk35xx socs to use larger cma size so old socs like rk3328 can still play with 128M reserved cma. |
I wonder why they are treated differently. |
This is indeed a better way. I will try that. |
Just for knowledge, but on my rk3318 box (which is totally similar to rk3328) I can achieve hardware video decoding with CMA completely disabled using
This is the mpv process and its threads cpu usage via top while streaming 1080p h.264 content from network:
and content is playing smooth, just with some occasional dropped frames while I tinker in the background. And here is dmesg: |
Found also another interesting thing about the device tree: the cma node must be specified with
Actually gives me 64mb of CMA memory:
If @amazingfate you may give a try to this |
TBH rockchip64 should be split into RK33 and RK35 so these interactions aren't problematic. Even that split is sill since RK3308 and RK3399 have nothing in common, but at least the mad scientist lab can be kept far away from stable hardware |
I wouldn't do that, not after struggling to keep the number of kernels under control... I mean, if you look at allwinner, 32 bits boards and 64 bits boards happily live together with the same patchset (with tons of patches, though) and just two kernels. The intent should be to merge things into current/edge kernels when the hardware is actually ready for mainline; mad scientist things should live in the mad scientists' own repository or branch; I see the comments above as "what to do to coexist", and the confrontation looks sane to me. It would ideal if people asks themselves "what other board or system I can break with this change?" (ie. take responsibility) before bringing in random patches that makes its hardware "just work". |
a32bf2d
to
5309391
Compare
5309391
to
12ad769
Compare
This method works! So I can just add this reserved memory node to rk356x.dtsi and no need to change the CMA size in kernel config. |
I find that CmaFree size is different when cma is set with kernel arg
Set with devicetree node:
When cma is set to 256M with devicetree, rkvdec2 driver will still failed with |
@amazingfate there is nothing to excuse about, on the contrary I find all of this very interesting. I wonder why rkvdec2 complaints and why CmaFree is different when set via device tree. Reading the (old) LWA article about CMA, it says that it is possible for the kernel to use the CMA-allocated memory for other uses like disk cache, but when a device driver requires it, the disk cache is promptly erased and the memory is given to the device driver that requires the chunk. During the above tests with the rk3328, there were also situations were CMA free memory lowered after playing some videos and toying around, and it didn't get reclaimed by the kernel after closing the supposed users, so I still don't understand those CmaTotal and CmaFree values. |
@amazingfate While reviewing the rockchip64 kernel config, I stumbled upon these "system heap" and "CMA heap" config options: Maybe it is worth enabling them and give it a shot? I know that have been enabled on libreelec for other rockchip platforms to let the video decoding work. |
I enabled |
ff15ff6
to
b072b85
Compare
b072b85
to
7b943fd
Compare
Update patches to 6.10, and add |
Tested with rock-3a, chromium can decode 4k 2160p h264 video with vpu. |
Description
We can also use rkvdec2 on rk356x.
rockchip,rk3568-vpu
torockchip,rk3328-vpu
because they have exactly the same vpu variant data.How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Please also note any relevant details for your test configuration.
Checklist:
Please delete options that are not relevant.