rerun 0.21.0 WGPU Error on WSL #8564

Open
andrearosasco opened this issue Dec 30, 2024 · 6 comments · May be fixed by #8610
Labels: 🪳 bug (Something isn't working), 💣 crash (crash, deadlock/freeze, do-no-start), 🔺 re_renderer (affects re_renderer itself), VM / WSL / Docker (Issue happening only in a virtualized or dockerized environment)

Comments

@andrearosasco

Describe the bug
On WSL, switching from rerun v0.18.0 to v0.21.0 causes the following error to appear when the visualizer is launched:

[2024-12-30T13:09:13Z ERROR wgpu_core::device::resource] indirect-validation error: ComputePipeline(Internal("The selected version doesn't support Features(COMPUTE_SHADER | DYNAMIC_ARRAY_SIZE)"))
[2024-12-30T13:09:13Z ERROR eframe::native::run] Exiting because of error: WGPU error: Parent device is lost
Error: WGPU error: Parent device is lost

Launching rerun with WGPU_BACKEND=vulkan rerun works, but the performance is poor.
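
The workaround above can be scripted. Below is a minimal Python sketch, not part of rerun, that starts the viewer normally and retries with WGPU_BACKEND=vulkan only when the device-lost message from the log appears. The helper names are hypothetical, and it assumes the viewer exits with a non-zero status after printing that error.

```python
import os
import subprocess
import sys

# Substring taken from the error log quoted in this report.
DEVICE_LOST_MARKER = "Parent device is lost"


def vulkan_env(base_env=None):
    """Return a copy of the environment with WGPU_BACKEND forced to vulkan."""
    env = dict(os.environ if base_env is None else base_env)
    env["WGPU_BACKEND"] = "vulkan"
    return env


def needs_vulkan_fallback(returncode, stderr_text):
    """True when the process failed and printed the device-lost error."""
    return returncode != 0 and DEVICE_LOST_MARKER in stderr_text


def launch_viewer(cmd=("rerun",)):
    """Hypothetical helper: try the default backend, then retry on Vulkan."""
    first = subprocess.run(list(cmd), stderr=subprocess.PIPE, text=True)
    sys.stderr.write(first.stderr)  # keep the original log visible
    if needs_vulkan_fallback(first.returncode, first.stderr):
        # Assumption: forcing the Vulkan backend avoids the GL path that fails.
        return subprocess.run(list(cmd), env=vulkan_env()).returncode
    return first.returncode
```

Calling launch_viewer() blocks until the viewer exits. The Vulkan retry only changes which backend is used; it does not address the poor performance mentioned above.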

To Reproduce
Steps to reproduce the behavior:

  1. pip install rerun-sdk==0.21.0
  2. rerun

Expected behavior
The visualizer starts without any errors

Desktop (please complete the following information):
Ubuntu 22.04.4 (WSL) on Windows 11

Rerun version
0.21.0

Additional Context
The error does not appear in rerun 0.18.0

@andrearosasco andrearosasco added 👀 needs triage This issue needs to be triaged by the Rerun team 🪳 bug Something isn't working labels Dec 30, 2024
@Wumpf Wumpf added 🔺 re_renderer affects re_renderer itself 💣 crash crash, deadlock/freeze, do-no-start VM / WSL / Docker Issue happening only in a virtualized or dockerized environment and removed 👀 needs triage This issue needs to be triaged by the Rerun team labels Jan 2, 2025
@Wumpf Wumpf self-assigned this Jan 2, 2025
@Wumpf
Member

Wumpf commented Jan 2, 2025

A bunch of issues are conspiring here: it turns out we rely on the GL->DX12 forwarding driver that WSL installs by default. Why this breaks is, in turn, a bit more complicated, and I created a bug here

Ideally, Vulkan would just work! Out of the box there's no forwarding Vulkan driver, but mesa can be updated to achieve this!

sudo add-apt-repository ppa:kisak/kisak-mesa
sudo apt update
sudo apt upgrade # todo: what should be upgraded exactly?

But this actually makes it worse: now we end up with non-conformant Vulkan drivers, which wgpu filters out, and a software rasterizer for GL (which at least works).

-> Next steps:

  • try getting Vulkan drivers to work by not filtering non-conformant Vulkan drivers
  • try getting out-of-the-box GL working again by disabling downlevel feature flag
  • patch wgpu as proposed in linked issue

@Wumpf
Member

Wumpf commented Jan 2, 2025

> try getting Vulkan drivers to work by not filtering non-conformant Vulkan drivers

That's trickier than expected. We need extended wgpu setup options for this, since this is an instance flag: emilk/egui#5506

> try getting out-of-the-box GL working again by disabling downlevel feature flag

Can't be done: these downlevel flags are capability flags, not feature flags, i.e. we can't actually turn them off.

So from the looks of it, we can't work around this here and need to patch wgpu instead.

@Wumpf
Member

Wumpf commented Jan 2, 2025

All of the above makes it very unlikely we can solve this in a patch release. But we should at the very least have this fixed in 0.22!

Note that 0.20 still works fine.

@Wumpf Wumpf added this to the 0.22 - ? milestone Jan 2, 2025
@Wumpf
Member

Wumpf commented Jan 5, 2025

By now I've figured out that there are even more issues with the GL renderer: it advertises R32F as not renderable. We now hard-fail when that happens because this is such a basic feature and may cause picking to stop working down the line. (In actuality it is only used iff we have to do a depth-reading workaround, and I'm not sure whether that's active in this particular scenario, so we could get away with it. But then again, drawing to R32F is imho so fundamental that we'll likely crash over it in the future anyway.)

I also observed this on a fresh Ubuntu 24 WSL install. Note to self, here are the steps I took:

  • install rustup via snap, then sudo apt install build-essential
  • in a wgpu checkout, cargo run -p wgpu-info out of the box sees Vulkan llvmpipe & GL 4.6 (note that older distros advertise 4.2)
  • running the examples fails with error: WaylandError(Connection(NoCompositor)) }) ...
  • running with WAYLAND_DISPLAY= then complains that libxkbcommon-x11-0 is missing (sudo apt install libxkbcommon-x11-0). After that, WAYLAND_DISPLAY= cargo run -p wgpu-examples -- hello_triangle works
  • WAYLAND_DISPLAY= rerun --renderer vulkan works
  • ... but the GL renderer doesn't, because it can't draw to R32F
  • run cargo run -p wgpu-info -- vv to confirm this constellation
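
For reference, the working combination from the steps above (empty WAYLAND_DISPLAY plus --renderer vulkan) can be captured in a small launcher. This is a sketch with a hypothetical helper name; it only builds the command and environment and leaves the actual launch to the caller.

```python
import os


def x11_vulkan_launch(base_env=None):
    """Build the command and environment for the invocation that worked above.

    Mirrors `WAYLAND_DISPLAY= rerun --renderer vulkan`: the variable is set to
    an empty string (not removed), exactly as the shell prefix does.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["WAYLAND_DISPLAY"] = ""
    cmd = ["rerun", "--renderer", "vulkan"]
    return cmd, env
```

The returned pair can be passed on as subprocess.run(cmd, env=env).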

@andrearosasco
Author

Hey @Wumpf, thanks for looking into the issue! While waiting for the next rerun version, do you think I could modify my environment (e.g. by installing a different version of the GL driver) to make it compatible with the current rerun version?

@Wumpf
Member

Wumpf commented Jan 7, 2025

@andrearosasco I poked a little in that direction and installed newer Mesa drivers, but to not much avail so far; see #8564 (comment). The GPU ('passthrough') driver situation on WSL is really not that great.
Once emilk/egui#5506 is all the way through, we can try enabling non-conformant drivers and see if that works then.

Otherwise my recommendation (as always when it comes to WSL!) is to run the viewer on the host Windows system and do only sdk invocations from inside WSL, connecting to the host's viewer.
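
A rough sketch of that setup from the WSL side, under a few assumptions that are not from this thread: the Windows host IP is taken from the nameserver entry in /etc/resolv.conf (common for WSL2 in its default NAT networking mode), the viewer listens on port 9876, and the 0.21 Python SDK exposes rr.init and rr.connect (later releases renamed the connect call). The helper names are hypothetical.

```python
import re

# Assumption: default TCP port of the viewer in rerun 0.21.
DEFAULT_VIEWER_PORT = 9876


def windows_host_ip(resolv_conf_text):
    """Return the first nameserver address in resolv.conf text, or None."""
    match = re.search(r"^\s*nameserver\s+(\S+)", resolv_conf_text, re.MULTILINE)
    return match.group(1) if match else None


def viewer_address(resolv_conf_text, port=DEFAULT_VIEWER_PORT):
    """Build the host:port string for a viewer running on the Windows host."""
    host = windows_host_ip(resolv_conf_text)
    if host is None:
        raise ValueError("no nameserver entry found")
    return f"{host}:{port}"


def connect_to_host_viewer(app_id="wsl_example"):
    """Hypothetical helper: point the SDK inside WSL at the host's viewer."""
    import rerun as rr  # requires rerun-sdk to be installed inside WSL

    with open("/etc/resolv.conf") as f:
        addr = viewer_address(f.read())
    rr.init(app_id)
    rr.connect(addr)  # assumption: 0.21 API name
    return addr
```

The viewer has to be started on Windows first, and the Windows firewall may need to allow the incoming connection. If the nameserver entry does not point at the host in a given setup, the address can be passed to the SDK by hand instead.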

@Wumpf Wumpf linked a pull request Jan 7, 2025 that will close this issue