Wayland event polling crash when VSync is disabled, with message "explicit sync is used, but no release point is set" #93669

akien-mga · 2024-06-27T16:29:44Z

Tested versions

Reproducible in 4.3.beta (cae2f85)
Introduced in 4.3.beta1, not reproducible in 4.3.dev6 and earlier

System information

Fedora Linux 40 (KDE Plasma) - Wayland - Vulkan (Forward+) - dedicated AMD Radeon RX 7600M XT (RADV NAVI33) - AMD Ryzen 7 7840HS w/ Radeon 780M Graphics (16 Threads)

Issue description

A project using Wayland and with VSync disabled seems to segfault shortly after starting.

Here is the config in project.godot:

[display]

display_server/driver.linuxbsd="wayland"
window/vsync/vsync_mode=0

And the stacktrace:

wp_linux_drm_syncobj_surface_v1@61: error 3: explicit sync is used, but no buffer is attached
ERROR: Wayland protocol error 3 on interface wp_linux_drm_syncobj_surface_v1@61.
   at: _poll_events_thread (platform/linuxbsd/wayland/wayland_thread.cpp:2740)

Thread 21 "godot-git" received signal SIGILL, Illegal instruction.
[Switching to Thread 0x7fffdd8006c0 (LWP 142689)]
0x0000000005ba46a4 in WaylandThread::_poll_events_thread (p_data=0xbba9700) at platform/linuxbsd/wayland/wayland_thread.cpp:2740
2740                                    CRASH_NOW_MSG(vformat("Wayland protocol error %d on interface %s@%d.", error_code, wl_interface ? wl_interface->name : "unknown", id));

(gdb) bt
#0  0x0000000005ba46a4 in WaylandThread::_poll_events_thread (p_data=0xbba9700) at platform/linuxbsd/wayland/wayland_thread.cpp:2740
#1  0x000000000a5fa180 in Thread::callback (p_caller_id=21, p_settings=..., p_callback=0x5ba4490 <WaylandThread::_poll_events_thread(void*)>, p_userdata=0xbba9700) at ./core/os/thread.cpp:64
#2  0x000000000a60fd43 in std::__invoke_impl<void, void (*)(unsigned long, Thread::Settings const&, void (*)(void*), void*), unsigned long, Thread::Settings, void (*)(void*), void*>
    (__f=@0xbbb5de8: 0xa5fa0f2 <Thread::callback(unsigned long, Thread::Settings const&, void (*)(void*), void*)>) at /usr/include/c++/14/bits/invoke.h:61
#3  0x000000000a60fc0c in std::__invoke<void (*)(unsigned long, Thread::Settings const&, void (*)(void*), void*), unsigned long, Thread::Settings, void (*)(void*), void*>
    (__fn=@0xbbb5de8: 0xa5fa0f2 <Thread::callback(unsigned long, Thread::Settings const&, void (*)(void*), void*)>) at /usr/include/c++/14/bits/invoke.h:96
#4  0x000000000a60fac7 in std::thread::_Invoker<std::tuple<void (*)(unsigned long, Thread::Settings const&, void (*)(void*), void*), unsigned long, Thread::Settings, void (*)(void*), void*> >::_M_invoke<0ul, 1ul, 2ul, 3ul, 4ul> (this=0xbbb5dc8) at /usr/include/c++/14/bits/std_thread.h:301
#5  0x000000000a60e5fa in std::thread::_Invoker<std::tuple<void (*)(unsigned long, Thread::Settings const&, void (*)(void*), void*), unsigned long, Thread::Settings, void (*)(void*), void*> >::operator()
    (this=0xbbb5dc8) at /usr/include/c++/14/bits/std_thread.h:308
#6  0x000000000a60d572 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(unsigned long, Thread::Settings const&, void (*)(void*), void*), unsigned long, Thread::Settings, void (*)(void*), void*> > >::_M_run (this=0xbbb5dc0) at /usr/include/c++/14/bits/std_thread.h:253
#7  0x000000000b080fc4 in execute_native_thread_routine ()
#8  0x00007ffff7d621b7 in start_thread () at /lib64/libc.so.6
#9  0x00007ffff7de43cc in clone3 () at /lib64/libc.so.6

Steps to reproduce

On Linux, enable prefer_wayland and disable Vsync
Run an empty scene

Minimal reproduction project (MRP)

test12.zip

The text was updated successfully, but these errors were encountered:

Riteo · 2024-06-27T17:15:58Z

All right, so, after a short panic, I think I got it.

Funnily enough, another user reported this to me in a more exotic usecase and I, out of experience, shrugged it off as a driver bug (oh was I naive). Note that they used vsync, so this is certainly not a vsync issue.

It looks like putting wl_surface_commit inside some asynchronous methods (namely WaylandThread::_frame_callback_on_done) made a threading issue evident, after the huge "explicit sync revolution" (apparently also firefox has this issue¹).

If, due to multithreading, the command gets queued while an explicit sync is getting set up and before a buffer is set, the no_buffer error rightfully shows up and it explodes.

This apparently got also noticed while developing the new protocol² and it has been concluded as misuse of the protocol by clients like us, so we ought to fix that by limiting commits to at least the same thread (ideally once by the renderer). I already tried reducing commits before, but wasn't able to do so because I forgot about low-consumption mode (my archnemesis).

I think I got a quick patch that should remove even more unneeded commits and sync the only one needed for frame timeout to the main thread . It's not pretty, but if it works we can look to perhaps only spam-commit if in low-usage mode or only when checking, not sure. (this should get only called each frame anyways so whatever).

What?

If all of that flew over the reader's head, don't worry, I'm still a bit panicking :gdsweat: . Please try this patch:

UPDATE: There's a PR now which is a bit more complex: #93684

diff --git a/platform/linuxbsd/wayland/display_server_wayland.cpp b/platform/linuxbsd/wayland/display_server_wayland.cpp
index 3fad8c2987..363c14bbcd 100644
--- a/platform/linuxbsd/wayland/display_server_wayland.cpp
+++ b/platform/linuxbsd/wayland/display_server_wayland.cpp
@@ -1192,6 +1192,7 @@ void DisplayServerWayland::process_events() {
 
 	wayland_thread.keyboard_echo_keys();
 
+	wl_surface_commit(wayland_thread.window_get_wl_surface(MAIN_WINDOW_ID));
 	if (!suspended) {
 		if (emulate_vsync) {
 			// Due to various reasons, we manually handle display synchronization by
diff --git a/platform/linuxbsd/wayland/wayland_thread.cpp b/platform/linuxbsd/wayland/wayland_thread.cpp
index 63a8db07df..02b204f6be 100644
--- a/platform/linuxbsd/wayland/wayland_thread.cpp
+++ b/platform/linuxbsd/wayland/wayland_thread.cpp
@@ -968,7 +968,6 @@ void WaylandThread::_frame_wl_callback_on_done(void *data, struct wl_callback *w
 
 	ws->frame_callback = wl_surface_frame(ws->wl_surface),
 	wl_callback_add_listener(ws->frame_callback, &frame_wl_callback_listener, ws);
-	wl_surface_commit(ws->wl_surface);
 
 	if (ws->wl_surface && ws->buffer_scale_changed) {
 		// NOTE: We're only now setting the buffer scale as the idea is to get this
@@ -3241,10 +3240,6 @@ void WaylandThread::window_create(DisplayServer::WindowID p_window_id, int p_wid
 	ws.frame_callback = wl_surface_frame(ws.wl_surface);
 	wl_callback_add_listener(ws.frame_callback, &frame_wl_callback_listener, &ws);
 
-	// NOTE: This commit is only called once to start the whole frame callback
-	// "loop".
-	wl_surface_commit(ws.wl_surface);
-
 	if (registry.xdg_exporter) {
 		ws.xdg_exported = zxdg_exporter_v1_export(registry.xdg_exporter, ws.wl_surface);
 		zxdg_exported_v1_add_listener(ws.xdg_exported, &xdg_exported_listener, &ws);

Note: I've not tested the above as I don't have an explicit sync setup (yet), I'm working on getting one ASAP.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Wayland event polling crash when VSync is disabled, with message "explicit sync is used, but no release point is set" #93669

Wayland event polling crash when VSync is disabled, with message "explicit sync is used, but no release point is set" #93669

akien-mga commented Jun 27, 2024

Riteo commented Jun 27, 2024 •

edited

Loading

Riteo commented Jun 27, 2024

Riteo commented Jun 27, 2024

Wayland event polling crash when VSync is disabled, with message "explicit sync is used, but no release point is set" #93669

Wayland event polling crash when VSync is disabled, with message "explicit sync is used, but no release point is set" #93669

Comments

akien-mga commented Jun 27, 2024

Tested versions

System information

Issue description

Steps to reproduce

Minimal reproduction project (MRP)

Riteo commented Jun 27, 2024 • edited Loading

What?

Footnotes

Riteo commented Jun 27, 2024

Riteo commented Jun 27, 2024

Riteo commented Jun 27, 2024 •

edited

Loading