intermittent segfault during automated tests #169

Open
smcv opened this issue Sep 11, 2024 · 2 comments

smcv commented Sep 11, 2024

While trying to reproduce #166 interactively, I encountered a segfault in the tests.

To reproduce:

  • mkdir _build
  • podman run --rm -it -w $(pwd) -v $(pwd):$(pwd):ro -v $(pwd)/_build:$(pwd)/_build:rw debian:sid-slim
  • in the container:
    • sed -i -e 's/Types:.*/Types: deb deb-src/' /etc/apt/sources.list.d/debian.sources
    • apt update
    • apt upgrade
    • apt build-dep libportal
    • meson setup _build
    • meson compile -C _build
    • meson test -C _build --timeout-multiplier=3 --repeat=20 pytest

Backtrace:

#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=11, no_tid=no_tid@entry=0)
    at ./nptl/pthread_kill.c:44
#1  0x00007ff323b3347f in __pthread_kill_internal (signo=11, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  0x00007ff323ae4532 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
#3  0x00007ff323ae45d0 in <signal handler called> () at /lib/x86_64-linux-gnu/libc.so.6
#4  0x00007ff3218aaa55 in g_task_get_cancellable (task=0x5f6b636f6d737562) at ../../../gio/gtask.c:1275
#5  0x00007ff3216d0610 in call_returned (object=0x2b32cdd0 [GDBusConnection], result=0x2b2b5fe0, data=0x2b341340)
    at ../libportal/inputcapture.c:284
#6  0x00007ff3218aa393 in g_task_return_now (task=task@entry=0x2b2b5fe0 [GTask]) at ../../../gio/gtask.c:1361
#7  0x00007ff3218ab033 in g_task_return (type=<optimized out>, task=0x2b2b5fe0 [GTask]) at ../../../gio/gtask.c:1430
#8  g_task_return (task=0x2b2b5fe0 [GTask], type=<optimized out>) at ../../../gio/gtask.c:1387
#9  0x00007ff321908ce0 in g_dbus_connection_call_done
    (source=0x2b32cdd0 [GDBusConnection], result=<optimized out>, user_data=0x2b2b5fe0)
    at ../../../gio/gdbusconnection.c:6344
#10 0x00007ff3218aa393 in g_task_return_now (task=task@entry=0x2b2f0c50 [GTask]) at ../../../gio/gtask.c:1361
#11 0x00007ff3218aa3cd in complete_in_idle_cb (task=0x2b2f0c50) at ../../../gio/gtask.c:1375
#12 0x00007ff3222447df in g_main_dispatch (context=context@entry=0x2b3628f0) at ../../../glib/gmain.c:3357
#13 0x00007ff322246a17 in g_main_context_dispatch_unlocked (context=0x2b3628f0) at ../../../glib/gmain.c:4208
#14 g_main_context_iterate_unlocked
    (context=0x2b3628f0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>)
    at ../../../glib/gmain.c:4273
#15 0x00007ff32224746f in g_main_loop_run (loop=0x2b29ebf0) at ../../../glib/gmain.c:4475
#16 0x00007ff3235953fe in ffi_call_unix64 () at ../src/x86/unix64.S:104
#17 0x00007ff32359470d in ffi_call_int
    (cif=cif@entry=0x2b295828, fn=<optimized out>, rvalue=<optimized out>, avalue=<optimized out>, closure=closure@entry=0x0) at ../src/x86/ffi64.c:673
#18 0x00007ff323594ee3 in ffi_call
    (cif=cif@entry=0x2b295828, fn=<optimized out>, rvalue=rvalue@entry=0x7fffa3321c78, avalue=<optimized out>)
    at ../src/x86/ffi64.c:710
#19 0x00007ff322c1ecf3 in pygi_invoke_c_callable
    (function_cache=<optimized out>, state=<optimized out>, py_args=<optimized out>, py_kwargs=<optimized out>)
    at ../gi/pygi-invoke.c:684

The remaining stack frames are CPython.

It looks like a use-after-free of the Call: all of its pointer members point to inaccessible memory.
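One hint that the memory has been reused is the task pointer in frame #4, 0x5f6b636f6d737562, whose bytes are all printable ASCII. Below is a minimal sketch for decoding such a value, assuming a little-endian x86_64 process as in the backtrace. decode_pointer_bytes is a hypothetical helper written for this illustration; it is not part of libportal or GLib.

```c
#include <stdint.h>

/* Hypothetical helper: render a 64-bit value as the 8 bytes it occupies in
 * little-endian memory, replacing non-printable bytes with '.'.
 * `out` must have room for 8 characters plus the terminating NUL. */
static void decode_pointer_bytes(uint64_t value, char out[9])
{
    for (int i = 0; i < 8; i++) {
        /* Byte i in memory is bits [8*i, 8*i+7] of the value. */
        unsigned char byte = (unsigned char)((value >> (8 * i)) & 0xff);
        out[i] = (byte >= 0x20 && byte < 0x7f) ? (char)byte : '.';
    }
    out[8] = '\0';
}
```

Calling decode_pointer_bytes(0x5f6b636f6d737562ULL, buf) fills buf with "busmock_". That would be consistent with the freed Call having been overwritten by string data, although it does not prove it.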


smcv commented Sep 11, 2024

I think what's happening here is that if the Response signals successful completion before the g_dbus_connection_call that created it has returned, then the Call is freed while it is still the user_data for g_dbus_connection_call. In call_returned(), if g_dbus_connection_call_finish succeeds, we never dereference the dangling pointer, so nothing visibly goes wrong (although it is probably still undefined behaviour). But if g_dbus_connection_call_finish somehow fails (perhaps because the call was cancelled, or because we disconnected from the bus), we do dereference the pointer to freed memory.

Instead of using the Call as the user data, I think it might be better to use the GTask (which is refcounted) as the user data, and attach the rest of the Call members to it via g_task_set_task_data().
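To illustrate the ownership pattern, here is a minimal sketch in plain C rather than GLib, so it is not the libportal code. Task, CallData, task_new, task_ref, task_unref and the request_path member are all hypothetical stand-ins: Task plays the role of the refcounted GTask, and CallData plays the role of the Call members attached via g_task_set_task_data(). The point is only that the pending call holds its own reference, so the per-call data outlives an early completion.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-call state; stands in for the members of the Call. */
typedef struct {
    char *request_path;
} CallData;

/* Hypothetical refcounted owner; stands in for a GTask with task data. */
typedef struct {
    int ref_count;
    CallData *data;
} Task;

static int live_tasks = 0; /* instrumentation for this sketch only */

static Task *task_new(const char *request_path)
{
    Task *task = malloc(sizeof *task);
    CallData *data = malloc(sizeof *data);
    size_t len = strlen(request_path) + 1;

    if (task == NULL || data == NULL)
        abort();
    data->request_path = malloc(len);
    if (data->request_path == NULL)
        abort();
    memcpy(data->request_path, request_path, len);
    task->ref_count = 1;
    task->data = data;
    live_tasks++;
    return task;
}

static Task *task_ref(Task *task)
{
    task->ref_count++;
    return task;
}

/* Frees the per-call data together with the task when the last ref goes. */
static void task_unref(Task *task)
{
    if (--task->ref_count > 0)
        return;
    free(task->data->request_path);
    free(task->data);
    free(task);
    live_tasks--;
}

/* Completion callback for the asynchronous call. user_data carries its own
 * reference, so the per-call data is still valid here even if the creator
 * already dropped its reference. Returns the length of the path it read. */
static size_t call_returned(void *user_data, bool call_failed)
{
    Task *task = user_data;
    size_t seen = 0;

    if (call_failed)
        seen = strlen(task->data->request_path); /* error path reads state */
    task_unref(task); /* release the reference held by the pending call */
    return seen;
}
```

In this model the caller would pass task_ref(task) as user_data when starting the call. If the Response then completes first and drops the creator's reference, the later call_returned() still sees live data and releases the last reference itself.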


smcv commented Sep 11, 2024

I tried adding debug logging to the functions that interact with the Call, and running pytest -s -x (don't capture stdout/stderr, and stop as soon as one test case fails) so that I could see the log output. Unfortunately that perturbs the timing enough that I can no longer reproduce the segfault, so I'm guessing at the root cause based on the output I saw.
