[rclcpp] C++ lib leaking rclcpp::Node() can cause segfaults at dlclose()
#2567
Inline copy of README:
Relates to Drake. Due to a workaround in pydrake (RobotLocomotion/drake#14356), if we have components that use
Example minimal code (from
// Expected case: rclcpp::Node will be freed.
void InitNode() {
RclcppInit();
rclcpp::Node("node");
}
// Bad case: rclcpp::Node is leaked :(
void InitAndLeakNode() {
RclcppInit();
new rclcpp::Node("node");
}
This shows three cases, running 200 trials, 50 in parallel.
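The trial setup described here (many runs of one command, a bounded number in parallel, then a tally of exit statuses) could be sketched roughly as follows. This is not the repository's `ros_segfault_test.py`; the names `run_trials`, `describe`, and `summarize` and the stand-in command are assumptions made for illustration only.

```python
import signal
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_once(cmd):
    """Run one trial and return its exit status (negative: killed by a signal)."""
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode


def run_trials(cmd, count=200, parallel=50):
    """Run `cmd` `count` times, at most `parallel` at once; return all statuses."""
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        return list(pool.map(run_once, [cmd] * count))


def describe(returncode):
    """Name the signal behind a negative status; otherwise report the exit code."""
    if returncode < 0:
        return signal.Signals(-returncode).name
    return f"exit {returncode}"


def summarize(name, returncodes):
    """Print a per-mode tally and return the number of failing trials."""
    num_fail = sum(1 for rc in returncodes if rc != 0)
    print(f"[ {name} ]")
    print(f"  num_fail: {num_fail} / {len(returncodes)}")
    print(f"  returncodes: {set(returncodes)}")
    return num_fail


if __name__ == "__main__":
    # Stand-in command (an assumption): swap in the real binary and arguments.
    cmd = [sys.executable, "-c", "pass"]
    summarize("example", run_trials(cmd, count=4, parallel=2))
```

On POSIX, Python's `subprocess` reports a child killed by a signal as a negative return code, so a status of -11 corresponds to SIGSEGV on Linux. Threads are enough for the parallelism here because each worker only blocks while waiting on its child process.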
The ideal workaround is that
Output
Errors w/
# Non-standard build, but just want to focus on issue.
$ source /opt/ros/humble/setup.bash
$ cmake -S . -B build
$ cmake --build build -j
$ python ./src/ros_segfault_test.py
[ direct_leak ]
num_fail: 0 / 200
returncodes: {0}
[ dlopen_noleak ]
num_fail: 0 / 200
returncodes: {0}
[ dlopen_leak ]
num_fail: 185 / 200
returncodes: {0, -11, 127}
./build/ros_segfault_min_dlopen: symbol lookup error: /opt/ros/humble/lib/x86_64-linux-gnu/libddsc.so.0: undefined symbol: ddsrt_avl_swap_node
./build/ros_segfault_min_dlopen: symbol lookup error: /opt/ros/humble/lib/x86_64-linux-gnu/libddsc.so.0: undefined symbol: ddsrt_ehh_new
./build/ros_segfault_min_dlopen: symbol lookup error: /opt/ros/humble/lib/x86_64-linux-gnu/libddsc.so.0: undefined symbol: ddsi_update_proxy_writer
./build/ros_segfault_min_dlopen: symbol lookup error: /opt/ros/humble/lib/x86_64-linux-gnu/libddsc.so.0: undefined symbol: ddsi_update_proxy_reader
./build/ros_segfault_min_dlopen: symbol lookup error: /opt/ros/humble/lib/x86_64-linux-gnu/libddsc.so.0: undefined symbol: plist_fini_generic
./build/ros_segfault_min_dlopen: symbol lookup error: /opt/ros/humble/lib/x86_64-linux-gnu/libddsc.so.0: undefined symbol: ddsrt_avl_free_arg
Attaching GDB
# Terminal 1: Run background stuff to incite segfault.
$ python ./src/ros_segfault_test.py --modes dlopen_leak --count 0
# Terminal 2: Run with gdb.
$ gdb --args ./build/ros_segfault_min_dlopen ./build/libros_segfault_min_lib.so InitAndLeakNode
# show backtrace on all threads
(gdb) thread apply all bt
Example stacktrace from main thread:
Thanks for the detailed analysis and reproduction. My initial hunch would be that the Node is holding a function pointer to something that is getting unloaded before the library in question. Have you tried running this under
I ran it under
Perhaps
Bug report
Required Info:
humble
rmw_cyclonedds_cpp
rclcpp
Steps to reproduce issue
See https://github.com/EricCousineau-TRI/rclcpp_dlclose_segfault
Expected behavior
dlclose() does not trigger race conditions with threads; ideally, it shuts down existing workers, or tries to order atexit operations manually.
Actual behavior
dlclose() appears to trigger race conditions when the user leaks rclcpp::Node.
Additional information
See README in https://github.com/EricCousineau-TRI/rclcpp_dlclose_segfault