BtActionServer doesn't accept the message with warning about possible multiple action servers #3328

Closed
andriimaistruk opened this issue Dec 15, 2022 · 5 comments

@andriimaistruk
Contributor

Bug report

Hi. I have a custom action server built on top of nav2_behavior_tree::BtActionServer, which is essentially a copy of the navigate_to_pose action server. The Python client sometimes fails to receive the goal response from the server, and the following warning appears: "Ignoring unexpected goal response. There may be more than one action server for the action <custom_action_name>". This warning doesn't make sense to me, as I am sure that only one action server is running. Is it related to how nav2_behavior_tree::BtActionServer handles a goal that arrives right after the previous one has finished executing? Has anyone run into this warning before?
Thanks!

Required Info:

  • Operating System:
    • Ubuntu 22.04
  • ROS2 Version:
    • ROS 2 Humble
  • Version or commit hash:
  • DDS implementation:

@SteveMacenski
Member

SteveMacenski commented Dec 15, 2022

Where is that error coming from (Nav2, ROS 2 actions, etc.)? I can't find that string anywhere in the stack, and I've never seen it before. Are you entirely sure that there aren't multiple instances of the action being created, or multiple actions sharing a name by accident?

Can you replicate this with any non-custom action server? It may be down to your use of the API.

@andriimaistruk
Contributor Author

This message is from https://github.com/ros2/rclpy/blob/galactic/rclpy/rclpy/action/client.py#L304.

I am sure that there is only one such action server because exactly one is run per robot, and the robots' ROS networks are isolated using ROS_ID, so robot A doesn't see the second action server for the same action running on robot B. So unless ROS_ID can sometimes leak, I am entirely sure that there are no multiple instances of the action being created and no multiple actions sharing a name by accident. (The actual warning message is also not that definitive: "there MAY be more than one ...".)

Here is how it happened: the action server finished one goal and a second goal was sent to it. The action didn't start executing because the goal was not accepted, but the client that sent the goal didn't receive any response about that, only the warning above.

I haven't tried replicating it with a non-custom action server because it happens rarely, but I will definitely try, and I will scrutinize my API use while doing so.

@SteveMacenski
Member

SteveMacenski commented Dec 16, 2022

If you look through how it is used in Nav2, does your usage align well with that?

I'm happy to work through an issue with Nav2, its software, or documentation that is missing for using its libraries in other applications. But if we can't reproduce this or isolate it to something in Nav2, I'm not sure how much we can do in the interim. If we can get more information, I'd be happy to dig into it further on my side in case there's something wrong with Nav2.

Debugging a little further would be great. Given the current information, I can't tell whether the problem lies in the base server, in the Python client API or its usage, or anywhere in between. For example:

  • If you use a C++ action client instead of a python one, does it persist?
  • If you look closely at your server implementation based on the class, does it largely match the model / API use of the open-source ones? Are you calling termination at all exit conditions, and does the spinning model make sense?
  • Can this be replicated with any other piece of software?
  • If you look through the base class, do you see anything that could cause this issue?

@andriimaistruk
Contributor Author

Thank you very much for your help!
I can't reproduce it yet, and I haven't been able to gather enough debug information myself.
Hopefully, if it happens again, I will find the answer I was looking for.

@daisukes
Contributor

daisukes commented Sep 9, 2023

I got the exact same warning when my system sent a goal to the BT navigator.
I found that this can happen if the ActionClient's callbacks and the caller of send_goal/send_goal_async run in different callback groups within a multithreaded executor. There is a race condition.

I am posting this for anyone who gets the same error with Nav2.

The solution is to use the same callback group (or a single-threaded executor) for both (the client's execution and the send_goal call), or to wait for a PR in rclpy.

The cause of the problem is well described here:
ros2/rclpy#1123
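
For readers who want to see the shape of such a race, here is a toy model in plain Python. It is not rclpy code and does not use rclpy at all. It assumes the race is of the "send first, register the pending request afterwards" kind, so a response handled on another thread can arrive before the client knows the request exists. `ToyActionClient`, `run`, and the `serialize` flag are hypothetical names made up for this sketch; the linked issue remains the authoritative description of the real cause.

```python
import contextlib
import threading


class ToyActionClient:
    """Toy stand-in for an action client (hypothetical, NOT rclpy's implementation)."""

    def __init__(self, serialize):
        self._pending = {}   # sequence number -> state of the goal request
        self._next_seq = 0
        self.warnings = []
        # With serialize=True, sending and response handling exclude each other,
        # loosely like sharing one mutually exclusive callback group.
        self._guard = threading.Lock() if serialize else contextlib.nullcontext()

    def send_goal(self, transport):
        with self._guard:
            seq = self._next_seq
            self._next_seq += 1
            transport(seq)                  # the request leaves here ...
            self._pending[seq] = "waiting"  # ... but is only registered here
        return seq

    def on_response(self, seq):
        with self._guard:
            if seq not in self._pending:
                self.warnings.append("Ignoring unexpected goal response")
                return
            self._pending[seq] = "accepted"

    def state(self, seq):
        return self._pending[seq]


def run(serialize):
    client = ToyActionClient(serialize)
    workers = []

    def transport(seq):
        # Simulate a server that answers immediately on another thread.
        worker = threading.Thread(target=client.on_response, args=(seq,))
        worker.start()
        worker.join(timeout=0.5)  # give the response a head start
        workers.append(worker)

    seq = client.send_goal(transport)
    for worker in workers:
        worker.join()
    return client.warnings, client.state(seq)


if __name__ == "__main__":
    print("unserialized:", run(serialize=False))
    print("serialized:  ", run(serialize=True))
```

In the unserialized run, the response is handled before the request is registered, so the toy client logs the warning and the goal stays in "waiting" forever. In the serialized run, the response handler has to wait until registration is done, and the goal is marked "accepted". In rclpy terms, the workaround above would correspond to putting both sides in one MutuallyExclusiveCallbackGroup or using a SingleThreadedExecutor.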
