Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix actions feedback race #2612

Open
wants to merge 6 commits into
base: rolling
Choose a base branch
from

Conversation

mauropasse
Copy link
Collaborator

Fixes issue mentioned in #2451.

When executing the ActionClient, actions feedback & status had predominance over the action results, so if the client spins slower than the server feedback's rate, it will never process the action result (it will be busy processing feedbacks).

Also adding unit test to verify the fix works. The unit test also tests many other actions interactions.

Mauro Passerino added 2 commits August 27, 2024 12:19
Signed-off-by: Mauro Passerino <[email protected]>
 - See ros2#2451

Signed-off-by: Mauro Passerino <[email protected]>
rclcpp_action/test/test_actions.cpp Outdated Show resolved Hide resolved
rclcpp_action/test/test_actions.cpp Outdated Show resolved Hide resolved
rclcpp_action/test/test_actions.cpp Outdated Show resolved Hide resolved
rclcpp_action/test/test_actions.cpp Outdated Show resolved Hide resolved
rclcpp_action/test/test_actions.hpp Outdated Show resolved Hide resolved
rclcpp_action/test/test_actions.hpp Show resolved Hide resolved
rclcpp_action/test/test_actions.hpp Show resolved Hide resolved
rclcpp_action/test/test_actions.hpp Show resolved Hide resolved
rclcpp_action/test/test_actions.hpp Show resolved Hide resolved
rclcpp_action/test/test_actions.cpp Outdated Show resolved Hide resolved
@fujitatomoya
Copy link
Collaborator

this PR includes #2471 with latest code base, so i will close #2471.

rclcpp_action/src/client.cpp Show resolved Hide resolved
rclcpp_action/test/test_actions.hpp Outdated Show resolved Hide resolved
rclcpp_action/CMakeLists.txt Outdated Show resolved Hide resolved
rclcpp_action/test/test_actions.cpp Outdated Show resolved Hide resolved
@jmachowinski
Copy link
Contributor

Hard no go from my side. This introduces a new bug, that you would not receive all feedbacks, before the result.
See
rclcpp_action.TestClientAgainstServer.async_send_goal_with_feedback_callback_wait_for_result
for more details.

@fujitatomoya
Copy link
Collaborator

even though https://design.ros2.org/articles/actions.html#clientserver-interaction-examples does not specify that feedback messages needs to be delivered before result, all examples tell me result comes to the client after feedback messages. besides, if the feedback message comes after the result response, it is strange behavior, i am not sure what application is supposed to process this message in the callback.

and i think, this action design and requirement is already broken with current implementation. because rmw uses different channels for feedback and result, the messages are not queued in order, that means there is always possibility that feedback messages would come after result response. (by changing the order to take the data, it can mitigate a bit but not a perfect solution...)

IMO, if we can make it better for user-experience, changing the order would be acceptable? maybe the order needs to be reconsidered well ? (goal response->cancel response->feedback->status->result response)

any thoughts? @mauropasse @jmachowinski @ahcorde @alsora

@mauropasse
Copy link
Collaborator Author

My thoughts are that ROS 2 actions' feedback and status messages are useful during the action's execution, as they allow the user to understand how the process is progressing.

However, the responses (goal/cancel/result) are the final pieces of information when an action has completed, and these should be considered the most important. Feedback and status messages received by the client after the action has finished could (should?) be ignored if the user already knows the action's outcome.

This is why I think responses should be prioritized over feedback. Moreover, a response is sent only once, whereas there can be a large number of feedback messages.

I also want to point out that this issue affects only the single-threaded executor. It does not impact the events executor, as events are processed in the order they are generated.

Also I'm unsure about judging the correctness for the new proposed priority of execution, based on the results of previous tests that fail now?

@fujitatomoya
Copy link
Collaborator

@mauropasse thanks for your comment!

Feedback and status messages received by the client after the action has finished could (should?) be ignored if the user already knows the action's outcome.

This is i am not sure yet by design. To be honest, i thought that is okay feedback and status messages would come result (or even cancel), and either ActionClient or application can ignore that. Let's wait for more feedback on this.

Also I'm unsure about judging the correctness for the new proposed priority of execution, based on the results of previous tests that fail now?

I believe that @jmachowinski just wanted to confirm this behavior just like me. I think that is totally fine to change the test once behavior is changed.

@jmachowinski
Copy link
Contributor

I checked the initial bug report, and must say, the test itself is highly flawed.

What is comes down to is you got a provider running with higher frequency than the consumer is processing.

From my point of view the bug report makes wrong assumptions as to how spin_some works / should work. To be fair, the documentation of the function is misleading, as you need real deep knowledge of the executor internals so know what 'Collect all work' really means. The obvious fix to the problem is to use spin_all.

As to the action code in general, as I stated before I think the design is highly flawed. But I don't see a 'simple' fix for the issue, like the one proposed here.

As to the importance of receiving the last feedback before the goal, I agree with @mauropasse that in a (our) real world application, one can normally ignore feedback and it's not important at all. The problem with this change though is, that it will break the tutorial
https://docs.ros.org/en/jazzy/Tutorials/Intermediate/Writing-an-Action-Server-Client/Cpp.html and possible break user code, that rely on this behavior.

Mauro Passerino added 2 commits September 18, 2024 07:38
Signed-off-by: Mauro Passerino <[email protected]>
Signed-off-by: Mauro Passerino <[email protected]>
@mauropasse
Copy link
Collaborator Author

In my last commits I addressed comments from this PR.
I ended up lowering only the Feedback priority, so:

  • Feedback > Status > Goal Response > Result Response > Cancel Response (original)
  • Status > Goal Response > Result Response > Cancel Response > Feedback (new priority)

In this way:

  • Test modifications on test_client.cpp for it to pass, are minimal.
  • The demo ros2/demos/action_tutorials/action_tutorials_cpp/src/fibonacci_* has same behavior as before.
  • There's still the risk of breaking user code.

Signed-off-by: Mauro Passerino <[email protected]>
Copy link
Collaborator

@fujitatomoya fujitatomoya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall lgtm, one comment.

rclcpp_action/test/test_client.cpp Outdated Show resolved Hide resolved
@fujitatomoya
Copy link
Collaborator

@mauropasse can you also resolve @ahcorde 's comments?

@mauropasse mauropasse requested a review from ahcorde November 1, 2024 18:38
@fujitatomoya
Copy link
Collaborator

@jmachowinski since you have commented this, what do you think of the current change?

@jmachowinski
Copy link
Contributor

jmachowinski commented Nov 4, 2024

I don't like this change. It does not really fix the problem itself, but instead fixes it for one obscure case.

I wonder, if we can be smarter on the return of the next ready event instead of returning it in a fixed order.

Do we have access to the receive timestamp at that point ?
Returning the event with the oldest timestamp seems a way better solution...

Or could be introduce a sequence number in the messages and use this one to return the next ready event ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants