Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Progress-bar for ros2 bag play #1836

Open
wants to merge 36 commits into
base: rolling
Choose a base branch
from

Conversation

nicolaloi
Copy link
Contributor

@nicolaloi nicolaloi commented Oct 15, 2024

Closes #1762: Adds a progress bar for ros2 bag play, showing the bag time and duration, similar to what is seen in ROS1.

In ROS1, the progress bar was easy to handle because the rosbag play command didn't print additional output during playback. However, in ROS2, many logs can be printed during playback (e.g., keyboard control logs).

The progress bar is printed and continuously updated on the last line of the playback output. I have tested this initial implementation with keyboard controls (play, pause, play next, set rate), Ctrl+C, throwing an exception, and the flags --log-level debug, --delay X, and --start-offset X.

To try it, use the new --progress-bar flag when playing a bag.

@nicolaloi
Copy link
Contributor Author

nicolaloi commented Oct 15, 2024

video-example.mp4

(at 2.6-2.7 seconds of bag duration, the progress bar is slightly updating while in paused state because I was clicking the left arrow for the play next keyboard control)

Since ros2 bag play generates more logs than its ROS1 counterpart, the key aspect of this PR is to print a string shorter than any potential log message not coming from rosbag2_player. For example, with the current draft of the progress bar, this prints nicely:

[INFO] [1729028986.845909253] [rosbag2_player]: Pausing play.
 Bag Time 1718553782.525239  Duration 1.272325/6.974378 [P]

while this longer version of the progress bar doesn't work properly without extra line clearing (in this case, it's easy to clear the line since the logs come from rosbag2_player., but for "external" logs, this becomes more difficult):

[INFO] [1729028986.845909253] [rosbag2_player]: Pausing play.us:Running]
 Bag Time 1718553782.525239  Duration 1.272325/6.974378 [status:Paused]

@MichaelOrlov looking forward to your feedback, for now I have made a test MVP code to check the feasibility of the feature.

Signed-off-by: Nicola Loi <[email protected]>
@nicolaloi nicolaloi force-pushed the nicolaloi/bag-play-progress-bar branch from d503e5f to 2cb732d Compare October 15, 2024 22:24
@nicolaloi nicolaloi changed the title Start adding progress-bar for ros2 bag play [wip] Progress-bar for ros2 bag play Oct 16, 2024
@nicolaloi nicolaloi force-pushed the nicolaloi/bag-play-progress-bar branch from b815f9e to 4ba1a8c Compare October 20, 2024 22:56
@nicolaloi nicolaloi changed the title [wip] Progress-bar for ros2 bag play Progress-bar for ros2 bag play Oct 21, 2024
Copy link
Contributor

@MichaelOrlov MichaelOrlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nicolaloi Thank you for your contribution.
I have a few concerns about the design and implementation.
First, we need to limit the frequency with which we will print in std::cout. The best way to do this is to create a local variable, aka size_t progress_bar_update_freq_ (in Hz).
e.g. 1 Hz = once per/sec, 2 Hz = twice per second, 3 = three times per second etc..
Then, check the delay between the last call in the print_status(..). If the delay is more than the threshold, print; otherwise, skip.
We also need to move the check for the if (disable_progress_bar_) inside print_status(..) to make the logic cleaner and in one place.
Calling double current_time_secs = RCUTILS_NS_TO_S(static_cast<double>(clock_->now())); each time in the print_status(..) is very inefficient and expensive since we are locking mutex inside and transorming current time to the ROS time. We actually know the current ROS time at which we are playing. This is a timestamp from the message. I would suggest passing it to the print function as a parameter.
i.e., add tiemstamp parameter to the print_status(..) function and use message_ptr->recv_timestamp as a source of timestamp.

I found the implementation of the print_status(..) function a bit messy.
That is a quintessence of the modern C++17 with templates and old style plain C code.
Need to avoid using C code. Please use C++ alternatives. I am not sure if we really need to flush buffer each time. Flush is an expensive operation since this is a blocking system call. Need to avoid it if possible.

IMO templating print_status(..) function is overengineering and make code less readable and less understandable. The cleaner solution would be if using enum class values for different modes. In this case all extra wrappers like check_and_print_status() will not be needed. We will be able to call print_status(PrintStatus::Paused) directly from where we needed it.

rosbag2_transport/src/rosbag2_transport/player.cpp Outdated Show resolved Hide resolved
@nicolaloi
Copy link
Contributor Author

@MichaelOrlov thanks for the comments.

  • limit printing frequency, mutex and flushing: yes I already had a local change to allow the user to choose the update frequency from the CLI (default 10-20 Hz) to reduce the mutex locking and output stream flushing. But indeed using the timestamp of the last published message is better than calling clock_->now().
    -- The rationale of clock_->now() was to be able to have a consistent time flow if you want to print the progress bar at a fixed rate but messages are not close in time. For example, if the next message is 3 seconds in the future, for the next 3 seconds the progress bar time will appear frozen since it is printing the same timestamp. But yes in practice is a niche case.
    -- Regarding the flush, updating the progress bar without flushing will not update the printed time (or anyway it will get flushed automatically after some update calls, but there can be artifacts if other external ros logs are being printed). So instead of calling the progress bar without flushing, I would just reduce the calls to the progress bar printing with the already mentioned user-defined frequency update. I would say that 10-20 Hz can be a good default value? Or even lower to be conservative?
  • add timestamp parameter to the print_status: for this I would save it as a class attribute and use it within print_status instead of adding it as an input, since in many instances where print_status is being called there is no direct easy access to the last message.
  • print_status implementation: ok I will remove the template and use enums.

Should the progress bar be disabled by default, or enabled by default with a default low update frequency?

@nicolaloi nicolaloi force-pushed the nicolaloi/bag-play-progress-bar branch 2 times, most recently from 253256a to 44fa94a Compare October 28, 2024 23:35
nicolaloi and others added 2 commits October 28, 2024 23:38
Signed-off-by: Nicola Loi <[email protected]>
Signed-off-by: Nicola Loi <[email protected]>
@nicolaloi nicolaloi force-pushed the nicolaloi/bag-play-progress-bar branch from 44fa94a to 43fa0f8 Compare October 28, 2024 23:38
@nicolaloi
Copy link
Contributor Author

I have made changes as discussed.

Linked modifications:

I have modified the play_next function to accept a boolean input argument 'is_burst_mode' (default to false); so when play_next is called in burst mode, the print_status is not called at all, to avoid any delay (but now the progress bar will freeze during burst).

print_status will update the progress bar at the defined update frequency if the status is RUNNING; for the other states (or if the new boolean argument 'force_update' is true) it will update it disregarding the update frequency. This is to take care of the istances where a new log message from the rosbag player is being printed (like 'rate changed to 1.1', 'player paused', warnings..., anyway these are not high-frequency logging) to make the progress bar visible again. Otherwise, it will temporarily or permanently (permanently if the player gets paused) disappear since it is overwritten in the terminal by the previous log message.

@MichaelOrlov
Copy link
Contributor

@nicolaloi I have not had a chance yet to look deeply into the code after your latest changes. However, I have a few comments.

  1. I am ok with a flush; I anticipated that it is likely some constraints as you described.
  2. Making the progress bar enabled by default is a good idea. However, the frequency should be of integer type (size_t) in C++, and 10 Hz IMO is to fast. I would put 3-5 Hz at max by default.
  3. We don't need flag play_options.disable_progress_bar at all we can rely on the play_options.progress_bar_update rate. i.e. if it is 0 = disabled.

@nicolaloi
Copy link
Contributor Author

Ok, I'll set the default rate to 3 Hz.

For 2nd and 3rd points, there is no longer the --disable_progress_bar option, but a --progress_bar option that saves the frequency rate as a double, and if the value is ==0 the progress bar is disabled, and if <0 it is updated for every message. So the user has the extra option to enable it for each message if they need to, without having to set a large enough frequency rate. If it's not okay I'll change to size_t as you suggested.

Signed-off-by: Nicola Loi <[email protected]>
@nicolaloi nicolaloi force-pushed the nicolaloi/bag-play-progress-bar branch from eb9ea35 to 9cdf5ad Compare November 10, 2024 22:56
@nicolaloi
Copy link
Contributor Author

I have realized the newly added variables progress_bar_last_time_printed_ and recv_timestamp_last_published_ in commit 43fa0f8 could have synchronization issues since the function print_progress_bar_status can be called contemporaneously from a service callback (like /play, /pause, /play_next), I need to take a better look at them.

@nicolaloi
Copy link
Contributor Author

nicolaloi commented Nov 18, 2024

@MichaelOrlov So, to avoid synchronization issues without using mutexes, etc to not increase overhead, the variable progress_bar_last_time_printed_, which tracks the last update time of the progress bar, is now accessed exclusively within the specialized function update_progress_bar_after_message_published. This function is called in code sections that are already protected by an existing mutex (skip_message_in_main_play_loop_mutex_). The progress bar updates at the specified rate only during the "standard" and recurrent event of a message being published.
For all other "extraordinary" and less common events occurring in separate threads (user triggers pause, resume, change rate, play next, etc, via keyboard controls or services) the function update_progress_bar is used. To avoid synchro issues this function does not access progress_bar_last_time_printed_. It will always print the progress bar so it appears again immediately after the new log messages; however, it does not update progress_bar_last_time_printed_. If I can use a mutex or an atomic int64, I can update progress_bar_last_time_printed_ for these "extraordinary" events as well (i.e. ensuring the progress bar timing accounts for these extraordinary updates), but I don't think is needed given they are not common.

Regarding how to print the current timestamp: when a message is published, the update_progress_bar_after_message_published function is invoked with the message's timestamp as input as suggested. However, for the "extraordinary" events (user triggers pause, resume, change rate, play next, etc, via keyboard controls or services) where the timestamp is not directly available, the saved timestamp of the latest published message (recv_timestamp_last_published_) is not used anymore, since it could have synchro issues, but clock_->now() is used. This locks the inner mutex of clock_, but these events are very rare (or may not occur at all) compared to message publishing.

Extras:

  • fixed printed timestamp when extarordinary events occur while in delayed status (used an atomic bool, but it is accessed only for these rare extraordinary events), related to Rosbag2 play --clock option ignores delay #1858 (clock_ is not paused when in delayed status);
  • added burst status [B], so in case of a large number of messages being burst, the progress bar will not be frozen at [P]aused status. It will be frozen at [B]urst status, with the [B] conveying the information that messages are currently being burst (progress bar timestamp is frozen because normal continuous updates are blocked during burst to avoid delays);
  • reduced the fixed precision of the displayed bag duration, from microseconds to milliseconds (Duration 1.862/74.434), to reduce the overall length of the progress bar string. Anyway, the displayed bag time (Time 1718553783.196209) has still microsecond precision.

@nicolaloi nicolaloi marked this pull request as ready for review December 4, 2024 23:06
@MichaelOrlov
Copy link
Contributor

@nicolaloi Could you please rebase your branch and address conflicts one more time?
We pushed another PR recently which overlaps a bit with changes.

@nicolaloi
Copy link
Contributor Author

nicolaloi commented Dec 8, 2024

@xqms thanks for the suggestions. Adding extra lines is indeed a good idea to mitigate the problem. External user's log messages shorter than the progress bar string can still be smashed, but the extra lines give more safety margin.

Note that this approach assumes that you update the UI more often than log messages appear.

Due to this, I have added an extra option, --progress-bar-separation-lines (default to 2), so for these cases the user can increase the amount of blank lines. I have implemented this in the code so it can be tested. Here's a video showing 3 external user's log messages (with the second one being a short message) being printed between two progress bar updates, testing with 4,3,2,1, and 0 separation lines:

video_example.mp4

From the video, you can see that with 3 external log messages (with one being a short message), you could encounter some issues with 2,1, and 0 separation lines.

General rule: if between two progress bar updates you are expecting N external log messages and at least one is shorter than the progress bar strings, then you should set at least N --progress-bar-separation-lines. Or, you should increase the progress bar update rate.

However, if the external log messages are longer than the progress bar strings, or there will be less than "default separation lines" (2) external log messages between two progress bar updates, then everything is ok by default. Moreover, this is an issue only if the user has his own log messages on the same output of the playback player and the progress bar is enabled.

@MichaelOrlov I have also added the separator line (=======)

Signed-off-by: Nicola Loi <[email protected]>
@nicolaloi
Copy link
Contributor Author

friendly ping

@MichaelOrlov
Copy link
Contributor

@nicolaloi Sorry for the delay with the review.
Overall, while the functionality is working, the code quality is still far from the production level and needs to be improved/redesigned. Also lack of unit tests.
I will provide more details in a couple of days.

Copy link
Contributor

@MichaelOrlov MichaelOrlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @nicolaloi. Thanks for your contributions.
The code needs to be refactored to adhere to the single responsibility principle.
The progress bar is a good candidate to be a separate self-contained class and software component with its own properties and behavior which should be covered by unit tests.

I've made such refactoring as well as many other fixes as I think make sense for progress bar implementation.
Please review and feel free to cherry-pick my changes in a separate draft PR here https://github.com/ros2/rosbag2/pull/1902/files/c4a723136795a54ce09865655fb32ce8a597d9f6..42e5134e76c2eed227e3797a660ec9340c9f2a77.
Please note that c4a7231 commit is all your commits from this PR squashed in one and rebased on top of latest master and the rest are mine.

Also please implement the missing unit test for the progress bar mentioned as TODO here 42e5134. The progress bar now can output to a provided std::ostringstream during construction. That should simplify writing unit tests to be able to deterministically verify output from it.

@nicolaloi
Copy link
Contributor Author

Yes at the very beginning, the progress bar was a separate class (not a pointer to implementation though) but then I integrated it in the player class in commit 4ba1a8c. Regarding the tests, I was waiting for a definite "progress bar style".
Thanks for the updates.

clalancette and others added 19 commits January 30, 2025 00:16
* Add test_xmllint.py to python packages.

This ensures we'll detect any invalid XML in e.g. the
package.xml

Signed-off-by: Chris Lalancette <[email protected]>

* Add in test_depend on ament_xmllint.

Signed-off-by: Chris Lalancette <[email protected]>

---------

Signed-off-by: Chris Lalancette <[email protected]>
…#1894)

* Invert start discovery condition
Repeated calls to start_discovery should not start additional topics discovery tasks without stop_discovery calls

Signed-off-by: Øystein Sture <[email protected]>

* Rename stop_discovery to discovery_running and change logic accordingly

Signed-off-by: Øystein Sture <[email protected]>

---------

Signed-off-by: Øystein Sture <[email protected]>
- Rationale: We will update progress bar with check rate from main
playback loop even in burst mode.

Signed-off-by: Michael Orlov <[email protected]>
- Rationale: To adhere single responsibility principle

Signed-off-by: Michael Orlov <[email protected]>
- Also added appropriate Doxygen comments for public API

Signed-off-by: Michael Orlov <[email protected]>
- Rationale: To be able to redirect progress bar output to the
 std::ostringstream in unit tests.

Signed-off-by: Michael Orlov <[email protected]>
- The progress_bar_update_period_ms_ equal to 0 means update always.
- Also use integer arithmetic for the sake of performance.

Signed-off-by: Michael Orlov <[email protected]>
Signed-off-by: Michael Orlov <[email protected]>
Signed-off-by: Nicola Loi <[email protected]>
@nicolaloi
Copy link
Contributor Author

  • I wrote the unit tests using loose/short regex to avoid multiple repetitions of "=== progress bar ===" or "\x1B[2K\n\x1B[2K..." in half of the tests. These substrings are specifically tested in a few tests only. Is this okay, or should the regex be more strict?
  • Could you clarify the target string being tested in update_with_limited_rate_with_zero_timestamp? Also, what should be the difference in the tested string between update_with_limited_rate_update_progress and eg update_with_limited_rate_with_negative_update_rate?
  • The DCO test is failing because I fast-forwarded your changes with a merge to preserve authorship, but it looks like I should have added my sign-off to the commits. Should I force-push after running git rebase HEAD~X --signoff to keep these commits, or should I force-push a single squashed commit instead?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add progress bar and timestamp when playing back a bag
7 participants