Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RLlib] - Add timers to env step, forward pass, and complete connector pipelines runs. #51160

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

simonsays1980
Copy link
Collaborator

Why are these changes needed?

Momentarily, we time solely the complete sampling process as well as single connector pieces inside of the connector pipelines. Important measurements including the inference/exploration forward pass of the RLModule, the times for an environment step, or the complete time for a run of the EnvToModulePipeline and ModuleToEnvPipeline are not timed. This PR proposes necessary changes to include these measurements into our metrics.

Related issue number

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@simonsays1980 simonsays1980 marked this pull request as ready for review March 7, 2025 13:30
@simonsays1980 simonsays1980 requested a review from sven1977 as a code owner March 7, 2025 13:30
@edoakes edoakes added the go add ONLY when ready to merge, run all tests label Mar 7, 2025
Copy link
Contributor

@ArturNiederfahrenhorst ArturNiederfahrenhorst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice changes! High level feedback before I approve:
We should move the timer context managers inside of the connector pipeline calls.
Same goes for _try_env_step! Let's please move the timer in there.

@@ -320,7 +327,8 @@ def _sample(
actions = to_env.pop(Columns.ACTIONS)
actions_for_env = to_env.pop(Columns.ACTIONS_FOR_ENV, actions)
# Try stepping the environment.
results = self._try_env_step(actions_for_env)
with self.metrics.log_time(ENV_STEP_TIMER):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The metrics context can go inside _try_env_step so we don't have to duplicate the call to it for single- and multi-agent env runners.

shared_data=shared_data,
metrics=self.metrics,
)
with self.metrics.log_time(MODULE_TO_ENV_TIMER):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We time all connector calls of each ConnectorPipeline.
That's why we pass the metrics object.
We should also encapsulate the timing of the overall connector pipeline call in there so that we don't duplicate this code everywhere (deduplication will make this easier to maintain)

Each connectorpipeline can own its name for logging purposes and we can also take this opportunity to prepend a meaningful prefix to each connector logging entry.

For example log the ClipReward time could be logged as ".....env_to_agent_connector_pipeline.clip_reward" and the overall env to agent pipeline could be logged as "....env_to_agent_connector_pipeline" or something like this.

ENV_STEP_TIMER = "env_step_timer"
ENV_TO_MODULE_TIMER = "env_to_module_timer"
RLMODULE_INFERENCE_TIMER = "rlmodule_inference_timer"
MODULE_TO_ENV_TIMER = "module_to_env_timer"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should take this opportunity to make the names of our metrics more expressive.
Metrics should self-document as much as possible for users who do not have a good overview of RLlib.
"env_step_timer" is good imo. "env_to_module_timer" should be "env_to_module_connector_pipeline_timer" or something similar. Same goes for the others 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes in this file are nice.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
go add ONLY when ready to merge, run all tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants