[RLlib] Create resource bundle per learner #59620
Conversation
Signed-off-by: Kamil Kaczmarek <[email protected]>
Code Review
This pull request refactors the creation of resource bundles for learners in RLlib. Instead of creating a single large bundle for all learners, it now creates a separate bundle for each learner. This is a good change that allows for more flexible scheduling of learners across a cluster. The removal of the extension point for _get_learner_bundles also simplifies the code.
My review includes one suggestion to apply the same bundling logic to aggregator actors when no remote learners are used, for consistency.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Kamil Kaczmarek <[email protected]>
simonsays1980
left a comment
Thanks for taking ownership here @kamil-kaczmarek! The solution you propose looks good. There is one remaining question to be answered.
```python
{
    "CPU": num_cpus_per_learner + config.num_aggregator_actors_per_learner,
    "GPU": config.num_gpus_per_learner,
}
for _ in range((config.num_learners))
```
This looks more correct than before, but I am wondering whether this still ensures that AggregatorActors are scheduled on the same node as their learner, since we do not use placement groups. Could an EnvRunner theoretically be scheduled onto CPUs of that node instead of an AggregatorActor?
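The co-location concern above could in principle be addressed with a Ray placement group. A minimal sketch (hypothetical, not part of this PR) that builds the same per-learner bundle shapes as the diff and shows how they would be handed to a placement group; all variable names here are illustrative assumptions:

```python
# Hypothetical sketch: reserving each learner's CPUs/GPUs together with the
# CPUs of its aggregator actors, so other actors (e.g. EnvRunners) cannot
# claim them. Values below are illustrative, not from the PR.
num_learners = 2
num_cpus_per_learner = 1
num_gpus_per_learner = 1
num_aggregator_actors_per_learner = 2

# One bundle per learner, mirroring the diff above.
bundles = [
    {
        "CPU": num_cpus_per_learner + num_aggregator_actors_per_learner,
        "GPU": num_gpus_per_learner,
    }
    for _ in range(num_learners)
]

# With Ray installed, these bundles could be reserved as a group, e.g.:
#   from ray.util.placement_group import placement_group
#   pg = placement_group(bundles, strategy="PACK")
# "PACK" asks Ray to place the bundles on as few nodes as possible; each
# learner and its aggregator actors would then share one reserved bundle.
print(bundles)
```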
Signed-off-by: Mark Towers <[email protected]>
… for spotting) Signed-off-by: Mark Towers <[email protected]>
simonsays1980
left a comment
LGTM. Thanks for this important change @kamil-kaczmarek !
Description
Create a resource bundle for each learner instead of packing all learners into a single bundle.
Related to #51017
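The per-learner bundling described above can be sketched as follows. This is a simplified illustration of the idea, not the actual RLlib implementation; the function name and standalone parameters are assumptions for the example:

```python
def learner_bundles(
    num_learners: int,
    num_cpus_per_learner: int,
    num_gpus_per_learner: int,
    num_aggregator_actors_per_learner: int,
) -> list[dict]:
    """Build one resource bundle per learner (hypothetical helper).

    Previously, a single large bundle covered all learners, which forced
    them onto one node. Emitting one bundle per learner lets the scheduler
    spread learners across the cluster.
    """
    return [
        {
            # Each learner's CPUs plus one CPU per aggregator actor.
            "CPU": num_cpus_per_learner + num_aggregator_actors_per_learner,
            "GPU": num_gpus_per_learner,
        }
        for _ in range(num_learners)
    ]


# Usage example with illustrative values:
bundles = learner_bundles(
    num_learners=4,
    num_cpus_per_learner=2,
    num_gpus_per_learner=1,
    num_aggregator_actors_per_learner=2,
)
print(bundles)  # four identical bundles, each {"CPU": 4, "GPU": 1}
```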