Improving Collaboration: Separate out the environment interface #954

Open
zsunberg opened this issue Aug 9, 2023 · 4 comments

@zsunberg
Member

zsunberg commented Aug 9, 2023

Hi everyone,

It has been cool to see the recent flurry of contributions to this package, especially by @jeremiahpslewis. In a recent discussion, someone asked what would facilitate cooperation between the POMDPs.jl and JuliaRL communities. I was thinking about this a bit more and came to the conclusion:

Separating out the environment interface would be the most helpful change for expanding collaboration.

There are a few reasons for this:

  1. There are many different reasons for writing RL algorithms. I assign homework in which students write RL algorithms ranging from tabular SARSA to DQN or policy gradient; someone else might want a single very-high-performance PPO to reliably deploy to a web service; another person might want a library of research-quality algorithms to compare against; another person might want a CleanRL-style set of implementations that maximize readability. These should not all be in the same package, but they should use the same environment interface.
  2. Since this environment interface will have many stakeholders, there must be a way for all of the stakeholders to monitor and weigh in on interface design decisions. Currently, any discussion about the environment interface will also be mixed in with discussion about GPUs, hooks, etc.
  3. Suppose I write a package that uses the environment interface in RLCore.jl but not the policy interface. If I say I use RLCore.jl, it is unclear whether I am committing to just the environment interface or also to the policy interface. And if a user wants to write an environment, they will find the RL.jl documentation and could be very distracted by all of the information about experiments, agents, etc., which my package does not use.
  4. It would be easier to understand the environment interface if it and its documentation were separated from the RL.jl documentation (though the current environment interface documentation has improved a lot already!).
  5. In the successful Python RL ecosystem, the environment interface in gym/gymnasium/pettingzoo is separated from the packages that implement learning agents.

If the environment interface is separated out (and is sufficiently flexible), I would probably convert some important packages like MCTS and POMCP to use it. Then they could be much more compatible with RL.jl.

A final note: In principle, CommonRLInterface could be a candidate for a separated-out environment interface, but I do not think it can be successful unless RL.jl chooses to use it directly. To be clear, I would vigorously advocate for this, and I am happy to discuss why, but I recognize that this would be biased since I wrote most of that package.

@jeremiahpslewis
Member

Thanks for kicking off this discussion! What do you mean by "use CommonRL directly"? I've been wondering for some time whether we should use naming consistent with CommonRL. I haven't gone through every method in CommonRL, so I'm not willing to commit to the exact naming 100%, but I would love to converge on one set of terms and APIs. My thought would be that we first do this for the methods which are already included in CommonRL, then in a second step look into envs. Thoughts? @HenriDeh

@jeremiahpslewis
Member

jeremiahpslewis commented Aug 9, 2023

Concretely, I mean things like

CRL.valid_actions(x::CommonRLEnv) = legal_action_space(x.env)
where I'd be happy to use valid_actions and drop legal_action_space
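
A minimal sketch of how such a rename could be staged, assuming the old name is kept for a while as a deprecated forwarder. `ToyInterface`, `ToyEnv`, and the `[:left]` action are hypothetical stand-ins for illustration only; this is not the actual code of CommonRLInterface or RLBase.

```julia
# Hypothetical stand-in for a shared interface package (not the real CommonRLInterface or RLBase).
module ToyInterface

export AbstractEnv, valid_actions, legal_action_space

abstract type AbstractEnv end

# New shared name; environments add methods to this function.
function valid_actions end

# Old name kept temporarily: it emits a deprecation warning and forwards to the new name.
function legal_action_space(env::AbstractEnv)
    Base.depwarn("`legal_action_space` is deprecated, use `valid_actions` instead.",
                 :legal_action_space)
    return valid_actions(env)
end

end # module

using .ToyInterface

# Hypothetical toy environment in which only one action is currently allowed.
struct ToyEnv <: AbstractEnv end
ToyInterface.valid_actions(::ToyEnv) = [:left]

env = ToyEnv()
new_result = valid_actions(env)       # call through the new name
old_result = legal_action_space(env)  # old name still works during the deprecation period
```

With this arrangement, environments only need to implement `valid_actions`, and callers of the old name keep working until the forwarder is removed.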

@zsunberg
Member Author

zsunberg commented Aug 9, 2023

> What do you mean by use CommonRL directly?

By this option, I mean completely deprecating and then removing RLCore.AbstractEnv and using CommonRL.AbstractEnv and the methods from CommonRL everywhere within RL.jl. This would be a big change, and I don't understand all the consequences yet.

@HenriDeh
Member

> Thoughts? @HenriDeh

I am so in favor of this. I don't think it would be that overwhelming of a change. Deprecating first, then dropping is a good idea because many algorithms are not tested at the moment.


3 participants