Candidate Principle: CommonRLInterface does NOT provide any default implementations! #6
Comments
To make things easier, we could provide a
Now I'm fully convinced by your reasons. 👍
I agree with your reasons completely, but let me add to (1) more. Environments that don't implement optional interfaces should still work with solvers/algorithms that rely on them as much as possible. Asking the environment writer to provide defaults is work (even if it is easy), and asking the algorithm writer to make fallback functions to dispatch on is also work. The point of the package is to make the entire RL ecosystem work better together. I think defaults are a key part of that. For all the reasons you explained, I don't think we should provide defaults. But I think there are good alternatives:
These approaches avoid the miscommunication mentioned, because they still force the environment writer to put down code making a decision whether to use a default. But they alleviate a lot of the burden, because once I accept that I want a default, then I just need to know
For tinkerers (who write neither the algorithm nor the environment), it also allows them to provide a default easily when an environment writer doesn't. It may fail, but they know what caused it, and it is an easy way to get code up and running.
Ok, let's consider this principle adopted. #22 will discuss the design of an opt-in default system.
I propose that we make a principled decision to not provide default implementations for any optional interface functions in this package.
There are two types of default implementations:
clone(env) = deepcopy(env)
validactions(env) = actions(env)[actionmask(env)]
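The two kinds of default above can be sketched in a self-contained way. This is only an illustration: the generic functions below are local stand-ins declared in the snippet itself (not imported from CommonRLInterface), and `ToyEnv` is a hypothetical environment invented for the example.

```julia
# Local stand-ins for the interface functions (assumption: declared here
# only so the snippet runs without loading any package).
function actions end
function actionmask end
function clone end
function validactions end

# Type 1: a default built from a function outside the interface (Base.deepcopy).
clone(env) = deepcopy(env)

# Type 2: a default built from other functions of the interface.
validactions(env) = actions(env)[actionmask(env)]

# Hypothetical toy environment, used only for illustration.
mutable struct ToyEnv
    position::Int
end

actions(::ToyEnv) = [:left, :stay, :right]
# Moving left is disallowed at position 1.
actionmask(env::ToyEnv) = [env.position > 1, true, true]

env = ToyEnv(1)

# The type-1 default gives an independent copy of the environment.
copy_of_env = clone(env)
copy_of_env.position = 5

# The type-2 default works only because ToyEnv happens to implement both
# `actions` and `actionmask`.
valid = validactions(env)
```

Neither default required the writer of `ToyEnv` to mention `clone` or `validactions`, which is the convenience (and the hidden coupling) the proposal is weighing.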
I see two reasons for having default implementations
I think it is wiser to make it a principle that we do not provide either type of default implementation for interface functions. Here is my reasoning:
@provide clone(m::MyEnv) = deepcopy(m), especially if that default is suggested. The more important thing is communication - it is clear to everyone that the algorithm is using that part of the interface and the environment writer consciously implemented it. On average, it might reduce the workload slightly to have default implementations, but in some cases, it will increase the workload hugely if there is confusion. We want to minimize confusion more than we want to maximize convenience.
validactions(env) = actions(env)[actionmask(env)]. Then you might get a MethodError for actionmask. This would be very hard for you to reason about, and you might even conclude that you can't use the algorithm because you would need to implement actionmask and you don't know how to do that for continuous spaces.
What do other people think? I am definitely open to debate on this.
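The confusing failure described above can be reproduced with a small sketch. It assumes a silent package-level default for `validactions`; the generic functions are again local stand-ins rather than the real interface, and `ContinuousEnv` and `first_valid_action` are hypothetical names made up for the example.

```julia
# Local stand-ins for the interface functions (assumption: not the real package).
function actions end
function actionmask end
function validactions end

# The kind of silent default under discussion.
validactions(env) = actions(env)[actionmask(env)]

# Hypothetical continuous-action environment: it implements `actions`
# but has no sensible `actionmask`.
struct ContinuousEnv end
actions(::ContinuousEnv) = (-1.0, 1.0)  # interval bounds, purely illustrative

# Hypothetical solver code that only ever calls `validactions`.
first_valid_action(env) = first(validactions(env))

# Capture the error instead of letting it abort the script.
err = try
    first_valid_action(ContinuousEnv())
    nothing
catch e
    e
end

# The error refers to `actionmask`, a function that neither the solver
# nor the environment calls or defines a method for.
is_method_error = err isa MethodError
blames_actionmask = is_method_error && err.f === actionmask
```

Without the default, the same call would fail with a `MethodError` for `validactions` itself, which points directly at the function the environment writer needs to implement.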