Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Candidate Principle: CommonRLInterface does NOT provide any default implementations! #6

Closed
zsunberg opened this issue Jun 16, 2020 · 4 comments
Milestone

Comments

@zsunberg
Copy link
Member

zsunberg commented Jun 16, 2020

I propose that we make a principled decision to not provide default implementations for any optional interface functions in this package.

There are two types of default implementations:

  1. Implementations in terms of core julia functions, e.g. clone(env) = deepcopy(env)
  2. Implementations in terms of other interface functions, e.g. validactions(env) = actions(env)[actionmask(env)]

I see two reasons for having default implementations

  1. It reduces the workload on the environment implementer
  2. It makes more of the interface available for algorithms

I think it is wiser to make it a principle that we do not provide either type of default implementation for interface functions. Here is my reasoning:

  1. I don't think reason (1) is very compelling. It is not very inconvenient to to add the one line @provide clone(m::MyEnv) = deepcopy(env), especially if that default is suggested. The more important thing is communication - it is clear to everyone that the algorithm is using that part of the interface and the environment writer consciously implemented it. On average, it might reduce the workload slightly to have default implementations, but in some cases, it will increase the workload hugely if there is confusion. We want to minimize confusion more than maximizing convenience.
  2. Error messages get very strange very quickly, and it is not always clear which defaults to implement. For example, suppose you have a continuous action space and there is a default validactions(env) = actions(env)[actionmask(env)]. Then you might get a MethodError for actionmask. This would be very hard for you to think about, and you might even conclude that you can't use the algorithm because you need to implement actionmask and you don't know how to do that for continuous spaces.

What do other people think? I am definitely open to debate on this

@zsunberg
Copy link
Member Author

To make things easier, we could provide a suggest_default mechanism. E.g. suggest_default(clone, env) could print @provide clone(m::MyEnv) = deepcopy(env).

@zsunberg zsunberg added the decision A decision that needs to be made label Jun 16, 2020
@zsunberg zsunberg added this to the v0.2 milestone Jun 16, 2020
@findmyway
Copy link
Member

Now I'm fully convinced by your reasons. 👍

@darsnack
Copy link

I agree with your reasons completely, but let me add to (1) more. Environments that don't implement optional interfaces should still work with solvers/algorithms that rely on them as much as possible. Asking the environment writer to provide defaults is work (even if it is easy), and asking the algorithm writer to make fallback functions to dispatch on is also work. The point of the package is to make the entire RL ecosystem work better together. I think defaults are a key part of that.

For all the reasons you explained, I don't think we should provide defaults. But I think there are good alternatives:

  • Have a non-exported submodule of defaults, and make something like @default which calls the default function from the submodule
  • Have a separate package that does the same

These approaches avoid the miscommunication mentioned, because they still force the environment writer to put down code making a decision whether to use a default. But the alleviate a lot of burden, because once I accept that I want a default, then I just need to know @default.

For tinkerers (who neither write the algorithm nor the environment), it also allows them to provide a default easily when an environment writer doesn't. It may fail, but they know what caused it, and it is an easy way to get code up and running.

@zsunberg zsunberg added decision-made and removed decision A decision that needs to be made labels Jun 30, 2020
@zsunberg
Copy link
Member Author

Ok, let's consider this principle adopted. #22 will discuss the design of an opt-in default system

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants