RL Theory

RL theory is not properly represented. A new section should be added, covering at least:

* Tabular setting
  * With a generative model
    * QVI
  * Without
    * UCRL2
    * UCBVI
  * Episodic
    * Q-learning+UCB
* Extensions to compact state-action spaces
* Extensions to kernels
* Performance measures: PAC, simple regret, cumulative regret, etc.
* RL with compatible function approximation
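
To give an idea of what the "tabular, with a generative model" entry could look like, here is a minimal sketch of model-based Q-value iteration: draw a fixed number of samples from every state-action pair, build an empirical model, then iterate the empirical Bellman optimality operator. The function name `qvi`, the `sample(s, a)` interface and the two-state `toy_sample` MDP are all assumptions made for illustration, not taken from any particular paper or library.

```python
def qvi(sample, n_states, n_actions, n_samples=20, gamma=0.9, n_iters=200):
    """Sketch of model-based Q-value iteration with a generative model.

    `sample(s, a)` is an assumed interface returning (next_state, reward).
    """
    # 1) Estimate transition probabilities and mean rewards
    #    from n_samples calls per (state, action) pair.
    p_hat = [[[0.0] * n_states for _ in range(n_actions)] for _ in range(n_states)]
    r_hat = [[0.0] * n_actions for _ in range(n_states)]
    for s in range(n_states):
        for a in range(n_actions):
            for _ in range(n_samples):
                s_next, r = sample(s, a)
                p_hat[s][a][s_next] += 1.0 / n_samples
                r_hat[s][a] += r / n_samples

    # 2) Iterate the empirical Bellman optimality operator.
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(n_iters):
        v = [max(q[s]) for s in range(n_states)]
        q = [
            [
                r_hat[s][a] + gamma * sum(p_hat[s][a][t] * v[t] for t in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
    return q


def toy_sample(state, action):
    """Hypothetical deterministic MDP: action a leads to state a, reward 1 in state 1."""
    next_state = action
    reward = 1.0 if next_state == 1 else 0.0
    return next_state, reward


q = qvi(toy_sample, n_states=2, n_actions=2)
greedy = [max(range(2), key=lambda a: q[s][a]) for s in range(2)]
```

On this toy problem the greedy policy always picks action 1. With a stochastic model, the number of samples per pair would drive the accuracy of the estimate, which is where the sample-complexity analysis comes in.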

Is there a difference between generative models (sample any transition) and simulators (simulate trajectories from current states only)?
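
One way to make the distinction concrete is through the interface each one exposes. The sketch below is only an illustration under that reading: `ToyChain`, `GenerativeModel` and `Simulator` are hypothetical names, and the `reset`/`step` naming is an assumption borrowed from common environment APIs.

```python
import random


class ToyChain:
    """Hypothetical 3-state chain MDP, used only for illustration."""

    n_states = 3
    n_actions = 2  # 0 = left, 1 = right

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def transition(self, state, action):
        # The intended move succeeds with probability 0.8;
        # otherwise the state is unchanged.
        move = 1 if action == 1 else -1
        if self.rng.random() < 0.8:
            state = min(max(state + move, 0), self.n_states - 1)
        reward = 1.0 if state == self.n_states - 1 else 0.0
        return state, reward


class GenerativeModel:
    """Can be queried at ANY (state, action) pair."""

    def __init__(self, mdp):
        self.mdp = mdp

    def sample(self, state, action):
        return self.mdp.transition(state, action)


class Simulator:
    """Only samples transitions from its own current state."""

    def __init__(self, mdp, initial_state=0):
        self.mdp = mdp
        self.initial_state = initial_state
        self.state = initial_state

    def reset(self):
        self.state = self.initial_state
        return self.state

    def step(self, action):
        self.state, reward = self.mdp.transition(self.state, action)
        return self.state, reward


mdp = ToyChain(seed=0)
gen = GenerativeModel(mdp)
sim = Simulator(mdp)

far_state, _ = gen.sample(2, 0)  # any state can be queried directly
start = sim.reset()
next_state, _ = sim.step(1)      # only the current state can be advanced
```

Under this reading, a generative model can emulate a simulator by tracking the current state itself, whereas the converse needs some way to reset to arbitrary states.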
