Open
Description
RL Theory is not properly represented. A new section should be added, with at least:
- Tabular setting
  - With a generative model
    - QVI
  - Without
    - UCRL2
    - UCBVI
  - Episodic
    - Q-learning+UCB
    - With a generative model
- Extensions to compact state-action spaces
  - Extension to Kernels
- Performance measures: PAC, simple regret, cumulative regret, etc.
- RL with compatible function approximation
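
For reference, the performance measures listed above can be written roughly as follows. This is only a sketch in an assumed episodic notation that is not from this issue: $K$ episodes, policy $\pi_k$ played in episode $k$, initial state $s_{k,1}$ (or a fixed $s_1$), optimal value $V^*$, and $\hat\pi$ the policy returned after the sampling budget is spent. Exact definitions vary between papers.

$$\mathrm{Regret}(K) = \sum_{k=1}^{K} \left( V^*(s_{k,1}) - V^{\pi_k}(s_{k,1}) \right) \qquad \text{(cumulative regret)}$$

$$r_n = V^*(s_1) - V^{\hat\pi_n}(s_1) \qquad \text{(simple regret after } n \text{ samples)}$$

$$\mathbb{P}\left( V^*(s_1) - V^{\hat\pi}(s_1) \le \varepsilon \right) \ge 1 - \delta \qquad \text{(}(\varepsilon, \delta)\text{-PAC)}$$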
Is there a difference between generative models (sample any transition) and simulators (simulate trajectories from current states only)?
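
To make the question concrete, here is a minimal sketch of the two access models as interfaces. All class and method names (`TabularMDP`, `GenerativeModel.sample`, `TrajectorySimulator.step`) are hypothetical and chosen for illustration only; they are not taken from any library or paper.

```python
import numpy as np


class TabularMDP:
    """Hypothetical tabular MDP: P[s, a] is a distribution over next states."""

    def __init__(self, P, R, seed=0):
        self.P = np.asarray(P, dtype=float)  # shape (S, A, S)
        self.R = np.asarray(R, dtype=float)  # shape (S, A)
        self.rng = np.random.default_rng(seed)

    def draw(self, s, a):
        # Sample one next state from P[s, a] and return it with the reward.
        s_next = int(self.rng.choice(self.P.shape[2], p=self.P[s, a]))
        return s_next, float(self.R[s, a])


class GenerativeModel:
    """Generative-model access: query ANY (state, action) pair at any time."""

    def __init__(self, mdp):
        self.mdp = mdp

    def sample(self, s, a):
        return self.mdp.draw(s, a)


class TrajectorySimulator:
    """Simulator access: transitions only from the current state."""

    def __init__(self, mdp, initial_state=0):
        self.mdp = mdp
        self.initial_state = initial_state
        self.state = initial_state

    def reset(self):
        self.state = self.initial_state
        return self.state

    def step(self, a):
        # The caller cannot choose the state; it is whatever the trajectory reached.
        s_next, r = self.mdp.draw(self.state, a)
        self.state = s_next
        return s_next, r


# Deterministic 3-state, 2-action example (assumed for illustration).
P = np.zeros((3, 2, 3))
P[0, 0, 0] = P[0, 1, 1] = 1.0
P[1, 0, 0] = P[1, 1, 2] = 1.0
P[2, 0, 2] = P[2, 1, 2] = 1.0
R = np.array([[0.0, 0.0], [0.0, 1.0], [0.0, 0.0]])

gen = GenerativeModel(TabularMDP(P, R))
sim = TrajectorySimulator(TabularMDP(P, R))
```

With `gen`, state 2 can be queried directly via `gen.sample(2, 0)`; with `sim`, it can only be reached by first calling `sim.reset()` and then stepping through states 0 and 1. Under this reading, a generative model is strictly stronger access than a trajectory simulator, unless the simulator can be reset to arbitrary states.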
Activity
Add a Theory section
Add sample complexity of RL with generative model
Add UCBVI, improving over UCRL2
Add a paper on different RL theory settings and possible conversions
Add LSVI with UCB for linear mdps