@@ -246,16 +246,17 @@ P.S: The `.py` file in `Runnable Demo` can be found in `dizoo`
| 41 | [CQL](https://arxiv.org/pdf/2006.04779.pdf) | ![offline](https://img.shields.io/badge/-offlineRL-darkblue) | [CQL doc](https://di-engine-docs.readthedocs.io/en/latest/12_policies/cql.html)<br>[policy/cql](https://github.com/opendilab/DI-engine/blob/main/ding/policy/cql.py) | python3 -u d4rl_cql_main.py |
| 42 | [TD3BC](https://arxiv.org/pdf/2106.06860.pdf) | ![offline](https://img.shields.io/badge/-offlineRL-darkblue) | [TD3BC doc](https://di-engine-docs.readthedocs.io/en/latest/12_policies/td3_bc.html)<br>[policy/td3_bc](https://github.com/opendilab/DI-engine/blob/main/ding/policy/td3_bc.py) | python3 -u d4rl_td3_bc_main.py |
| 43 | [Decision Transformer](https://arxiv.org/pdf/2106.01345.pdf) | ![offline](https://img.shields.io/badge/-offlineRL-darkblue) | [policy/dt](https://github.com/opendilab/DI-engine/blob/main/ding/policy/decision_transformer.py) | python3 -u d4rl_dt_main.py |
- | 44 | MBSAC([SAC](https://arxiv.org/abs/1801.01290)+[MVE](https://arxiv.org/abs/1803.00101)+[SVG](https://arxiv.org/abs/1510.09142)) | ![continuous](https://img.shields.io/badge/-continous-green) ![mbrl](https://img.shields.io/badge/-ModelBasedRL-lightblue) | [policy/mbpolicy/mbsac](https://github.com/opendilab/DI-engine/blob/main/ding/policy/mbpolicy/mbsac.py) | python3 -u pendulum_mbsac_mbpo_config.py \ python3 -u pendulum_mbsac_ddppo_config.py |
- | 45 | STEVESAC([SAC](https://arxiv.org/abs/1801.01290)+[STEVE](https://arxiv.org/abs/1807.01675)+[SVG](https://arxiv.org/abs/1510.09142)) | ![continuous](https://img.shields.io/badge/-continous-green) ![mbrl](https://img.shields.io/badge/-ModelBasedRL-lightblue) | [policy/mbpolicy/mbsac](https://github.com/opendilab/DI-engine/blob/main/ding/policy/mbpolicy/mbsac.py) | python3 -u pendulum_stevesac_mbpo_config.py |
- | 46 | [MBPO](https://arxiv.org/pdf/1906.08253.pdf) | ![mbrl](https://img.shields.io/badge/-ModelBasedRL-lightblue) | [MBPO doc](https://di-engine-docs.readthedocs.io/en/latest/12_policies/mbpo.html)<br>[world_model/mbpo](https://github.com/opendilab/DI-engine/blob/main/ding/world_model/mbpo.py) | python3 -u pendulum_sac_mbpo_config.py |
- | 47 | [DDPPO](https://openreview.net/forum?id=rzvOQrnclO0) | ![mbrl](https://img.shields.io/badge/-ModelBasedRL-lightblue) | [world_model/ddppo](https://github.com/opendilab/DI-engine/blob/main/ding/world_model/ddppo.py) | python3 -u pendulum_mbsac_ddppo_config.py |
- | 48 | [PER](https://arxiv.org/pdf/1511.05952.pdf) | ![other](https://img.shields.io/badge/-other-lightgrey) | [worker/replay_buffer](https://github.com/opendilab/DI-engine/blob/main/ding/worker/replay_buffer/advanced_buffer.py) | `rainbow demo` |
- | 49 | [GAE](https://arxiv.org/pdf/1506.02438.pdf) | ![other](https://img.shields.io/badge/-other-lightgrey) | [rl_utils/gae](https://github.com/opendilab/DI-engine/blob/main/ding/rl_utils/gae.py) | `ppo demo` |
- | 50 | [ST-DIM](https://arxiv.org/pdf/1906.08226.pdf) | ![other](https://img.shields.io/badge/-other-lightgrey) | [torch_utils/loss/contrastive_loss](https://github.com/opendilab/DI-engine/blob/main/ding/torch_utils/loss/contrastive_loss.py) | ding -m serial -c cartpole_dqn_stdim_config.py -s 0 |
- | 51 | [PLR](https://arxiv.org/pdf/2010.03934.pdf) | ![other](https://img.shields.io/badge/-other-lightgrey) | [PLR doc](https://di-engine-docs.readthedocs.io/en/latest/12_policies/plr.html)<br>[data/level_replay/level_sampler](https://github.com/opendilab/DI-engine/blob/main/ding/data/level_replay/level_sampler.py) | python3 -u bigfish_plr_config.py -s 0 |
- | 52 | [PCGrad](https://arxiv.org/pdf/2001.06782.pdf) | ![other](https://img.shields.io/badge/-other-lightgrey) | [torch_utils/optimizer_helper/PCGrad](https://github.com/opendilab/DI-engine/blob/main/ding/data/torch_utils/optimizer_helper.py) | python3 -u multi_mnist_pcgrad_main.py -s 0 |
- | 53 | [edac](https://arxiv.org/pdf/2110.01548.pdf) | ![offline](https://img.shields.io/badge/-offlineRL-darkblue) | [EDAC doc](https://di-engine-docs.readthedocs.io/en/latest/12_policies/edac.html)<br>[policy/edac](https://github.com/opendilab/DI-engine/blob/main/ding/policy/edac.py) | python3 -u d4rl_edac_main.py |
+ | 44 | [EDAC](https://arxiv.org/pdf/2110.01548.pdf) | ![offline](https://img.shields.io/badge/-offlineRL-darkblue) | [EDAC doc](https://di-engine-docs.readthedocs.io/en/latest/12_policies/edac.html)<br>[policy/edac](https://github.com/opendilab/DI-engine/blob/main/ding/policy/edac.py) | python3 -u d4rl_edac_main.py |
+ | 45 | MBSAC([SAC](https://arxiv.org/abs/1801.01290)+[MVE](https://arxiv.org/abs/1803.00101)+[SVG](https://arxiv.org/abs/1510.09142)) | ![continuous](https://img.shields.io/badge/-continous-green) ![mbrl](https://img.shields.io/badge/-ModelBasedRL-lightblue) | [policy/mbpolicy/mbsac](https://github.com/opendilab/DI-engine/blob/main/ding/policy/mbpolicy/mbsac.py) | python3 -u pendulum_mbsac_mbpo_config.py \ python3 -u pendulum_mbsac_ddppo_config.py |
+ | 46 | STEVESAC([SAC](https://arxiv.org/abs/1801.01290)+[STEVE](https://arxiv.org/abs/1807.01675)+[SVG](https://arxiv.org/abs/1510.09142)) | ![continuous](https://img.shields.io/badge/-continous-green) ![mbrl](https://img.shields.io/badge/-ModelBasedRL-lightblue) | [policy/mbpolicy/mbsac](https://github.com/opendilab/DI-engine/blob/main/ding/policy/mbpolicy/mbsac.py) | python3 -u pendulum_stevesac_mbpo_config.py |
+ | 47 | [MBPO](https://arxiv.org/pdf/1906.08253.pdf) | ![mbrl](https://img.shields.io/badge/-ModelBasedRL-lightblue) | [MBPO doc](https://di-engine-docs.readthedocs.io/en/latest/12_policies/mbpo.html)<br>[world_model/mbpo](https://github.com/opendilab/DI-engine/blob/main/ding/world_model/mbpo.py) | python3 -u pendulum_sac_mbpo_config.py |
+ | 48 | [DDPPO](https://openreview.net/forum?id=rzvOQrnclO0) | ![mbrl](https://img.shields.io/badge/-ModelBasedRL-lightblue) | [world_model/ddppo](https://github.com/opendilab/DI-engine/blob/main/ding/world_model/ddppo.py) | python3 -u pendulum_mbsac_ddppo_config.py |
+ | 49 | [DreamerV3](https://arxiv.org/pdf/2301.04104.pdf) | ![mbrl](https://img.shields.io/badge/-ModelBasedRL-lightblue) | [world_model/dreamerv3](https://github.com/opendilab/DI-engine/blob/main/ding/world_model/dreamerv3.py) | python3 -u cartpole_balance_dreamer_config.py |
+ | 50 | [PER](https://arxiv.org/pdf/1511.05952.pdf) | ![other](https://img.shields.io/badge/-other-lightgrey) | [worker/replay_buffer](https://github.com/opendilab/DI-engine/blob/main/ding/worker/replay_buffer/advanced_buffer.py) | `rainbow demo` |
+ | 51 | [GAE](https://arxiv.org/pdf/1506.02438.pdf) | ![other](https://img.shields.io/badge/-other-lightgrey) | [rl_utils/gae](https://github.com/opendilab/DI-engine/blob/main/ding/rl_utils/gae.py) | `ppo demo` |
+ | 52 | [ST-DIM](https://arxiv.org/pdf/1906.08226.pdf) | ![other](https://img.shields.io/badge/-other-lightgrey) | [torch_utils/loss/contrastive_loss](https://github.com/opendilab/DI-engine/blob/main/ding/torch_utils/loss/contrastive_loss.py) | ding -m serial -c cartpole_dqn_stdim_config.py -s 0 |
+ | 53 | [PLR](https://arxiv.org/pdf/2010.03934.pdf) | ![other](https://img.shields.io/badge/-other-lightgrey) | [PLR doc](https://di-engine-docs.readthedocs.io/en/latest/12_policies/plr.html)<br>[data/level_replay/level_sampler](https://github.com/opendilab/DI-engine/blob/main/ding/data/level_replay/level_sampler.py) | python3 -u bigfish_plr_config.py -s 0 |
+ | 54 | [PCGrad](https://arxiv.org/pdf/2001.06782.pdf) | ![other](https://img.shields.io/badge/-other-lightgrey) | [torch_utils/optimizer_helper/PCGrad](https://github.com/opendilab/DI-engine/blob/main/ding/data/torch_utils/optimizer_helper.py) | python3 -u multi_mnist_pcgrad_main.py -s 0 |
</details>