@@ -45,12 +45,13 @@ Updated on 2023.05.25 DI-engine-v0.4.8
It provides **python-first** and **asynchronous-native** task and middleware abstractions, and modularly integrates several of the most important decision-making concepts: Env, Policy and Model. Based on the above mechanisms, DI-engine supports **various [deep reinforcement learning](https://di-engine-docs.readthedocs.io/en/latest/10_concepts/index.html) algorithms** with superior performance, high efficiency, well-organized [documentation](https://di-engine-docs.readthedocs.io/en/latest/) and [unittest](https://github.com/opendilab/DI-engine/actions):

- - Most basic DRL algorithms, such as DQN, PPO, SAC, R2D2, IMPALA
- - Multi-agent RL algorithms like QMIX, MAPPO, ACE
- - Imitation learning algorithms (BC/IRL/GAIL), such as GAIL, SQIL, Guided Cost Learning, Implicit BC
- - Offline RL algorithms: CQL, TD3BC, Decision Transformer
- - Model-based RL algorithms: SVG, MVE, STEVE / MBPO, DDPPO
- - Exploration algorithms like HER, RND, ICM, NGU
+ - Most basic DRL algorithms: such as DQN, Rainbow, PPO, TD3, SAC, R2D2, IMPALA
+ - Multi-agent RL algorithms: such as QMIX, WQMIX, MAPPO, HAPPO, ACE
+ - Imitation learning algorithms (BC/IRL/GAIL): such as GAIL, SQIL, Guided Cost Learning, Implicit BC
+ - Offline RL algorithms: BCQ, CQL, TD3BC, Decision Transformer, EDAC
+ - Model-based RL algorithms: SVG, STEVE, MBPO, DDPPO, DreamerV3, MuZero
+ - Exploration algorithms: HER, RND, ICM, NGU
+ - Other algorithms: such as PER, PLR, PCGrad

**DI-engine** aims to **standardize different Decision Intelligence environments and applications**, supporting both academic research and prototype applications. Various training pipelines and customized decision AI applications are also supported:
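The task-and-middleware abstraction mentioned above can be illustrated with a minimal, self-contained sketch. Note that this is plain Python written for illustration only, not DI-engine's actual API: the `Task`, `Context`, `collect` and `train` names here are hypothetical.

```python
# Toy illustration of a middleware-style training pipeline: a Task threads a
# shared context through a chain of small callables, mirroring the idea of
# composing a training loop from reusable middleware pieces.

class Context(dict):
    """Shared mutable state passed through every middleware."""

class Task:
    def __init__(self):
        self.middleware = []

    def use(self, fn):
        # Register one middleware; returns self so calls can be chained.
        self.middleware.append(fn)
        return self

    def run(self, max_step):
        ctx = Context(step=0, transitions=[], losses=[])
        for _ in range(max_step):
            for fn in self.middleware:
                fn(ctx)
            ctx["step"] += 1
        return ctx

def collect(ctx):
    # Stand-in for gathering one environment transition per iteration.
    ctx["transitions"].append(("obs", "action", 1.0))

def train(ctx):
    # Stand-in for one optimization step on the collected data.
    ctx["losses"].append(1.0 / (ctx["step"] + 1))

ctx = Task().use(collect).use(train).run(max_step=3)
print(len(ctx["transitions"]), len(ctx["losses"]))  # 3 3
```

The chaining style (`task.use(collect).use(train)`) is the design point: each stage only reads and writes the shared context, so stages can be swapped or reordered without touching the loop itself.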
@@ -64,14 +65,16 @@ It provides **python-first** and **asynchronous-native** task and middleware abs
- Real world decision AI applications
- [DI-star](https://github.com/opendilab/DI-star): Decision AI in StarCraftII
- [DI-drive](https://github.com/opendilab/DI-drive): Auto-driving platform
- - [GoBigger](https://github.com/opendilab/GoBigger): [ICLR 2023] Multi-Agent Decision Intelligence Environment
- [DI-sheep](https://github.com/opendilab/DI-sheep): Decision AI in 3 Tiles Game
- [DI-smartcross](https://github.com/opendilab/DI-smartcross): Decision AI in Traffic Light Control
- [DI-bioseq](https://github.com/opendilab/DI-bioseq): Decision AI in Biological Sequence Prediction and Searching
- [DI-1024](https://github.com/opendilab/DI-1024): Deep Reinforcement Learning + 1024 Game
- Research paper
- [InterFuser](https://github.com/opendilab/InterFuser): [CoRL 2022] Safety-Enhanced Autonomous Driving Using Interpretable Sensor Fusion Transformer
- [ACE](https://github.com/opendilab/ACE): [AAAI 2023] ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency
+ - [GoBigger](https://github.com/opendilab/GoBigger): [ICLR 2023] Multi-Agent Decision Intelligence Environment
+ - [DOS](https://github.com/opendilab/DOS): [CVPR 2023] ReasonNet: End-to-End Driving with Temporal and Global Reasoning
+ - [LightZero](https://github.com/opendilab/LightZero): A lightweight and efficient MCTS/AlphaZero/MuZero algorithm toolkit
- Docs and Tutorials
- [DI-engine-docs](https://github.com/opendilab/DI-engine-docs): Tutorials, best practices and the API reference.
- [awesome-model-based-RL](https://github.com/opendilab/awesome-model-based-RL): A curated list of awesome Model-Based RL resources
@@ -245,7 +248,7 @@ P.S: The `.py` file in `Runnable Demo` can be found in `dizoo`
| 40 | [ICM](https://arxiv.org/pdf/1705.05363.pdf) | ![exp](https://img.shields.io/badge/-exploration-orange) | [ICM doc](https://di-engine-docs.readthedocs.io/en/latest/12_policies/icm.html)<br>[ICM中文文档](https://di-engine-docs.readthedocs.io/zh_CN/latest/12_policies/icm_zh.html)<br>[reward_model/icm](https://github.com/opendilab/DI-engine/blob/main/ding/reward_model/icm_reward_model.py) | python3 -u cartpole_ppo_icm_config.py |
| 41 | [CQL](https://arxiv.org/pdf/2006.04779.pdf) | ![offline](https://img.shields.io/badge/-offlineRL-darkblue) | [CQL doc](https://di-engine-docs.readthedocs.io/en/latest/12_policies/cql.html)<br>[policy/cql](https://github.com/opendilab/DI-engine/blob/main/ding/policy/cql.py) | python3 -u d4rl_cql_main.py |
| 42 | [TD3BC](https://arxiv.org/pdf/2106.06860.pdf) | ![offline](https://img.shields.io/badge/-offlineRL-darkblue) | [TD3BC doc](https://di-engine-docs.readthedocs.io/en/latest/12_policies/td3_bc.html)<br>[policy/td3_bc](https://github.com/opendilab/DI-engine/blob/main/ding/policy/td3_bc.py) | python3 -u d4rl_td3_bc_main.py |
- | 43 | [Decision Transformer](https://arxiv.org/pdf/2106.01345.pdf) | ![offline](https://img.shields.io/badge/-offlineRL-darkblue) | [policy/dt](https://github.com/opendilab/DI-engine/blob/main/ding/policy/decision_transformer.py) | python3 -u d4rl_dt_main.py |
+ | 43 | [Decision Transformer](https://arxiv.org/pdf/2106.01345.pdf) | ![offline](https://img.shields.io/badge/-offlineRL-darkblue) | [policy/dt](https://github.com/opendilab/DI-engine/blob/main/ding/policy/dt.py) | python3 -u ding/example/dt.py |
| 44 | [EDAC](https://arxiv.org/pdf/2110.01548.pdf) | ![offline](https://img.shields.io/badge/-offlineRL-darkblue) | [EDAC doc](https://di-engine-docs.readthedocs.io/en/latest/12_policies/edac.html)<br>[policy/edac](https://github.com/opendilab/DI-engine/blob/main/ding/policy/edac.py) | python3 -u d4rl_edac_main.py |
| 45 | MBSAC([SAC](https://arxiv.org/abs/1801.01290)+[MVE](https://arxiv.org/abs/1803.00101)+[SVG](https://arxiv.org/abs/1510.09142)) | ![continuous](https://img.shields.io/badge/-continous-green) ![mbrl](https://img.shields.io/badge/-ModelBasedRL-lightblue) | [policy/mbpolicy/mbsac](https://github.com/opendilab/DI-engine/blob/main/ding/policy/mbpolicy/mbsac.py) | python3 -u pendulum_mbsac_mbpo_config.py \ python3 -u pendulum_mbsac_ddppo_config.py |
| 46 | STEVESAC([SAC](https://arxiv.org/abs/1801.01290)+[STEVE](https://arxiv.org/abs/1807.01675)+[SVG](https://arxiv.org/abs/1510.09142)) | ![continuous](https://img.shields.io/badge/-continous-green) ![mbrl](https://img.shields.io/badge/-ModelBasedRL-lightblue) | [policy/mbpolicy/mbsac](https://github.com/opendilab/DI-engine/blob/main/ding/policy/mbpolicy/mbsac.py) | python3 -u pendulum_stevesac_mbpo_config.py |
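To make one of the offline RL entries above concrete, the core regularizer of CQL (row 41) can be sketched in scalar form for a discrete action set. This is an illustrative toy, not DI-engine's `policy/cql` implementation; the function name `cql_penalty` is hypothetical.

```python
import math

def cql_penalty(q_values, data_action):
    """Scalar sketch of the CQL conservative term: logsumexp over Q(s, .)
    pushes all action values down, while subtracting Q(s, a_data) pushes
    the dataset action back up."""
    logsumexp = math.log(sum(math.exp(q) for q in q_values))
    return logsumexp - q_values[data_action]

# The penalty is always non-negative (logsumexp >= max >= any entry) and
# shrinks as the dataset action's value dominates the others.
print(round(cql_penalty([1.0, 1.0, 1.0], data_action=0), 3))  # 1.099
```

With three equal Q-values the penalty reduces to log(3), which is why the printed value is about 1.099.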
@@ -433,9 +436,9 @@ DI-engine utilizes [TreeTensor](https://github.com/opendilab/DI-treetensor) as t
- [File an issue](https://github.com/opendilab/DI-engine/issues/new/choose) on GitHub
- Open or participate in our [forum](https://github.com/opendilab/DI-engine/discussions)
- Discuss on the DI-engine [slack communication channel](https://join.slack.com/t/opendilab/shared_invite/zt-v9tmv4fp-nUBAQEH1_Kuyu_q4plBssQ)
- - Discuss on DI-engine's QQ group (700157520) or add us on WeChat
+ - Discuss on DI-engine's WeChat group (i.e. add us on WeChat: ding314assist)
- 
+ 
- Contact us by email (opendilab@pjlab.org.cn)
- Contribute to our future plan via the [Roadmap](https://github.com/opendilab/DI-engine/issues/548)