3D connectX repository, developed for the OpenAI Gym format.
The preferred installation of 3d-connectX-env is from pip:
pip install 3d-connectX-env
import gym_3d_connectX
import gym
env = gym.make('3d-connectX-v0')
env.reset()
env.utils.win_reward = 100
env.utils.draw_penalty = 50
env.utils.lose_penalty = 100
env.utils.could_locate_reward = 10
env.utils.couldnt_locate_penalty = 10
env.utils.time_penalty = 1
env.player = 1
actions = [0, 0, 1, 1, 2, 2, 4, 4, 0, 0, 1, 1, 2, 2, 0, 3]
for action in actions:
    obs, reward, done, info = env.step(action)
    env.render(mode="plot")
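The loop above steps through every action whether or not the game has already ended. As a minimal sketch, assuming `env.step` returns the four values shown above, a small helper can stop as soon as `done` is true and sum the rewards along the way. `run_episode` is a hypothetical name used for illustration, not part of 3d-connectX-env.

```python
def run_episode(env, actions):
    """Step through `actions`, stopping as soon as the episode ends.

    Assumes `env.step` returns (obs, reward, done, info), as in the
    example above. `run_episode` is an illustrative helper, not part
    of the package.
    """
    total_reward = 0.0
    steps = 0
    info = {}
    for action in actions:
        obs, reward, done, info = env.step(action)
        total_reward += reward
        steps += 1
        if done:
            # Do not send further actions once the game is over.
            break
    return total_reward, steps, info
```

With the environment created above, this would be called as `run_episode(env, actions)` after `env.reset()`.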
The environments only send reward-able game-play frames to agents; no cut-scenes, loading screens, etc. are sent to an agent, nor can an agent perform actions during these instances.
Environment: 3d-connectX-v0
Key | Type | Description |
---|---|---|
num_grid | int | Length of one side of the grid. |
num_win_seq | int | Number of stones in sequence required to win. |
win_reward | float | Reward the agent receives when it wins the game. |
draw_penalty | float | Penalty the agent receives when the game ends in a draw. |
lose_penalty | float | Penalty the agent receives when it loses the game. |
couldnt_locate_penalty | float | Penalty the agent receives when it chooses a location where a stone cannot be placed. |
could_locate_reward | float | Additional reward the agent receives when it is able to place a stone. |
time_penalty | float | Penalty the agent receives as timesteps pass. |
first_player | int | Which player moves first. |
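How these values combine into a single step reward is not spelled out here. The following is only a sketch of one plausible scheme. It assumes that penalties are stored as positive numbers (as in the usage example) and are subtracted, and that the time penalty applies once per step. `RewardConfig` and `step_reward` are hypothetical names, and the package's actual logic may differ.

```python
from dataclasses import dataclass


@dataclass
class RewardConfig:
    # Defaults mirror the usage example; the class itself is hypothetical.
    win_reward: float = 100
    draw_penalty: float = 50
    lose_penalty: float = 100
    could_locate_reward: float = 10
    couldnt_locate_penalty: float = 10
    time_penalty: float = 1


def step_reward(cfg, placed, outcome=None):
    """Return an illustrative reward for one step.

    placed  -- True if the stone could be placed at the chosen location
    outcome -- "win", "draw", "lose", or None while the game continues
    Assumption: penalties are positive values that are subtracted.
    """
    reward = -cfg.time_penalty
    if placed:
        reward += cfg.could_locate_reward
    else:
        reward -= cfg.couldnt_locate_penalty
    if outcome == "win":
        reward += cfg.win_reward
    elif outcome == "draw":
        reward -= cfg.draw_penalty
    elif outcome == "lose":
        reward -= cfg.lose_penalty
    return reward
```

Under these assumptions, an ordinary valid move costs the time penalty but earns the placement reward, so larger `time_penalty` values push an agent toward shorter games.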
Information about the rewards and the info returned by the step method.
Key | Type | Description |
---|---|---|
turn | int | The number of the player at this step. |
winner | int | Value of the player on the winning side. |
is_couldnt_locate | bool | Whether the player chose, in this step, a location where the stone could not be placed. |
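To show how these keys might be read after a call to `step`, here is a small sketch that turns an info dict into a one-line summary. It rests on several assumptions: that `is_couldnt_locate` flags an invalid placement (as the key name suggests), that `turn` is the player who just acted, and that `winner` is missing or `None` while the game is undecided. `describe_step` is a hypothetical helper, not part of the package.

```python
def describe_step(info):
    """Summarize an info dict with the keys listed in the table above.

    Assumptions (not confirmed by the source): `is_couldnt_locate` is
    True for an invalid placement, and `winner` is absent or None
    until the game has been decided.
    """
    if info.get("is_couldnt_locate"):
        return f"player {info['turn']} chose a location where no stone can be placed"
    winner = info.get("winner")
    if winner is not None:
        return f"player {winner} won"
    return f"player {info['turn']} placed a stone"
```

It would be called on the fourth value returned by `env.step(action)`, for example `print(describe_step(info))` inside the loop.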