WinTacToe is a system for training reinforcement learning agents to play tic-tac-toe-related games. It consists of the following elements:
- Multiple, diverse game environments
- RL agents of various architectures and levels of advancement
- Human player interface which allow to play game with the chosen agent(s)
Futhermore WinTacToe allows to analyse agents
# Engine
from environments.tic_tac_toe.tic_tac_toe_engine import TicTacToeEngine
#Agents
from reinforcement_learning.agents.base_agent import BaseAgent
from reinforcement_learning.agents.n_step_agent.n_step_agent import NStepAgent
from reinforcement_learning.agents.random_agent.random_agent import RandomAgent
from reinforcement_learning.agents.dqn_agent.dqn_agent import DQNAgent
# Agents building blocks
from reinforcement_learning.agents.common_building_blocks.epsilon_strategy import ConstantEpsilonStrategy, CircleEpsilonStrategy, DecayingSinusEpsilonStrategy
# Training
from reinforcement_learning.simple_training import SimpleTraining
# Agents Database
from reinforcement_learning.agents_database.agents_db import AgentsDB
# To avoid warnings
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
To start a training you need an engine:
engine = TicTacToeEngine(2, 3, 3)
...and agents, so you can create them:
agents = [NStepAgent(n=5,
step_size=0.1,
epsilon_strategy=CircleEpsilonStrategy(starting_epsilon_value=0.1, exploration_part=0.7),
discount=1),
DQNAgent(step_size=0.01,
discount=1,
epsilon_strategy=DecayingSinusEpsilonStrategy(starting_epsilon_value=0.1, exploration_part=0.7),
fit_period=64,
batch_size=64,
max_memory_size=64)]
Agents can be manually saved to files:
agent_0_file_path = os.path.join(ABS_PROJECT_ROOT_PATH, "test_reinforcement_learning", "common_building_blocks", "trained_agents", "agent0.rl_agent")
agent_1_file_path = os.path.join(ABS_PROJECT_ROOT_PATH, "test_reinforcement_learning", "common_building_blocks", "trained_agents", "agent1.rl_agent")
agents_file_paths = [agent_0_file_path, agent_1_file_path]
[agent.save(agent_file_path) for (agent, agent_file_path) in zip(agents, agents_file_paths)]
...and can be loaded from files:
agents = [BaseAgent.load(agent_file_path) for agent_file_path in agents_file_paths]
You can also save them to the agents database:
Firstly call AgentsDB.setup() if you want to work on the new agents database
AgentsDB.setup()
Secondly save your agents in database
[AgentsDB.save(agent=agents[i],
player=i,
board_size=3,
marks_required=3,
description="Non-trained agent") for i in range(len(agents))]
Here is how agents can be loaded from agents database: AgentsDB.load method can take subset of these parameters (no parameters is also allowed):
- id : int
- class_name : str
- player : int
- board_size : int
- marks_required : int
- description : str
- agent : BaseAgent
- savetime : Datetime
- And it returns a list of agents objects that satisfy given parameters
agent0 = AgentsDB.load(class_name="NStepAgent", player=0)[0]
agent1 = AgentsDB.load(class_name="DQNAgent", player=1)[0]
agents = [agent0, agent1]
Database can be reset with AgentsDB.reset()
AgentsDB.reset()
For more sophisticated use of agents database you can use AgentsDB.command method
AgentsDB.command("SELECT * from AGENTS")
You can also connect any other db client (e.g. DataGrip) to the agents database
Training is as simple as this:
episodes_no = 1000
with SimpleTraining(engine=engine, agents=agents) as st:
# assignment is necessary, because training doesn't modify agents provided in constructor
agents = st.train(episodes_no=episodes_no,
auto_saving=100,
saving_description=["Trained agent", "Trained agent"])
SimpleTraining train method parameters description
- episodes_no: number of episodes to play
- auto_saving:
if None or False-> nothing is saved:
- if True -> All agents are saved at the and of the training
- if Integer -> All agents are saved every auto_saving episodes
- saving_description:
- if None -> All agents are saved with the same standard description: "{episodes_no} episodes"
- if string -> This string is used as description for saving for all agents
- if list(string) -> Each agent is saved with his own description from list
Agents can be visualized
[agent.visualize() for agent in agents]
Make sure that you have following configuration of traing platform config.ini:
[TRAINING PLATFORM PARAMETERS]
actorsystembase = simpleSystemBase
logging = 0
For now, an opponent is loaded as last agent that satisfy mark, board size and marks required.
If you want load specific agent from database in TicTacToeComponent in # Oponnent joing
section change
matching_agents = AgentsDB.load(player=self._opponent_mark,
board_size=self._board_size,
marks_required=self.marks_required) # List of all agents that satisfy criteria
# TODO: Make convenient select agent function (maybe in GUI)
# For now it's just last agent
def agent_select(agents):
if not agents:
raise ValueError("There are now agents satisfying given criteria in the Agents Database")
return agents[-1]
agent = agent_select(matching_agents)
to
matching_agents = AgentsDB.load(arbitral_subset_of_criterias)
def agent_select(agents):
if not agents:
raise ValueError("There are now agents satisfying given criteria in the Agents Database")
# Some your agent choosing implementation
agent = agent_select(matching_agents)
or if you whant use more sophisticated sql query
rows = AgentsDB.command(sql_query_string)
def some_rows_postprocessing(rows):
# Some your implementation of rows postprocessing
agent = some_rows_postprocessing(rows)
Another way is to load agent from file:
agent = BaseAgent.load(agent_file_path)
To start a game run launch_application.py typing in console started in project root:
python game_app/launch_application.py