An implementation of AlphaZero for the board game Tak. See also https://github.com/ViliamVadocz/tak
The repository contains several libraries and binaries:
- `takzero` is the main library which implements MCTS and the neural networks
- `selfplay` is used during training to generate replays and exploitation targets
- `reanalyze` computes fresh targets from old replays
- `learn` takes targets from `selfplay` and `reanalyze` to train new models
- `evaluation` pits models against each other
- `puzzle` runs the puzzle benchmark
- `analysis` includes interactive game analysis
- `graph` computes the ratio of unique states seen throughout training
- `tei` is a TEI implementation
- `eee` is a collection of binaries to run Epistemic uncertainty Estimation Experiments (EEE)
  - `generalization` trains a hash-based uncertainty estimator
  - `rnd` is the same as `generalization`, but specifically for RND
  - `seen_ratio` analyzes the ratio of seen states according to a filled hash-set
  - `ensemble` trains an ensemble network
  - `utils` contains utility functions for running experiments
  - `visualize_search` creates a visualization of the search tree used by an agent
  - `visualize_replay_buffer` creates a visualization of the overlap of different replay buffers, as well as the number of seen states at different depths
- `python` contains miscellaneous Python scripts
  - `action_space` computes the action space for different board sizes
  - `analyze_search` analyzes search data to figure out which bandit algorithm optimizes best for exploration
  - `elo` computes Bayesian Elo from match results (from `evaluation`) and creates a graph
  - `extract_from_logs` graphs various data from logs
  - `concat_out` concatenates log output
  - `generate_openings` generates random opening positions (for example to use as an opening book for a tournament)
  - `get_match_results` extracts match results from evaluation logs
  - `improved_policy` compares different improved policy formulas
  - `novelty_per_depth` plots the novelty per depth
  - `plot_eee` plots the results of EEE
  - `plot_elo_data` plots the Elo data
  - `replay_buffer_uniqueness` plots the replay buffer uniqueness
You will need the C++ PyTorch library (LibTorch). See `tch-rs` for installation instructions.
The versions listed below may no longer be available for download. In that case, try downloading the newest LibTorch release and update the `tch-rs` version in `Cargo.toml`.
You may also need to set `LIBTORCH_BYPASS_VERSION_CHECK` to `1`.
If you find a version that works, please let me know so I can add it here.
Worked:
- Stable (2.4.0), CUDA 12.4, Release
Did not work:
- TODO
Worked:
- TODO
Did not work:
- TODO
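
As a minimal sketch of the environment setup (assuming a Linux shell and that LibTorch was unzipped to `~/libtorch`; the paths are examples, adjust them for your system):

```bash
# Point tch-rs at the LibTorch installation (example path, not prescribed by this repo).
export LIBTORCH=~/libtorch
export LD_LIBRARY_PATH="$LIBTORCH/lib:$LD_LIBRARY_PATH"
# Only needed when the downloaded LibTorch is newer than what tch-rs expects.
export LIBTORCH_BYPASS_VERSION_CHECK=1
cargo build -r
```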
To generate the local novelty per depth graph, follow these steps:
- Edit `eee/src/seen_ratio.rs` with the path to a trained model, and adjust the imports based on whether it is a SimHash or LCGHash model.
- Run `cargo run -p eee -r --bin seen_ratio` for each agent.
- Take the output and place it into `python/novelty_per_depth.py`.
- Run `python python/novelty_per_depth.py` (a command sketch follows this list).
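
A minimal sketch of that loop, assuming two agents and redirecting each agent's output to a file (the file names are placeholders, not produced by the tool):

```bash
# Edit eee/src/seen_ratio.rs for the first agent (model path and hash imports), then:
cargo run -p eee -r --bin seen_ratio | tee seen_ratio_agent1.txt
# Re-edit eee/src/seen_ratio.rs for the second agent, then:
cargo run -p eee -r --bin seen_ratio | tee seen_ratio_agent2.txt
# Paste the collected output into python/novelty_per_depth.py and plot:
python python/novelty_per_depth.py
```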
To run the hash-based generalization experiment, follow these steps:
- Acquire a replay buffer by running an undirected agent. (See the Elo graph instructions.)
- Edit the import in `eee/src/generalization.rs` for the model that you want to test.
- Run `cargo run -p eee -r --bin generalization` for each agent, renaming the output file `eee_data.csv` for each.
- Edit `plot_eee.py` to plot hashes and run `python python/plot_eee.py` (see the command sketch below).
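
A sketch of the run-and-rename loop, assuming two hash models; the renamed file names are placeholders:

```bash
# Edit the import in eee/src/generalization.rs for the first model, then:
cargo run -p eee -r --bin generalization
mv eee_data.csv eee_data_simhash.csv
# Edit the import for the second model, then run and rename again:
cargo run -p eee -r --bin generalization
mv eee_data.csv eee_data_lcghash.csv
# Edit python/plot_eee.py so it plots the hash results, then:
python python/plot_eee.py
```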
To run the RND experiment, follow these steps:
- Acquire a replay buffer by running an undirected agent. (See the Elo graph instructions.)
- Run `cargo run -p eee -r --bin rnd`.
- Edit `plot_eee.py` to plot RND and run `python python/plot_eee.py` (commands sketched below).
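
The corresponding commands, using only the paths already given above:

```bash
cargo run -p eee -r --bin rnd
# Edit python/plot_eee.py so it plots the RND results, then:
python python/plot_eee.py
```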
To generate the Elo ratings for agents throughout training, follow these steps:
- Edit `selfplay/src/main.rs`, `reanalyze/src/main.rs`, and `learn/src/main.rs` for the agent and the value of beta that is desired.
- Compile using `cargo build -r -p selfplay -p reanalyze -p learn`. If exploration is desired, append `--features exploration` to the command.
- Deploy the agent on a cluster: 1 learn process, 10 selfplay processes, and 10 reanalyze processes.
- Once you have generated checkpoints for all agents, compile the evaluation using `cargo build -r -p evaluation`.
- Evaluate agents against each other by deploying evaluation processes.
- Extract the match results from the logs using `python/get_match_results.py`.
- Place the match results into `match_results/` and run `python python/elo.py` to plot the Elo.
- For an easier-to-edit plot, copy the BayesElo output from `elo.py` into `plot_elo_data.py` in the expected format. (A command sketch follows this list.)
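
A condensed sketch of the build and plotting commands (the cluster deployment itself is not shown, and how `get_match_results.py` locates the evaluation logs is an assumption; check the script):

```bash
# Build the training binaries (add --features exploration for the exploration variants):
cargo build -r -p selfplay -p reanalyze -p learn
# ...deploy 1 learn, 10 selfplay, and 10 reanalyze processes per agent on your cluster...

# Once checkpoints exist for all agents, build and deploy the evaluation:
cargo build -r -p evaluation

# After the evaluation runs finish:
python python/get_match_results.py   # extract match results from the evaluation logs
python python/elo.py                 # reads match_results/ and plots the Elo
```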
To generate the replay uniqueness graphs, follow these steps:
- Train agents using steps 1-3 from the Elo graph instructions.
- Edit `graph/main.rs` with paths to the replay files.
- Run `cargo run -r -p graph` and see the generated graph in `graph.html`.
- For an easier-to-edit plot, copy the output into `replay_buffer_uniqueness.py` and run with `python python/replay_buffer_uniqueness.py` (sketched below).
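
The corresponding commands, assuming `graph/main.rs` has already been edited with the replay-file paths:

```bash
cargo run -r -p graph
# Open graph.html in a browser to view the generated graph.
# For an easier-to-edit plot, paste the printed output into
# python/replay_buffer_uniqueness.py and run:
python python/replay_buffer_uniqueness.py
```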