The paper and posters (German) are available on Google Drive.
AlphaBing is a lightweight Chinese chess (xiangqi) engine implementing modified AlphaZero concepts that allow the algorithm to run on limited hardware. The user can challenge the AI through a minimalistic, user-friendly, and intuitive UI.
The project's goal is to make the Alpha(Go)Zero algorithm more accessible to developers. The downscaled, highly optimized algorithm's full functionality and efficiency on consumer hardware are demonstrated in the domain of xiangqi.
At its core, AlphaBing combines traditional AI methods (such as optimized alpha-beta search) with innovative reinforcement-learning concepts, creating an agent in which the strengths of each method compensate for the weaknesses of the other.
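One way such a combination can work is to let a learned value network refine the leaf evaluation of a classical alpha-beta search. The sketch below is purely illustrative: the function names (`blended_eval`, `alpha_beta`, etc.) are hypothetical and do not reflect AlphaBing's actual API.

```python
# Illustrative sketch of blending a hand-crafted evaluation with a learned
# value network at the leaves of an alpha-beta search. All names here are
# hypothetical; AlphaBing's actual classes and functions may differ.

def blended_eval(position, classical_eval, value_net, weight=0.5):
    """Mix a fast hand-crafted score with a learned value estimate."""
    return (1 - weight) * classical_eval(position) + weight * value_net(position)

def alpha_beta(position, depth, alpha, beta, evaluate, moves_fn, apply_fn):
    """Plain negamax alpha-beta; `evaluate` may be a blended evaluator."""
    moves = moves_fn(position)
    if depth == 0 or not moves:
        return evaluate(position)
    for move in moves:
        score = -alpha_beta(apply_fn(position, move), depth - 1,
                            -beta, -alpha, evaluate, moves_fn, apply_fn)
        if score >= beta:
            return beta          # fail-hard beta cutoff
        alpha = max(alpha, score)
    return alpha
```

The search itself stays unchanged; only the evaluator at the leaves is swapped, which is what lets the two paradigms be mixed without rewriting the search code.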
Its development is motivated by the inaccessibility of AlphaZero's codebase to the community and the prohibitive resources required just to run the system. AlphaBing runs smoothly on a single device and plays at adjustable skill levels.
For avid researchers, this repo includes a number of matplotlib-based visualization scripts.
git clone https://github.com/Simuschlatz/AlphaBing
cd AlphaBing
pip install -r requirements.txt
Mac, Windows & Linux
conda env create -f environment.yml
Apple Silicon
conda env create -f metal.yml
Activate the environment
conda activate cheapchess
python3 main.py --pipeline --nui
Open another terminal, then run:
cd [path to this directory]
conda activate cheapchess
tensorboard --logdir core/engine/ai/selfplay_rl/checkpoints/logs
python3 main.py
usage: main.py [-h] [--chinese] [--perft] [--pipeline] [--eval] [--nui] [--black] [--second] [{ab,az,abz}] [cores] [time]
positional arguments:
{ab,az,abz} AI-agent playing in interactive environment (ab: Alpha-Beta, az: AlphaZero, abz: Alpha-Beta-Zero) (default: ab)
cores maximum number of processors to use for pipeline (default: multiprocessing.cpu_count())
time time on the clock in minutes (default: 5)
options:
-h, --help show this help message and exit
--chinese rendering chinese style UI
--perft run performance tests for move generation speed and accuracy
--pipeline run the self-play and training pipeline (to evaluate, see --eval)
--eval add evaluation to the pipeline
--nui no UI
--black play black
--second move second
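The interface above can be mirrored with a small `argparse` sketch. This is an illustration of the documented flags and defaults, not the engine's actual parser; the default for `cores` follows the help text's `multiprocessing.cpu_count()`.

```python
import argparse
import multiprocessing

# Minimal argparse sketch mirroring the documented CLI (illustrative only).
def build_parser():
    p = argparse.ArgumentParser(prog="main.py")
    p.add_argument("agent", nargs="?", choices=["ab", "az", "abz"], default="ab",
                   help="AI agent (ab: Alpha-Beta, az: AlphaZero, abz: Alpha-Beta-Zero)")
    p.add_argument("cores", nargs="?", type=int, default=multiprocessing.cpu_count(),
                   help="maximum number of processors to use for the pipeline")
    p.add_argument("time", nargs="?", type=int, default=5,
                   help="time on the clock in minutes")
    p.add_argument("--chinese", action="store_true", help="render Chinese-style UI")
    p.add_argument("--perft", action="store_true", help="run move-generation tests")
    p.add_argument("--pipeline", action="store_true", help="run self-play + training")
    p.add_argument("--eval", action="store_true", help="add evaluation to the pipeline")
    p.add_argument("--nui", action="store_true", help="no UI")
    p.add_argument("--black", action="store_true", help="play black")
    p.add_argument("--second", action="store_true", help="move second")
    return p

# Example: play the hybrid agent with the Chinese-style UI disabled pipeline
args = build_parser().parse_args(["abz", "--chinese", "--nui"])
```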
├── LICENSE
├── README.md
├── assets
├── core
│ ├── checkpoints
│ │ └── examples
│ ├── engine
│ │ ├── AI
│ │ │ ├── ABMM
│ │ │ │ ├── AI_diagnostics.py
│ │ │ │ ├── __init__.py
│ │ │ │ ├── agent.py
│ │ │ │ ├── eval_utility.py
│ │ │ │ ├── move_ordering.py
│ │ │ │ ├── piece_square_tables.py
│ │ │ │ ├── search.py
│ │ │ │ └── transposition_table.py
│ │ │ ├── AlphaZero
│ │ │ │ ├── MCTS.py
│ │ │ │ ├── __init__.py
│ │ │ │ ├── agent.py
│ │ │ │ ├── checkpoints
│ │ │ │ │ ├── checkpoint_new.h5
│ │ │ │ │ ├── examples
│ │ │ │ │ └── logs
│ │ │ │ ├── config.py
│ │ │ │ ├── model.py
│ │ │ │ ├── nnet.py
│ │ │ │ └── selfplay.py
│ │ │ ├── EvaluateAgent
│ │ │ │ ├── __init__.py
│ │ │ │ └── evaluate.py
│ │ │ ├── SLEF
│ │ │ │ ├── README.md
│ │ │ │ ├── __init__.py
│ │ │ │ ├── eval_data_black.csv
│ │ │ │ ├── eval_data_collection.py
│ │ │ │ └── eval_data_red.csv
│ │ │ ├── agent_interface.py
│ │ │ └── mixed_agent.py
│ │ ├── UI.py
│ │ ├── __init__.py
│ │ ├── board.py
│ │ ├── clock.py
│ │ ├── config.py
│ │ ├── data_init.py
│ │ ├── fast_move_gen.py
│ │ ├── game_manager.py
│ │ ├── move_generator.py
│ │ ├── piece.py
│ │ ├── precomputed_move_data.py
│ │ ├── test.py
│ │ ├── tt_entry.py
│ │ ├── verbal_command_handler.py
│ │ └── zobrist_hashing.py
│ └── utils
│ ├── __init__.py
│ ├── board_utils.py
│ ├── claim_copyright.py
│ ├── modify_pst.py
│ ├── perft_utility.py
│ ├── select_agent.py
│ └── timer.py
├── environment.yml
├── main.py
├── metal.yml
└── requirements.txt
- Move generation
- A novel optimization of Zobrist Hashing
- FEN utility
- Bitboard representation
- UI / UX - pygame, provisional; drag & drop, sound effects, move highlighting, etc.
- Piece-square-table implementation
- Minimax-Search with Alpha-Beta-Pruning
- Move ordering
- Multiprocessing
- Transposition Tables
- Iterative Deepening
- Deep Convolutional ResNet Architecture
- Fast MCTS
- Self-Play policy iteration and Q-Learning
- Training Pipeline
- Evaluation - Elo & win-rate diagnostics
- Parallelism with TensorFlow sessions - parallelized pipeline
- Train the agent on server (in progress)
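For context on the Zobrist hashing item above: the textbook scheme assigns a random 64-bit key to every (piece, square) pair and XORs together the keys of the pieces on the board, so a move updates the hash incrementally in O(1) instead of rehashing the whole position. The sketch below shows this generic technique for a 10x9 xiangqi board; it is not AlphaBing's specific optimization, and all names are illustrative.

```python
import random

# Generic Zobrist hashing sketch for a 10x9 xiangqi board (illustrative;
# not AlphaBing's specific optimization). Each (piece, square) pair gets a
# random 64-bit key; a position's hash is the XOR of its pieces' keys.
random.seed(42)            # deterministic keys for reproducibility
NUM_SQUARES = 90           # xiangqi board: 10 ranks x 9 files
PIECE_TYPES = 14           # 7 piece types x 2 colors
ZOBRIST = [[random.getrandbits(64) for _ in range(NUM_SQUARES)]
           for _ in range(PIECE_TYPES)]

def full_hash(board):
    """board: dict mapping square index -> piece index."""
    h = 0
    for square, piece in board.items():
        h ^= ZOBRIST[piece][square]
    return h

def update_hash(h, piece, from_sq, to_sq, captured=None):
    """Incremental O(1) update: XOR out the moving piece's old square,
    XOR in its new square, and XOR out any captured piece."""
    h ^= ZOBRIST[piece][from_sq]       # remove piece from origin
    h ^= ZOBRIST[piece][to_sq]         # place piece on destination
    if captured is not None:
        h ^= ZOBRIST[captured][to_sq]  # remove captured piece
    return h
```

Because XOR is its own inverse, the incremental update after a move yields exactly the same hash as recomputing from scratch, which is what makes Zobrist keys the standard index for transposition tables.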