Update readme
Sopel97 committed May 18, 2020
1 parent 49a6ddf commit ec9a4e1
Showing 1 changed file with 14 additions and 12 deletions.
26 changes: 14 additions & 12 deletions README.md
@@ -1,42 +1,46 @@
# chess_pos_db

chess_pos_db is free, open-source software that aims to provide a high-performance database service for aggregating chess position data from PGN files. It provides a simple TCP interface for interprocess communication, a console interface, and an optional Windows GUI (see below). The goal is to achieve high query and creation performance.
chess_pos_db is free, open-source software that aims to provide a high-performance database service for aggregating chess position data from chess games. It provides a simple TCP interface for interprocess communication, a console interface, and an optional Windows GUI. The goal is to achieve cutting-edge performance and unmatched possibilities.

For a Windows GUI see [HERE](https://github.com/Sopel97/chess_pos_db_gui). It also contains setup instructions.

Notable features:

- Creation of a position database from PGN files. Data being aggregated:
- Three data formats offering different tradeoffs between space and information stored.

- Creation of a position database from PGN or BCGN files. Data being aggregated (depends on the database format chosen):

- Win count from a given position
- Draw count from a given position
- Loss count from a given position
- Total Elo difference for players that reach a given position (average = total / game_count)
- Some PGN tags (result, event, white, black, plies, ECO) of the first game (and, depending on the database format, also the last game) with this position.

- High performance and little storage

- On modern hardware and fast storage it can process about 10 million positions per second (when creating the database) in sequential mode. (Parallel mode coming in the future.)
- Depending on the format used, each position can require 20 bytes or less while still providing the statistics above. Notably, the db_beta format requires \~17 bytes per position when there are \~6 billion of them.
- On modern hardware and fast storage it can import between 4 and 10 million positions per second, depending on the database format chosen.
- Each unique position entry takes between 16 and 32 bytes, depending on the format used. For a typical large dataset, only about 70% or fewer of the positions are unique.
- Querying is optimized for a minimal number of disk seeks. For example, for the db_beta format, querying all data for a single move and all possible moves takes \~1 second on an HDD and is blazingly fast on an SSD.
- Index kept in RAM, uses 500 times less space than the database and accelerates the queries (size configurable).
- Index kept in RAM, uses orders of magnitude less space than the database and accelerates the queries (size globally configurable).

- High limits

- Can handle trillions of positions (with a 1 in a million chance of hash collision)
- Can handle trillions of positions
- Up to 4 billion games (can be increased in the future, and some formats may work with higher numbers)
- No limit on input/output file sizes (can handle large pgn files)
- No limit on input/output file sizes (can handle large PGN files). The only limit is imposed by the filesystem.

- Distinction between continuations (exact move played to arrive at this position) and transpositions (different move played to arrive at this position).
- Local, file-based database structure allowing for easy copying and distribution.
- Local, file-based database structure allowing for easy archiving, copying, and distribution.
- Extensive configuration (see cfg/config.json).
- Console user interface
- A simple TCP server allowing interprocess communication.
- A simple TCP server for managing and querying databases.
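
The per-position statistics and the storage figures listed above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `PositionStats`, `SizeEstimate`, and `estimateDatabaseSize` are hypothetical names, not the actual chess_pos_db types, and `game_count` is assumed to be wins + draws + losses. The size estimate applies the "about 70% unique" and "16 to 32 bytes per entry" figures directly, so it is a rough range rather than a measurement.

```cpp
#include <cstdint>

// Hypothetical per-position aggregate. Field names are illustrative and do
// not mirror the actual chess_pos_db entry layout.
struct PositionStats {
    std::uint64_t wins = 0;
    std::uint64_t draws = 0;
    std::uint64_t losses = 0;
    std::int64_t totalEloDiff = 0;  // sum of Elo differences over all games

    // Assumption: every game ends in a win, draw, or loss.
    std::uint64_t gameCount() const { return wins + draws + losses; }

    // average = total / game_count; returns 0.0 when no games were recorded.
    double averageEloDiff() const {
        const std::uint64_t n = gameCount();
        return n == 0 ? 0.0
                      : static_cast<double>(totalEloDiff) / static_cast<double>(n);
    }
};

// Hypothetical result type: a byte range for the stored position entries.
struct SizeEstimate {
    std::uint64_t minBytes;
    std::uint64_t maxBytes;
};

// Rough estimate: assume 70% of positions are unique and each unique entry
// takes 16 to 32 bytes. Index size and other overhead are ignored.
// The multiplication is safe for inputs far beyond trillions of positions.
inline SizeEstimate estimateDatabaseSize(std::uint64_t totalPositions) {
    const std::uint64_t uniquePositions = totalPositions * 7 / 10;
    return SizeEstimate{uniquePositions * 16, uniquePositions * 32};
}
```

For example, under these assumptions 6 billion positions give about 4.2 billion unique entries, or roughly 67 to 134 GB depending on the format.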

Notable codebase features:

- High-performance streaming PGN parser with varying degrees of validation
- SAN move parser with varying degrees of validation
- Clean chess abstraction.
- Support for BCGN, a more space-efficient alternative to PGN.
- Clean chess abstraction. Parts can be used as a chess library.
- External algorithms with support for async IO through worker threads.
- Various integer compression schemes
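
The README does not say which integer compression schemes are implemented. As a generic illustration of the idea only, the sketch below shows a variable-length byte encoding (LEB128-style), where small values take fewer bytes. The function names `encodeVarint` and `decodeVarint` are hypothetical and are not taken from the codebase.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends `value` to `out` using 7 data bits per byte. The high bit of a
// byte is set when more bytes follow.
inline void encodeVarint(std::uint64_t value, std::vector<std::uint8_t>& out) {
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
}

// Reads one value starting at `pos` and advances `pos` past it.
// Assumes well-formed input: no bounds or overlong-encoding checks.
inline std::uint64_t decodeVarint(const std::vector<std::uint8_t>& in,
                                  std::size_t& pos) {
    std::uint64_t value = 0;
    int shift = 0;
    while (true) {
        const std::uint8_t byte = in[pos++];
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {
            break;
        }
        shift += 7;
    }
    return value;
}
```

With this encoding, values below 128 occupy a single byte, while a full 64-bit value takes ten bytes, which pays off when most stored integers are small.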

Expand All @@ -55,8 +59,6 @@ Support for other compilers and other operating systems is planned but there is
Licenses specified in header files or in respective folders.

- libcppjson - included in /lib folder
- infint - included in /lib folder
- robin_hood - included in /lib folder (currently unused)
- xxhash - included in /lib folder
- googletest - vcpkg
- brynet - vcpkg