Update readme

Sopel97 · May 18, 2020 · ec9a4e1
1 parent 49a6ddf
commit ec9a4e1
Showing 1 changed file with 14 additions and 12 deletions.
diff --git a/README.md b/README.md
@@ -1,42 +1,46 @@
 # chess_pos_db
 
-chess_pos_db is a free, opensource software aiming to provide a high performance database service for aggregation of chess position data from pgn files. It provides a simple TCP interface for interprocess communication, a console interface, and an optional windows gui (see below). The goal is to achieve high query and creation performance.
+chess_pos_db is free, open-source software that aims to provide a high-performance database service for aggregating chess position data from chess games. It provides a simple TCP interface for interprocess communication, a console interface, and an optional Windows GUI. The goal is to achieve cutting-edge performance and unmatched possibilities.
 
 For a Windows GUI see [HERE](https://github.com/Sopel97/chess_pos_db_gui). It also contains setup instructions.
 
 Notable features:
 
-- Creation of a position database from pgn files. Data being agreggated:
+- 3 data formats offering different tradeoffs between space and information stored.
+
+- Creation of a position database from PGN or BCGN files. Data being aggregated (depends on the database format chosen):
 
     - Win count from a given position
     - Draw count from a given position
     - Loss count from a given position
+    - Total Elo difference for players that reach a given position (average = total / game_count)
     - Some PGN Tags (result, event, white, black, plies, ECO) of the first (depending on database format also last) game with this position.
 
 - High performance and little storage
 
-    - On modern hardware and fast storage it can process about 10 million positions per second (when creating the database) in a sequential mode. (Parallel mode coming in the future).
-    - Depending on a format used each position can require little or less than 20 bytes, all that while providing above statistics. Notably db_beta format requires \~17 bytes per position when there is \~6 billion of them.
+    - On modern hardware and fast storage it can import between 4 and 10 million positions per second, depending on the database format chosen.
+    - Each unique position entry takes between 16 and 32 bytes, depending on the format used. For a typical large dataset, only about 70% or fewer of the positions are unique.
     - Querying is optimized for a minimal number of disk seeks. For example, for the db_beta format, querying all data for a single move and all possible moves takes \~1 second on an HDD and is blazingly fast on an SSD.
-    - Index kept in RAM, uses 500 times less space than the database and accelerates the queries (size configurable).
+    - Index kept in RAM, uses orders of magnitude less space than the database and accelerates the queries (size globally configurable).
 
 - High limits
 
-    - Can handle trillions of positions (with 1 in a million chance of hash collion)
+    - Can handle trillions of positions
     - Up to 4 billion games (can be increased in the future, and some formats may work with higher numbers)
-    - No limit on input/output file sizes (can handle large pgn files)
+    - No limit on input/output file sizes (can handle large PGN files). The only limit comes from the filesystem.
 
 - Distinction between continuations (exact move played to arrive at this position) and transpositions (different move played to arrive at this position).
-- Local, file based database structure allowing for easy copying and distribution.
+- Local, file-based database structure allowing for easy archiving, copying, and distribution.
 - Extensive configuration. (see cfg/config.json)
 - Console user interface
-- A simple TCP server allowing interprocess communication.
+- A simple TCP server for managing and querying databases.
 
 Notable codebase features:
 
 - High performance streaming PGN parser with varying degree of validation
 - SAN move parser with varying degree of validation
-- Clean chess abstraction.
+- Support for BCGN - a more space-efficient alternative to PGN.
+- Clean chess abstraction. Parts can be used as a chess library.
 - External algorithms with support for async IO through worker threads.
 - Various integer compression schemes
 
@@ -55,8 +59,6 @@ Support for other compilers and other operating systems is planned but there is
 Licenses specified in header files or in respective folders.
 
 - libcppjson - included in /lib folder
-- infint - included in /lib folder
-- robin_hood - included in /lib folder (currently unused)
 - xxhash - included in /lib folder
 - googletest - vcpkg
 - brynet - vcpkg
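
Note on the figures introduced in this commit: the README now gives a formula for the average Elo difference (`average = total / game_count`) and a per-entry size range (16 to 32 bytes per unique position). The Python sketch below only illustrates that arithmetic. The class name, field names, and the assumption that `game_count` equals wins + draws + losses are hypothetical and are not taken from the chess_pos_db codebase or its TCP interface.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PositionStats:
    """Hypothetical per-position aggregate; not the project's actual layout."""
    wins: int
    draws: int
    losses: int
    total_elo_diff: int

    @property
    def game_count(self) -> int:
        # Assumption: every game reaching the position is counted exactly once
        # as a win, a draw, or a loss.
        return self.wins + self.draws + self.losses

    def average_elo_diff(self) -> Optional[float]:
        # README formula: average = total / game_count.
        # Returns None when no games reached the position.
        if self.game_count == 0:
            return None
        return self.total_elo_diff / self.game_count


def estimate_size_bytes(total_positions: int,
                        unique_fraction: float) -> Tuple[int, int]:
    """Rough (low, high) size estimate using the README's 16..32 bytes
    per unique position entry. Index and file overhead are ignored."""
    unique_positions = round(total_positions * unique_fraction)
    return unique_positions * 16, unique_positions * 32
```

For example, `PositionStats(wins=3, draws=1, losses=1, total_elo_diff=250).average_elo_diff()` divides 250 by 5 games. The size estimate is only a back-of-the-envelope bound: actual database size depends on the format chosen and on how many positions in the input are unique.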