I used two benchmarks to evaluate Egaroucid. The first is the FFO endgame test suite, which measures the speed of complete endgame search. The second is a set of matches against older versions of Egaroucid and against Edax 4.4. To test the strength of the evaluation function itself, these matches use no opening book and start from XOT positions.
The endgame search is evaluated on three features: the search time, the number of nodes visited, and the number of nodes visited per unit time.
The most important feature for users is the search time. It is reported as the actual time, in seconds, taken to solve problems #40 to #59 of the FFO endgame test suite. Lower is better.
The search time can be shortened in two ways: by decreasing the number of nodes visited, and by increasing the number of nodes visited per unit time.
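The relation between these quantities can be sketched in a few lines of Python. This is only an illustration: the function names `nodes_per_second` and `search_time` are hypothetical, and the figures are made up, not measured results from Egaroucid.

```python
def nodes_per_second(nodes: int, seconds: float) -> float:
    """Search speed: nodes visited per second of search."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return nodes / seconds


def search_time(nodes: int, nps: float) -> float:
    """Search time in seconds, given a node count and a search speed."""
    if nps <= 0:
        raise ValueError("nps must be positive")
    return nodes / nps


# Hypothetical figures for illustration only (not benchmark results).
baseline = search_time(nodes=1_000_000_000, nps=50_000_000)

# Option 1: visit fewer nodes (for example, through better pruning).
fewer_nodes = search_time(nodes=500_000_000, nps=50_000_000)

# Option 2: visit nodes faster (for example, through faster move generation).
faster_search = search_time(nodes=1_000_000_000, nps=100_000_000)

print(baseline, fewer_nodes, faster_search)
```

In this made-up case, halving the node count and doubling the speed each halve the search time, which is why both are tracked alongside the time itself.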
The graphs of the FFO endgame test suite results were measured on a Core i9 13900K.
The best way to evaluate the strength of an Othello AI is to play it against other engines. The results of matches between each version of Egaroucid and Edax 4.4 are below.
To avoid repeated lines of play, I used XOT for the starting positions. Each match is played at level 1 (lookahead depth 1 in the midgame, 2 in the endgame).
Name | Winning Rate |
---|---|
7.0.0 | 0.5643 |
6.5.0 | 0.5648 |
6.4.0 | 0.4980 |
6.3.0 | 0.4598 |
6.1.0 | 0.5113 |
6.0.0 | 0.4592 |
Edax | 0.4425 |
A more detailed log is available here.
Egaroucid 6.2.0 is omitted because it has the same evaluation function as 6.3.0.
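As a rough illustration of how a winning rate like those in the table can be computed from game results, here is a minimal Python sketch. It assumes that a draw counts as half a win, which is a common convention but is not stated on this page, and the function name `winning_rate` and the sample counts are hypothetical.

```python
def winning_rate(wins: int, draws: int, losses: int) -> float:
    """Winning rate in [0, 1], counting each draw as half a win (assumption)."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("at least one game is required")
    return (wins + 0.5 * draws) / games


# Hypothetical results for illustration only (not taken from the log).
rate = winning_rate(wins=5, draws=2, losses=3)
print(f"{rate:.4f}")
```

Under this convention, a rate above 0.5 means the engine scored better than its opponents overall.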
Detailed benchmarks are available for each version, including older versions.
Version | Date |
---|---|
7.0.0 | 2024/04/17 |
6.5.0 | 2023/10/25 |
6.4.0 | 2023/09/01 |
6.3.0 | 2023/07/09 |
6.2.0 | 2023/03/15 |
6.1.0 | 2022/12/23 |
6.0.0 | 2022/10/10 |
5.10.0 | 2022/06/08 |
5.9.0 | 2022/06/07 |
5.8.0 | 2022/05/13 |
5.7.0 | 2022/03/26 |
5.5.0/5.6.0 | 2022/03/16 |
5.4.1 | 2022/03/02 |
The Technology Explanation is written only in Japanese. Please use a translation tool if needed.
A huge dataset of games played by Egaroucid is available. Please see the Download Transcript page.