
Egaroucid Free Training Data

Abstract

This page provides sets of training data for Othello AI, generated with Egaroucid.

The datasets are large, and you can use them when creating your own Othello AI.

Website: https://www.egaroucid.nyanyan.dev/en/

GitHub Repository: https://github.com/Nyanyan/Egaroucid

Author: Takuto Yamana ( https://nyanyan.dev/en/ )

Terms of Service

Citation

We welcome the use of this training data in papers and other publications.

When citing this data, please use the following format as a reference, or adjust it to fit the citation format of your publication.

Yamana, Takuto.: Egaroucid Free Training Data, https://www.egaroucid.nyanyan.dev/en/technology/train-data/

Battle Transcript by Egaroucid 7.8.0 lv.11 and Edax 4.5.5 lv.11

Please download and unzip Egaroucid_Train_Data_v0002_0.zip and Egaroucid_Train_Data_v0002_1.zip.

Each folder contains text files named `XXXXXXX.txt`. These files contain Othello game transcripts in `f5d6` format. Each text file contains 10,000 games.

The folder names are numeric. All transcripts within a folder represent games where the first $N$ moves (indicated by the folder name) were played randomly, followed by a match between Othello AIs. This approach was used to ensure a diverse set of match outcomes. This dataset includes 1 million game transcripts for each starting condition, ranging from 8 random opening moves to 59 random opening moves (totaling 52 million games). It is recommended to exclude the positions occurring during the initial random moves from your training data.

The matches were played between Egaroucid for Console 7.8.0 level 11 and Edax 4.5.5 level 11. In each transcript, one side is always played by Egaroucid for Console and the other by Edax, with both AIs playing as black 50% of the time.

Due to the large number of games, the zip file has been split into two. One contains game records with 8 to 33 moves played randomly, and the other contains game records with 34 to 59 moves played randomly.
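To follow the recommendation of skipping positions from the random opening, a transcript has to be replayed move by move. The following is a minimal sketch of such a replay, not the tooling used to build the dataset. It assumes that transcripts are lowercase coordinate pairs such as `f5d6`, that black moves first, and that passes are not written in the transcript (so a move that is illegal for the side to move is treated as following a pass). The function names and the 64-character output encoding (`X` for the side to move, `O` for the opponent, `-` for empty, ordered a1 to h8) are illustrative choices.

```python
DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def parse_moves(transcript):
    """Split a transcript such as 'f5d6' into 0-indexed (row, col) pairs."""
    text = transcript.strip().lower()
    return [(int(text[i + 1]) - 1, ord(text[i]) - ord("a"))
            for i in range(0, len(text), 2)]


def initial_board():
    """Standard Othello start: white on d4/e5, black on e4/d5."""
    board = [["-"] * 8 for _ in range(8)]
    board[3][3] = board[4][4] = "W"
    board[3][4] = board[4][3] = "B"
    return board


def flipped_discs(board, row, col, player):
    """Return the opponent discs that `player` would flip by moving at (row, col)."""
    if board[row][col] != "-":
        return []
    opponent = "W" if player == "B" else "B"
    flipped = []
    for dr, dc in DIRECTIONS:
        line, r, c = [], row + dr, col + dc
        while 0 <= r < 8 and 0 <= c < 8 and board[r][c] == opponent:
            line.append((r, c))
            r, c = r + dr, c + dc
        if line and 0 <= r < 8 and 0 <= c < 8 and board[r][c] == player:
            flipped.extend(line)
    return flipped


def encode(board, player):
    """64 characters from a1 to h8: X = side to move, O = opponent, - = empty."""
    symbol = {"-": "-", player: "X", ("W" if player == "B" else "B"): "O"}
    return "".join(symbol[cell] for row in board for cell in row)


def positions_after_opening(transcript, n_random):
    """Replay a game and return the positions reached after the first n_random moves."""
    board, player, positions = initial_board(), "B", []
    for ply, (row, col) in enumerate(parse_moves(transcript)):
        flipped = flipped_discs(board, row, col, player)
        if not flipped:  # assumed: an unrecorded pass, so the other side moves
            player = "W" if player == "B" else "B"
            flipped = flipped_discs(board, row, col, player)
            if not flipped:
                raise ValueError(f"illegal move at ply {ply}")
        if ply >= n_random:
            positions.append(encode(board, player))
        for r, c in flipped + [(row, col)]:
            board[r][c] = player
        player = "W" if player == "B" else "B"
    return positions
```

In practice `n_random` would come from the folder a transcript was read from, assuming the folder name parses directly as the number of random moves. The final position of each game is not returned, since no move is made from it.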

Training Data by Egaroucid 7.4.0 lv.17 & 7.5.1 lv.17

Please download Egaroucid_Train_Data.zip and unzip it.

Each directory contains text files named XXXXXXX.txt, and each file holds 1 million board-score pairs.

Each text file has 1 million lines, one pair per line, in the following form:

-XO-OOXOOXX-OXOO-XXOXXOOX-OXOOXOOXOOOXXXO-XOOOXXO-O-OO---OOOX-O- 4

The first 64 characters represent the board. The characters are ordered a1, b1, c1, ..., a2, b2, c2, ..., h8. X represents a disc of the player to move, O represents a disc of the opponent, and - represents an empty square.

The board string is followed by a single space and a number. This number is the evaluation value of the position from the perspective of the player to move (the estimated final disc difference).
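As an illustration of the line format, here is a minimal parsing sketch. The function name `parse_record` and the choice to return lists of square names are assumptions made for this example. The sketch only relies on the layout described above: 64 board characters, a space, then an integer.

```python
def parse_record(line):
    """Parse one 'board score' line into (player_squares, opponent_squares, score)."""
    board, score = line.split()
    if len(board) != 64 or set(board) - set("XO-"):
        raise ValueError("expected 64 characters drawn from X, O and -")
    # Index 0 is a1, index 1 is b1, ..., index 63 is h8.
    names = [f"{'abcdefgh'[i % 8]}{i // 8 + 1}" for i in range(64)]
    player = [name for name, ch in zip(names, board) if ch == "X"]
    opponent = [name for name, ch in zip(names, board) if ch == "O"]
    return player, opponent, int(score)


sample = "-XO-OOXOOXX-OXOO-XXOXXOOX-OXOOXOOXOOOXXXO-XOOOXXO-O-OO---OOOX-O- 4"
player, opponent, score = parse_record(sample)
print(len(player) + len(opponent), "discs, score", score)
```

The total number of discs, `len(player) + len(opponent)`, is the quantity by which the positions are grouped in the table of position counts.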

The total number of discs on the board and the number of positions included are as follows:

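| Total # Discs | # Data |
|---|---|
| 4 | 1 |
| 5 | 1 |
| 6 | 3 |
| 7 | 14 |
| 8 | 60 |
| 9 | 322 |
| 10 | 1773 |
| 11 | 10649 |
| 12 | 67245 |
| 13 | 434029 |
| 14 to 63 | 500000 each |
| Total | 25514097 |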

Data for the first 11 moves (positions with 15 or fewer discs on the board) was generated using Egaroucid for Console 7.4.0 level 17. All lines of play up to move 11 were enumerated, Egaroucid calculated an evaluation value for each resulting position, and the values were propagated back through the tree by negamax.

Data from the 12th move onwards (positions with 16 or more discs on the board) was generated from self-play games of Egaroucid for Console 7.5.1 level 17. The score associated with each position is the final score of the self-play game it came from. To vary the results, the first $N$ moves ($7 \leq N \leq 59$) of each game were played randomly. Positions before move $N$ are not included, because the random play produces bad moves and the final score can be far from the true score of such a position. The published positions were recorded from these games, with priority given to positions immediately after the random opening.
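The negamax step can be sketched on a toy tree. This is a generic illustration, not Egaroucid's code. The callbacks `children` and `evaluate` are hypothetical, leaf values are taken from the perspective of the player to move at that leaf, and Othello passes are ignored for brevity.

```python
def negamax(position, children, evaluate):
    """Value of `position` for the player to move, backed up from leaf evaluations."""
    successors = children(position)
    if not successors:
        # Leaf of the enumerated tree: use the engine's evaluation directly.
        return evaluate(position)
    # A successor's value is from the opponent's perspective, so negate it.
    return max(-negamax(child, children, evaluate) for child in successors)


# Toy tree: the root has two moves, leading to "a" and "b".
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1"]}
leaf_values = {"a1": 2, "a2": -4, "b1": 6}

value = negamax("root", lambda p: tree.get(p, []), lambda p: leaf_values[p])
print(value)  # 6
```

Here "a" is worth max(-2, 4) = 4 for its mover and "b" is worth -6, so the root is worth max(-4, 6) = 6.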

Released: 2025/02/02

Self-play Transcripts by Egaroucid 6.3.0 lv.11

Please download Egaroucid_Transcript.zip and unzip it.

Each directory contains text files named XXXXXXX.txt, and each file holds 10 thousand transcripts in f5d6 format.

Since this data was generated for training Egaroucid's evaluation function, the first $N$ moves are played randomly. This number $N$ is determined by the following method.

1. Set constants $N_{min},N_{max}$

2. In each game, determine $N$ randomly so that $N_{min}\leq N \leq N_{max}$

3. Play the first $N$ moves randomly, then start self-play
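The three steps above can be sketched as follows. This is a hedged illustration only. The source does not state how $N$ is distributed, so a uniform draw is assumed here, and `legal_moves` and `apply_move` are hypothetical callbacks standing in for an Othello move generator.

```python
import random


def play_random_opening(n_min, n_max, start, legal_moves, apply_move, rng=random):
    """Draw N in [n_min, n_max] and play N random moves from `start`."""
    n = rng.randint(n_min, n_max)  # assumption: uniform over the closed range
    position = start
    for _ in range(n):
        moves = legal_moves(position)
        if not moves:  # no move available; stop early in this simplified sketch
            break
        position = apply_move(position, rng.choice(moves))
    return n, position  # self-play would continue from `position`


# Toy stand-in for a game: a position is a move counter, and each move adds 1.
n, position = play_random_opening(
    10, 19, 0, lambda p: [1], lambda p, m: p + m, random.Random(0)
)
print(n, position)
```

With a real move generator, passes would need handling instead of stopping early.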

The details of the data for each directory are summarized below.

| Directory | AI Name | Level | Number of Games | $N_{min}$ | $N_{max}$ |
|---|---|---|---|---|---|
| 0000_egaroucid_6_3_0_lv11 | Egaroucid for Console 6.3.0 | 11 | 2,000,000 | 10 | 19 |

Released: 2023/07/17