Kensuke Matsuzaki edited this page Sep 1, 2017 · 2 revisions

Train your own Policy Network and Value Network.

Utilities for training are available at https://github.com/zakki/rn-tools

Requirements

  • Windows
  • Cygwin
  • Perl
  • Java

Steps

  1. Convert KGS/GoGod/Tygem SGF files to GTP files with sgfvar2gtp2.sh. The converted files interleave the custom GTP commands _store, _dump, and _clear. Filtering by komi and rank is hard-coded in sgfvar2gtp2.pl.
  2. Run Ray with the converted GTP files to generate 'data.txt'. The resulting data file is 100 GB to 2 TB.
  3. Shuffle and split data.txt using jshuf.
  4. Run CNTK. The model definition is at https://github.com/zakki/Ray/blob/nn/cntk/. Since CNTK can't handle more than 100 GB of data on my PC, I stop CNTK every few epochs and switch data.txt using loop-v20.sh.
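
Step 3 can be pictured with a minimal Python sketch of a two-pass shuffle for a file too large to hold in memory. This is not how jshuf is implemented; the function name `shuffle_and_split`, the shard file naming, and the assumption that each shard fits in memory are all hypothetical and only illustrate the idea.

```python
import os
import random


def shuffle_and_split(src_path, out_dir, num_shards, seed=0):
    """Scatter the lines of src_path into num_shards files, then shuffle each file.

    Hypothetical helper: assumes one training sample per line and that a
    single shard (not the whole input) fits in memory.
    """
    rng = random.Random(seed)
    os.makedirs(out_dir, exist_ok=True)
    paths = [os.path.join(out_dir, "data-%03d.txt" % i) for i in range(num_shards)]

    # Pass 1: stream the big file once and send each line to a random shard.
    shards = [open(p, "w", encoding="utf-8") for p in paths]
    try:
        with open(src_path, "r", encoding="utf-8") as src:
            for line in src:
                if not line.endswith("\n"):
                    line += "\n"  # keep the last line from merging with another
                shards[rng.randrange(num_shards)].write(line)
    finally:
        for f in shards:
            f.close()

    # Pass 2: shuffle each shard in memory and write it back.
    for p in paths:
        with open(p, "r", encoding="utf-8") as f:
            lines = f.readlines()
        rng.shuffle(lines)
        with open(p, "w", encoding="utf-8") as f:
            f.writelines(lines)
    return paths
```

Choosing `num_shards` so that each shard stays under the size the trainer can handle (about 100 GB in the setup described above) gives files that can be swapped in as data.txt one at a time.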

The stop-and-switch workaround in step 4 is needed because of a limitation of CNTK 1. If you use CNTK 2, I think there is a more sophisticated way that uses the Python API.
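
The rotation in step 4 can be sketched as a schedule that assigns a block of epochs to each data shard in turn. The sketch below is an assumption about the general idea, not a transcription of loop-v20.sh; `rotation_schedule` and its parameters are hypothetical names, and the actual CNTK invocation is deliberately left out.

```python
def rotation_schedule(shards, total_epochs, epochs_per_shard):
    """Return (first_epoch, last_epoch, shard) tuples, cycling through shards.

    Hypothetical helper: each tuple means "make this shard the current
    data.txt, then train from first_epoch through last_epoch".
    """
    if not shards:
        raise ValueError("at least one shard is required")
    if epochs_per_shard <= 0:
        raise ValueError("epochs_per_shard must be positive")

    plan = []
    first = 1
    index = 0
    while first <= total_epochs:
        # The final block may be shorter than epochs_per_shard.
        last = min(first + epochs_per_shard - 1, total_epochs)
        plan.append((first, last, shards[index % len(shards)]))
        first = last + 1
        index += 1
    return plan


if __name__ == "__main__":
    for first, last, shard in rotation_schedule(["data-000.txt", "data-001.txt"], 5, 2):
        print("epochs %d-%d: use %s as data.txt" % (first, last, shard))
```

A driver script would walk this plan, replace data.txt with the listed shard, and resume training from the last checkpoint up to the block's final epoch.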

Versions

model3.bin

The "exp-gnugo-value" branch (e.g. Rn.4.9.1) outputs data.txt for model3.bin, which is trained with ResNetV97.bs and a modified TrainV50.cntk.

safety = [
  dim = 2888
  format = "sparse"
]

model2.bin

The "nn" branch (e.g. Rn.4.20) outputs data.txt for model2.bin, which is trained with TrainV50.cntk and ResNetV91b.bs.

model.bin

Rn.3.x outputs data.txt for model.bin, which is trained with ResNetV25.ndl and TrainValueResNetV20.cntk.
