@@ -3,13 +3,13 @@ Minigo: A minimalist Go engine modeled after AlphaGo Zero, built on MuGo

This is a pure Python implementation of a neural-network based Go AI, using
TensorFlow. While inspired by DeepMind's AlphaGo algorithm, this project is not
- a Deepmind project.
+ a DeepMind project, nor is it affiliated with the official AlphaGo project.

### This is NOT an official version of AlphaGo ###

Repeat, *this is not the official AlphaGo program by DeepMind*. This is an
independent effort by Go enthusiasts to replicate the results of the AlphaGo
- Zero paper ("Mastering the Game of Go without Human Knowledge" *Nature*), with
+ Zero paper ("Mastering the Game of Go without Human Knowledge," *Nature*), with
some resources generously made available by Google.

Minigo is based off of Brian Lee's "MuGo" -- a pure Python implementation of the
@@ -44,10 +44,17 @@ establishes itself as the top Go AI. Instead, we strive for a readable,
understandable implementation that can benefit the community, even if that
means our implementation is not as fast or efficient as possible.

+ While this project might produce such a strong model, we hope to focus on the
+ process. Remember, getting there is half the fun :)
+
We hope this project is an accessible way for interested developers to have
access to a strong Go model with an easy-to-understand platform of Python code
available for extension, adaptation, etc.

+ If you'd like to read about our experiences training models, see RESULTS.md.
+
+ To see our guidelines for contributing, see CONTRIBUTING.md.
+
Getting Started
===============
@@ -76,8 +83,8 @@ the dependencies:
pip3 install -r requirements.txt
```

- If you wish to run on GPU you must install CUDA 8.0 or later (see TensorFlow
- documentation).
+ The `requirements.txt` file assumes you'll use a GPU; if you wish to run on GPU
+ you must install CUDA 8.0 or later (see TensorFlow documentation).

If you don't want to run on GPU or don't have one, you can downgrade:
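The downgrade command itself falls outside this hunk; a plausible sketch, assuming the usual split between CPU-only and GPU TensorFlow packages (the package names `tensorflow` and `tensorflow-gpu` are an assumption here, not taken from the diff):

```shell
# Hypothetical downgrade to the CPU-only build; assumes the GPU build
# was installed as tensorflow-gpu and the CPU build as tensorflow.
pip3 uninstall -y tensorflow-gpu
pip3 install tensorflow
```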
83
90
@@ -124,23 +131,29 @@ All commands are compatible with either Google Cloud Storage as a remote file
system, or your local file system. The examples here use GCS, but local file
paths will work just as well.

- To use GCS, set the `BUCKET_NAME` variable and authenticate. Otherwise, all
- commands fetching files from GCS will hang.
+ To use GCS, set the `BUCKET_NAME` variable and authenticate via
+ `gcloud auth application-default login`. Otherwise, all commands fetching files
+ from GCS will hang.

+ For instance, this would set a bucket, authenticate, and then look for the most
+ recent model.

```bash
export BUCKET_NAME=your_bucket;
gcloud auth application-default login
gsutil ls gs://minigo/models | tail -3
```

- Which might look like
+ Which might look like:

```
gs://$BUCKET_NAME/models/000193-trusty.data-00000-of-00001
gs://$BUCKET_NAME/models/000193-trusty.index
gs://$BUCKET_NAME/models/000193-trusty.meta
```

+ These three files comprise the model, and commands that take a model as an
+ argument usually need the path to the model basename, e.g.
+ `gs://$BUCKET_NAME/models/000193-trusty`
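For illustration, the basename can be recovered from any one of the listed files by stripping the TensorFlow checkpoint suffix; the `sed` pattern below is our own sketch, not something `main.py` provides:

```shell
# Recover the model basename by stripping the checkpoint suffix
# (.data-*, .index, or .meta) from a listed filename.
model_file='gs://your_bucket/models/000193-trusty.data-00000-of-00001'
model_basename=$(echo "$model_file" | sed -E 's/\.(data-[0-9a-z-]+|index|meta)$//')
echo "$model_basename"   # gs://your_bucket/models/000193-trusty
```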
+
You'll need to copy them to your local disk. This fragment copies the latest
model to the directory specified by `MINIGO_MODELS`
@@ -212,8 +225,8 @@ Overview
--------

The following sequence of commands will allow you to do one iteration of
- reinforcement learning on 9x9. These are the basic commands used in the
- kubernetified version used to produce the models and games referenced above.
+ reinforcement learning on 9x9. These are the basic commands used to produce the
+ models and games referenced above.

The commands are
- bootstrap: initializes a random model
@@ -231,7 +244,7 @@ This command creates a random model, which appears at .

```bash
export MODEL_NAME=000000-bootstrap
- python3 main.py bootstrap gs://$BUCKET_NAME/models/$MODEL_NAME -n $BOARD_SIZE
+ python3 main.py bootstrap gs://$BUCKET_NAME/models/$MODEL_NAME
```

Self-play
@@ -245,29 +258,28 @@ gs://$BUCKET_NAME/data/selfplay/$MODEL_NAME/local_worker/*.tfrecord.zz
gs://$BUCKET_NAME/sgf/$MODEL_NAME/local_worker/*.sgf
```

- (-n 9 makes 9x9 games)
```bash
python3 main.py selfplay gs://$BUCKET_NAME/models/$MODEL_NAME \
  --readouts 10 \
-   --games 8 \
-   -v 3 -n 9 \
+   -v 3 \
  --output-dir=gs://$BUCKET_NAME/data/selfplay/$MODEL_NAME/local_worker \
  --output-sgf=gs://$BUCKET_NAME/sgf/$MODEL_NAME/local_worker
```
- (-n 9 makes it play 9x9 games)

Gather
------

- This command takes multiple tfrecord.zz files (which will probably be KBs in size)
- and shuffles them into tfrecord.zz files that are ~100 MB in size.
-
```
python3 main.py gather
```

+ This command takes multiple tfrecord.zz files (which will probably be KBs in size)
+ and shuffles them into tfrecord.zz files that are ~100 MB in size.
+
Gathering is done according to model numbers, so that games generated by
- one model stay together. The output will be in the directories
+ one model stay together. By default, `rl_loop.py` will use directories
+ specified by the environment variable `BUCKET_NAME`, set at the top of
+ `rl_loop.py`.

```
gs://$BUCKET_NAME/data/training_chunks/$MODEL_NAME-{chunk_number}.tfrecord.zz
@@ -295,7 +307,6 @@ python3 main.py train gs://$BUCKET_NAME/data/training_chunks \
  --load-file=gs://$BUCKET_NAME/models/000000-bootstrap \
  --generation-num=1 \
  --logdir=path/to/tensorboard/logs \
- -n 9
```

The updated model weights will be saved at the end. (TODO: implement some sort
@@ -313,8 +324,7 @@ Running Minigo on a Cluster

As you might notice, playing games is fairly slow. One way to speed up playing
games is to run Minigo on many computers simultaneously. Minigo was originally
trained by containerizing these worker jobs and running them on a Kubernetes
- cluster, hosted on the Google Cloud Platform (TODO: links for installing GCP
- SDK, kubectl, etc.)
+ cluster, hosted on the Google Cloud Platform.

*NOTE* These commands will result in VMs being created and will result in
charges to your GCP account! *Proceed with care!*
@@ -371,7 +381,7 @@ Bringing up a cluster
---------------------

0. Switch to the `cluster` directory
- 1. Set the common environment variables in `common` corresponding to your GCP project and bucket names.
+ 1. Set the common environment variables in `common.sh` corresponding to your GCP project and bucket names.
2. Run `deploy`, which will:
   a. Create a bucket
   b. Create a service account
@@ -476,8 +486,8 @@ To kill the job,
envsubst < player.yaml | kubectl delete -f -
```

- Preflight checks for a training run.
- ====================================
+ Preflight checklist for a training run.
+ =======================================

Setting up the selfplay cluster
@@ -519,7 +529,7 @@ Setting up the selfplay cluster
Useful things for the selfplay cluster
--------------------------------------

- * Getting a list of the selfplay games ordered by most recent start
+ * Getting a list of the selfplay games ordered by start time.
```
kubectl get po --sort-by=.status.startTime
```
@@ -537,13 +547,9 @@ Useful things for the selfplay cluster
```

- Setting up logging via stackdriver, plus metrics, bla bla.
-
-
- If you've run rsync to collect a set of SGF files (cheatsheet: `python3
- rl_loop.py smart-rsync --source-dir="gs://$BUCKET_NAME/sgf/" --from-model-num 0
- --game-dir=sgf/`), here are some handy
- bashisms to run on them:
+ If you've run rsync to collect a set of SGF files (cheatsheet: `gsutil -m cp -r
+ gs://$BUCKET_NAME/sgf/$MODEL_NAME sgf/`), here are some handy
+ bash fragments to run on them:

* Find the proportion of games won by one color:
```
@@ -567,7 +573,5 @@ bashisms to run on them:
\; | ministat
```

-
- etc...
-
-
+ Also check the 'oneoffs' directory for interesting scripts to analyze e.g. the
+ resignation threshold.
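As an example of the kind of fragment meant here (we assume results live in standard SGF `RE[]` tags; this is our own sketch, not a command from the repo), the proportion of games won by black could be computed like this:

```shell
# Proportion of games won by black, judged by the SGF RE[] result tag
# (RE[B+...] marks a black win, RE[W+...] a white win).
total=$(ls sgf/*.sgf | wc -l)
black=$(grep -l 'RE\[B+' sgf/*.sgf | wc -l)
echo "black won $black of $total games"
```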