AlphaGo.jl is a pure Julia implementation of AlphaGo Zero using Flux.jl.
To install this package, run

```julia
pkg> add https://github.com/tejank10/AlphaGo.jl
```

and then load it with

```julia
using AlphaGo
```
Creating a Go environment is simple:

```julia
env = GoEnv(9)
```

Here `9` is the size of the board, i.e., a 9x9 board is created:
```
   A B C D E F G H J
 9 . . . . . . . . . 9
 8 . . . . . . . . . 8
 7 . . . . . . . . . 7
 6 . . . . . . . . . 6
 5 . . . . . . . . . 5
 4 . . . . . . . . . 4
 3 . . . . . . . . . 3
 2 . . . . . . . . . 2
 1 . . . . . . . . . 1
   A B C D E F G H J
Move: 0. Captures X: 0 O: 0
To Play: X(BLACK)
```
Training is done using the `train` method, which trains the model based on the following parameters:

- `env`: the Go environment
- `num_games`: Number of self-play games to be played

Optional arguments:

- `memory_size`: Size of the memory buffer
- `batch_size`: Size of the training batches
- `epochs`: Number of epochs to train the data on
- `ckp_freq`: Frequency of saving the model and weights
- `tower_height`: The AlphaGo Zero architecture stacks residual networks together into a so-called tower of residual networks; `tower_height` specifies how many residual blocks are stacked (see the sketch after this list)
- `model`: Object of type `NeuralNet`
- `readouts`: Number of readouts by `MCTSPlayer`
- `start_training_after`: Number of games after which training will be started
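A minimal usage sketch is shown below. The keyword names follow the list above, but the exact signature (positional vs. keyword arguments) and defaults may differ from the package's actual API, and the numeric values are placeholders:

```julia
using AlphaGo

env = GoEnv(9)  # 9x9 Go environment

# Sketch only: argument names follow the parameter list above;
# the exact signature and default values may differ.
train(env, 25;                       # 25 self-play games
      memory_size          = 500_000,
      batch_size           = 32,
      epochs               = 1,
      ckp_freq             = 10,
      tower_height         = 19,
      readouts             = 800,
      start_training_after = 5)
```

For intuition about `tower_height`, the sketch below shows one residual block in the AlphaGo Zero style (convolution, batch norm, ReLU, convolution, batch norm, skip connection) written with Flux, and a tower built by stacking `tower_height` such blocks. This is illustrative only and is not the package's actual `NeuralNet` implementation:

```julia
using Flux

# Illustrative residual block: two 3x3 convolutions with batch
# normalisation and a skip connection, followed by a ReLU.
function residual_block(channels)
    layers = Chain(
        Conv((3, 3), channels => channels, pad = 1),
        BatchNorm(channels, relu),
        Conv((3, 3), channels => channels, pad = 1),
        BatchNorm(channels))
    Chain(SkipConnection(layers, +), x -> relu.(x))
end

# A tower of `tower_height` residual blocks stacked together.
tower(channels, tower_height) = Chain((residual_block(channels) for _ in 1:tower_height)...)
```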
The network can be tested by playing against a human using the `play` method. `play` takes the following arguments:

- `env`
- `nn`: an object of type `NeuralNet`
- `tower_height`: number of residual blocks stacked in the network
- `num_readouts`: number of readouts by `MCTSPlayer`
- `mode`: specifies whether the human plays as Black or White. If `mode` is 0 the human plays Black, otherwise White.
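A minimal sketch of such a call, assuming `env` and a trained `nn::NeuralNet` from the training step above and keyword arguments matching the list (the exact signature may differ):

```julia
# Sketch only: exact argument passing may differ in the package.
play(env, nn; tower_height = 19, num_readouts = 800, mode = 0)  # human plays Black
```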
Still to be done:

- Structure for memory buffer
- GUI