NeuroHex: A Deep Q-learning Hex Agent

arXiv: 1604.07097 · v2 · pith:TVCPIQMV · submitted 2016-04-24 · cs.AI

Kenny Young, Ryan Hayward, Gautham Vasan

keywords: deep, game, neurohex, q-learning, learning, player, second, achieves

DeepMind's recent spectacular success in using deep convolutional neural nets and machine learning to build superhuman-level agents (e.g. for Atari games via deep Q-learning and for the game of Go via reinforcement learning) raises many questions, including to what extent these methods will succeed in other domains. In this paper we consider deep Q-learning (DQL) for the game of Hex: after supervised initialization, we use self-play to train NeuroHex, an 11-layer CNN that plays Hex on the 13x13 board. Hex is the classic two-player alternate-turn stone placement game played on a rhombus of hexagonal cells in which the winner is whoever connects their two opposing sides. Despite the large action and state space, our system trains a Q-network capable of strong play with no search. After two weeks of Q-learning, NeuroHex achieves win rates of 20.4% as first player and 2.1% as second player against a 1-second/move version of MoHex, the current ICGA Olympiad Hex champion. Our data suggest further improvement might be possible with more training time.
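
The winning condition described in the abstract (a player wins by connecting their two opposing sides of the rhombus) can be illustrated with a short sketch. This is not code from the paper; it is a minimal breadth-first search under assumed conventions: the board is a square list of lists indexed by (row, column), 0 marks an empty cell, player 1 is assumed to join the top and bottom rows, and player 2 the left and right columns. The name `has_won` is hypothetical.

```python
from collections import deque

# The six neighbours of a hexagonal cell, using (row, col) coordinates
# on a rhombus-shaped board.
NEIGHBOURS = [(-1, 0), (1, 0), (0, -1), (0, 1), (-1, 1), (1, -1)]


def has_won(board, player):
    """Return True if `player` has a chain of stones joining their two sides.

    Assumed convention (not taken from the paper): player 1 joins the top
    and bottom rows, player 2 joins the left and right columns, 0 is empty.
    """
    n = len(board)
    if player == 1:
        starts = [(0, c) for c in range(n) if board[0][c] == player]
    else:
        starts = [(r, 0) for r in range(n) if board[r][0] == player]

    seen = set(starts)
    queue = deque(starts)
    while queue:
        r, c = queue.popleft()
        # Player 1 must reach the last row, player 2 the last column.
        if (r if player == 1 else c) == n - 1:
            return True
        for dr, dc in NEIGHBOURS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < n and 0 <= nc < n
                    and (nr, nc) not in seen
                    and board[nr][nc] == player):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return False
```

On the 13x13 board used in the paper, `board` would be a 13-by-13 grid; the search visits each cell at most once, so the check is linear in the number of cells.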
