A hybridisation technique for game playing using the upper confidence for trees algorithm with artificial neural networks

Burger, Clayton

Title: A hybridisation technique for game playing using the upper confidence for trees algorithm with artificial neural networks
Creator: Burger, Clayton
Subject: Neural networks (Computer science)
Subject: Computer algorithms
Date Issued: 2014
Date: 2014
Type: Thesis
Type: Masters
Type: MSc
Identifier: http://hdl.handle.net/10948/3957
Identifier: vital:20495
Description: In the domain of strategic game playing, the use of statistical techniques such as the Upper Confidence for Trees (UCT) algorithm has become the norm, as they offer many benefits over classical algorithms. These benefits include time-scalable performance and no requirement for game-specific strategic knowledge. UCT does not incorporate any strategic information specific to the game considered, but instead uses repeated sampling to effectively brute-force search through the game tree or search space. The lack of game-specific knowledge in UCT is thus both a benefit and a strategic disadvantage. Pattern recognition techniques, specifically Neural Networks (NNs), were identified as a means of addressing the lack of game-specific knowledge in UCT. Through a novel hybridisation technique which combines UCT and trained NNs for pruning, the UCT-NN algorithm was derived. The NN component of UCT-NN was trained using a UCT self-play scheme to generate game-specific knowledge without the need to construct and manage game databases for training purposes. The UCT-NN algorithm is outlined for pruning in the game of Go-Moku as a candidate case study for this research. The UCT-NN algorithm contained three major parameters, which emerged from the UCT algorithm, the use of NNs and the pruning schemes considered. Suitable methods for finding candidate values for these three parameters were outlined and applied to the game of Go-Moku on a 5 by 5 board. An empirical investigation of the playing performance of UCT-NN was conducted in comparison to UCT through three benchmarks. The benchmarks comprise a common randomly moving opponent, a common UCTmax player which is given a large amount of playing time, and a pair-wise tournament between UCT-NN and UCT. The results of the performance evaluation for 5 by 5 Go-Moku were promising, which prompted an evaluation on a larger 9 by 9 Go-Moku board.
The results of both evaluations indicate that the time allocated to the UCT-NN algorithm directly affects its performance when compared to UCT. The UCT-NN algorithm generally performs better than UCT in games with very tight time constraints in all benchmarks considered, except when playing against a randomly moving player in 9 by 9 Go-Moku. In real-time and near-real-time Go-Moku games, UCT-NN provides statistically significant improvements compared to UCT. The findings of this research contribute to the realisation of applying game-specific knowledge to the UCT algorithm.
Format: xviii, 188 leaves
Format: pdf
Publisher: Nelson Mandela Metropolitan University
Publisher: Faculty of Science
Language: English
Rights: Nelson Mandela Metropolitan University

Collections

NMMU School of Computer Science, Mathematics, Physics and Statistics

File	Description	Size	Format
SOURCE1	A hybridisation technique for game playing using the upper confidence for trees algorithm with artificial neural networks	5 MB	Adobe Acrobat PDF