Augmenting the Moore-Penrose generalised Inverse to train neural networks

Fang, Bobby

Title: Augmenting the Moore-Penrose generalised Inverse to train neural networks
Creator: Fang, Bobby
Subject: Neural networks (Computer science)
Subject: Machine learning
Subject: Mathematical optimization -- Computer programs
Date Issued: 2024-04
Date: 2024-04
Type: Master's theses
Type: text
Identifier: http://hdl.handle.net/10948/63755
Identifier: vital:73595
Description: An Extreme Learning Machine (ELM) is a fast, non-iterative training algorithm for feedforward neural networks which uses the Moore-Penrose (MP) generalised inverse of a matrix to compute the weights of the output layer, using a random initialisation for the hidden layer. While ELM has been used to train feedforward neural networks, the effectiveness of the MP generalised inverse for training recurrent neural networks is yet to be investigated. The primary aim of this research was to investigate how biases in the output layer and the MP generalised inverse can be used to train recurrent neural networks. To accomplish this, the Bias Augmented ELM (BA-ELM), which concatenates the hidden layer output matrix with a column vector of ones to simulate the biases in the output layer, was proposed. A variety of datasets generated from optimisation test functions, as well as real-world regression and classification datasets, were used to validate BA-ELM. The results showed that, in specific circumstances, BA-ELM was able to perform better than ELM. Following this, Recurrent ELM (R-ELM) was proposed, which uses a recurrent hidden layer instead of a feedforward hidden layer. Recurrent neural networks also rely on having functional feedback connections in the recurrent layer. A hybrid training algorithm, Recurrent Hybrid ELM (R-HELM), was therefore proposed, which uses a gradient-based algorithm to optimise the recurrent layer and the MP generalised inverse to compute the output weights. The R-ELM and R-HELM algorithms were evaluated using three different recurrent architectures on two recurrent tasks derived from the Susceptible-Exposed-Infected-Removed (SEIR) epidemiology model. Various training hyperparameters were investigated to determine their effect on the hybrid training algorithm. With optimal hyperparameters, the hybrid training algorithm was able to achieve better performance than the conventional gradient-based algorithm.
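The ELM and BA-ELM steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not taken from the thesis: the function names `train_elm` and `predict_elm`, the `tanh` activation, the Gaussian random initialisation and the toy data are all assumptions made for illustration. Only the two ideas named in the abstract are reproduced, namely solving for the output weights with the Moore-Penrose generalised inverse and, for BA-ELM, appending a column of ones to the hidden layer output matrix.

```python
import numpy as np


def hidden_output(X, W, b, augment_bias):
    """Hidden layer output matrix H (tanh is an assumed activation)."""
    H = np.tanh(X @ W + b)
    if augment_bias:
        # BA-ELM idea: append a ones column so the output layer gains a bias.
        H = np.hstack([H, np.ones((H.shape[0], 1))])
    return H


def train_elm(X, y, n_hidden=10, augment_bias=False, seed=0):
    """Hypothetical helper: random hidden layer, pseudoinverse output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases
    H = hidden_output(X, W, b, augment_bias)
    beta = np.linalg.pinv(H) @ y                 # least-squares output weights
    return W, b, beta


def predict_elm(X, W, b, beta, augment_bias=False):
    """Apply the same hidden layer, then the solved output weights."""
    return hidden_output(X, W, b, augment_bias) @ beta


if __name__ == "__main__":
    # Toy regression target with a constant offset (illustrative data only).
    rng = np.random.default_rng(1)
    X = rng.uniform(-1.0, 1.0, size=(200, 3))
    y = np.sin(X.sum(axis=1, keepdims=True)) + 3.0

    for augment in (False, True):
        W, b, beta = train_elm(X, y, augment_bias=augment)
        mse = np.mean((predict_elm(X, W, b, beta, augment) - y) ** 2)
        print(f"augment_bias={augment}: training MSE = {mse:.6f}")
```

With the same seed, both variants share the same random hidden layer, so the augmented model's training error cannot exceed that of the plain model (up to numerical tolerance). This says nothing about test performance, which is what the thesis evaluates.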
Description: Thesis (MSc) -- Faculty of Science, School of Computer Science, Mathematics, Physics and Statistics, 2024
Format: computer
Format: online resource
Format: application/pdf
Format: 1 online resource (xiii, 135 pages)
Format: pdf
Publisher: Nelson Mandela University
Publisher: Faculty of Science
Language: English
Rights: Nelson Mandela University
Rights: All Rights Reserved
Rights: Open Access

Collections

NMU School of Computer Science, Mathematics, Physics and Statistics
