Protein secondary structure prediction using neural networks and support vector machines
- Authors: Tsilo, Lipontseng Cecilia
- Date: 2009
- Subjects: Neural networks (Computer science) , Support vector machines , Proteins -- Structure -- Mathematical models
- Language: English
- Type: Thesis , Masters , MSc
- Identifier: vital:5569 , http://hdl.handle.net/10962/d1002809
- Description: Predicting the secondary structure of proteins is important in biochemistry because the 3D structure can be determined from the local folds that are found in secondary structures. Moreover, knowing the tertiary structure of proteins can assist in determining their functions. The objective of this thesis is to compare the performance of Neural Networks (NN) and Support Vector Machines (SVM) in predicting the secondary structure of 62 globular proteins from their primary sequence. For each of NN and SVM, we created six binary classifiers to distinguish between the classes helix (H), strand (E), and coil (C). For NN we use Resilient Backpropagation training with and without early stopping. We use NN with either no hidden layer or one hidden layer with 1, 2, ..., 40 hidden neurons. For SVM we use a Gaussian kernel with the kernel parameter fixed at 0.1 and the cost parameter C varied over the range [0.1, 5]. 10-fold cross-validation is used to obtain overall estimates for the probability of making a correct prediction. Our experiments indicate that, for both NN and SVM, the different binary classifiers have varying accuracies: from 69% correct predictions for coil vs. non-coil up to 80% correct predictions for strand vs. non-strand. We further demonstrate that NN with no hidden layer, or with no more than 2 neurons in the hidden layer, are sufficient for better predictions. For SVM we show that the estimated accuracies do not depend on the value of the cost parameter. As a major result, we demonstrate that the accuracy estimates of the NN and SVM binary classifiers cannot be distinguished. This contradicts a current belief in bioinformatics that SVM outperforms other predictors.
- Date Issued: 2009
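The evaluation setup described in the abstract above (six binary classifiers over the classes H, E and C, scored by 10-fold cross-validation) can be illustrated with a minimal Python sketch. The abstract does not say which six class pairings were used or how the folds were drawn, so the sketch assumes three one-vs-rest tasks plus three one-vs-one tasks and contiguous folds; the function names `binary_tasks`, `relabel`, `k_fold_indices` and `cv_accuracy` are hypothetical and do not come from the thesis.

```python
from itertools import combinations

CLASSES = ("H", "E", "C")  # helix, strand, coil


def binary_tasks(classes=CLASSES):
    """Return six (name, positive, negative) tasks.

    Assumption: three one-vs-rest tasks plus three one-vs-one tasks.
    """
    tasks = []
    for c in classes:
        rest = tuple(x for x in classes if x != c)
        tasks.append((f"{c}/~{c}", (c,), rest))
    for a, b in combinations(classes, 2):
        tasks.append((f"{a}/{b}", (a,), (b,)))
    return tasks


def relabel(labels, positive, negative):
    """Map three-class labels to (index, +1/-1) pairs.

    Positions whose label is in neither set are dropped, which only
    happens for the one-vs-one tasks.
    """
    out = []
    for i, lab in enumerate(labels):
        if lab in positive:
            out.append((i, 1))
        elif lab in negative:
            out.append((i, -1))
    return out


def k_fold_indices(n, k=10):
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for f in range(k):
        size = base + (1 if f < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cv_accuracy(correct_per_fold, fold_sizes):
    """Pooled estimate of the probability of a correct prediction."""
    return sum(correct_per_fold) / sum(fold_sizes)
```

As a usage example, `k_fold_indices(62)` would split 62 items into ten folds of six or seven items each; whether the thesis formed its folds over proteins or over individual residues is not stated in the abstract.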
An analysis of neural networks and time series techniques for demand forecasting
- Authors: Winn, David
- Date: 2007
- Subjects: Time-series analysis , Neural networks (Computer science) , Artificial intelligence , Marketing -- Management , Marketing -- Data processing , Marketing -- Statistical methods , Consumer behaviour
- Language: English
- Type: Thesis , Masters , MCom
- Identifier: vital:5572 , http://hdl.handle.net/10962/d1004362
- Description: This research examines the feasibility of developing demand forecasting techniques that can predict demand consistently and accurately. Both Time Series Techniques and Artificial Neural Networks are investigated, using deodorant sales in South Africa as the case study. Marketing techniques used to influence consumer buying behaviour are considered, and these factors are integrated into the forecasting models wherever possible. The results suggest that Artificial Neural Networks can be developed which consistently outperform both industry forecasting targets and Time Series forecasts, suggesting that producers could reduce costs by adopting this more effective method.
- Date Issued: 2007