Bayesian logistic regression models for credit scoring
- Authors: Webster, Gregg
- Date: 2011
- Subjects: Bayesian statistical decision theory , Credit scoring systems , Regression analysis , Logistic regression analysis , Monte Carlo method , Markov processes , Financial institutions
- Language: English
- Type: Thesis , Masters , MCom
- Identifier: vital:5574 , http://hdl.handle.net/10962/d1005538
- Description: The Bayesian approach to logistic regression modelling for credit scoring is useful when there are data quantity issues. Data quantity issues might occur when a bank is opening in a new location or there is a change in the scoring procedure. Making use of prior information (available from the coefficients estimated on other data sets, or from expert knowledge about the coefficients), a Bayesian approach is proposed to improve the credit scoring models. To achieve this, a data set is split into two sets, “old” data and “new” data. Priors are obtained from a model fitted on the “old” data. This model is assumed to be a scoring model used by a financial institution in the current location. The financial institution is then assumed to expand into a new economic location where there is limited data. The priors from the model on the “old” data are then combined in a Bayesian model with the “new” data to obtain a model which represents all the available information. The predictive performance of this Bayesian model is compared to that of a model which does not make use of any prior information. It is found that the use of relevant prior information improves the predictive performance when the size of the “new” data is small. As the size of the “new” data increases, the importance of including prior information decreases.
- Date Issued: 2011
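The procedure outlined in the description above (priors taken from a model fitted on the “old” data, then combined with a small “new” sample) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the listed subjects suggest Markov chain Monte Carlo, whereas this sketch swaps in a simpler maximum a posteriori (MAP) estimate under independent Gaussian priors. The function names, prior values, step size, and six-row data set are all hypothetical.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_map(X, y, prior_mean, prior_var, steps=20000, lr=0.005):
    """MAP logistic regression: gradient ascent on log-likelihood + Gaussian log-prior.

    Assumption: lr is small relative to the smallest prior variance,
    otherwise the prior term makes the iteration unstable.
    """
    beta = list(prior_mean)  # start at the prior mean
    for _ in range(steps):
        # Gradient of the Gaussian log-prior: -(beta_j - m_j) / v_j
        grad = [-(b - m) / v for b, m, v in zip(beta, prior_mean, prior_var)]
        # Gradient of the Bernoulli log-likelihood: sum_i (y_i - p_i) * x_ij
        for xi, yi in zip(X, y):
            p = sigmoid(sum(b * x for b, x in zip(beta, xi)))
            for j, x in enumerate(xi):
                grad[j] += (yi - p) * x
        beta = [b + lr * g for b, g in zip(beta, grad)]
    return beta

# Hypothetical "new" data: an intercept column plus one feature.
X_new = [[1.0, -2.0], [1.0, -1.0], [1.0, -0.5],
         [1.0, 0.5], [1.0, 1.0], [1.0, 2.0]]
y_new = [0, 0, 1, 0, 1, 1]

# Hypothetical coefficients standing in for a model fitted on "old" data.
prior_mean = [-1.0, 3.0]

# Tight prior: the "old" model is trusted; nearly flat prior: it is ignored.
informative = fit_map(X_new, y_new, prior_mean, prior_var=[0.01, 0.01])
vague = fit_map(X_new, y_new, prior_mean, prior_var=[1e6, 1e6])
```

With the tight prior the estimate stays near the “old” coefficients, while the nearly flat prior approaches the ordinary maximum-likelihood fit on the “new” data alone. As more “new” rows are added, the likelihood terms grow and the prior's pull weakens.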
Cointegration in equity markets: a comparison between South African and major developed and emerging markets
- Authors: Petrov, Pavel
- Date: 2011
- Subjects: Cointegration , Stock exchanges -- South Africa , Stock exchanges -- Developing countries , Stock exchanges -- Developed countries , South Africa -- Economic conditions , Portfolio management -- South Africa , Econometrics , Autoregression (Statistics)
- Language: English
- Type: Thesis , Masters , MCom
- Identifier: vital:5575 , http://hdl.handle.net/10962/d1005539
- Description: Cointegration has important implications for portfolio diversification. One of these is that, in order to spread risk, it is advisable to invest in markets that are not cointegrated. Over the last several decades communication technology has made the world a smaller place, and hence cointegration in equity markets has become more prevalent. The bulk of research into cointegration focuses on developed and Asian markets, with little research having been done on African markets. This study compares the Engle-Granger and Johansen tests for cointegration and uses them to assess the level of cointegration between South African and other global equity markets. Each market is compared pair-wise with South Africa, and the results show that, in general, South Africa is cointegrated with other emerging markets but largely not with African or developed markets. Short-run analysis with the error correction model was carried out and showed that, in general, markets respond slowly to any disequilibrium. Innovation accounting methods showed that the country placed first in the Cholesky ordering dominates the other. Multivariate cointegration analysis was carried out using three selections of 4-, 6- and 8-market portfolios. One of the markets was SA, and the others were all chosen on the criterion that they are not pair-wise cointegrated with SA. The level of cointegration varied depending on the portfolio, as did the error correction rates, impulse responses and variance decompositions. The one constant was that the USA dominated any portfolio into which it was introduced. Finally, recommendations were made about which market portfolio an investor should consider most favourable.
- Date Issued: 2011
Improved tree species discrimination at leaf level with hyperspectral data combining binary classifiers
- Authors: Dastile, Xolani Collen
- Date: 2011
- Subjects: Mathematical statistics , Analysis of variance , Nearest neighbor analysis (Statistics) , Trees--Classification
- Language: English
- Type: Thesis , Masters , MSc
- Identifier: vital:5567 , http://hdl.handle.net/10962/d1002807
- Description: The purpose of the present thesis is to show that hyperspectral data can be used for discrimination between different tree species. The data set used in this study contains the hyperspectral measurements of leaves of seven savannah tree species. The data is high-dimensional and shows large within-class variability combined with small between-class variability, which makes discrimination between the classes challenging. We employ two classification methods: k-nearest neighbour and feed-forward neural networks. For both methods, direct 7-class prediction results in high misclassification rates. However, binary classification works better. We construct binary classifiers for all possible binary classification problems and combine them with Error Correcting Output Codes. We show in particular that the use of 1-nearest neighbour binary classifiers results in no improvement compared to a direct 1-nearest neighbour 7-class predictor. In contrast to this negative result, the use of neural network binary classifiers improves accuracy by 10% compared to a direct neural network 7-class predictor, and error rates become acceptable. This can be further improved by choosing only suitable binary classifiers for combination.
- Date Issued: 2011
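The combination step mentioned in the description above, Error Correcting Output Codes, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the thesis's code: each class is assigned a binary codeword, each codeword position corresponds to one binary classifier, and a sample is assigned to the class whose codeword is closest in Hamming distance to the classifiers' outputs. The class labels, the code matrix, and the function names are hypothetical. For brevity the sketch uses four classes with a 7-column exhaustive code; the same construction for seven classes would give 2^(7-1) - 1 = 63 columns.

```python
def hamming(a, b):
    # Number of positions at which two equal-length bit sequences differ.
    return sum(x != y for x, y in zip(a, b))

def ecoc_decode(code_matrix, outputs):
    """Return the label whose codeword is nearest to the binary outputs."""
    return min(code_matrix, key=lambda label: hamming(code_matrix[label], outputs))

# Hypothetical exhaustive code for four classes. Each column defines one
# binary problem: classes marked 1 versus classes marked 0.
# Any two rows differ in 4 positions, so a single wrong classifier
# output can still be corrected.
CODE_MATRIX = {
    "species_a": [1, 1, 1, 1, 1, 1, 1],
    "species_b": [0, 0, 0, 0, 1, 1, 1],
    "species_c": [0, 0, 1, 1, 0, 0, 1],
    "species_d": [0, 1, 0, 1, 0, 1, 0],
}

# Outputs of the seven binary classifiers for one leaf sample, with the
# last classifier making a mistake relative to the species_c codeword.
outputs = [0, 0, 1, 1, 0, 0, 0]
predicted = ecoc_decode(CODE_MATRIX, outputs)
```

Because decoding picks the nearest codeword, the quality of the individual binary classifiers still matters: if several of them err on the same sample, the nearest codeword may belong to the wrong class.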