An Evaluation of Machine Learning Methods for Classifying Bot Traffic in Software Defined Networks
- Authors: Van Staden, Joshua , Brown, Dane L
- Date: 2023
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/463357 , vital:76402 , xlink:href="https://link.springer.com/chapter/10.1007/978-981-19-7874-6_72"
- Description: Internet security is an ever-expanding field. Cyber-attacks occur very frequently, and so detecting them is an important aspect of preserving services. Machine learning offers a helpful tool with which to detect cyber-attacks. However, it is impossible to deploy a machine learning algorithm to detect attacks in a non-centralized network. Software Defined Networks (SDNs) offer a centralized view of a network, allowing machine learning algorithms to detect malicious activity within a network. The InSDN dataset is a recently released dataset that contains a set of sniffed packets within a virtual SDN. These sniffed packets correspond to various attacks, including DDoS attacks, Probing and Password-Guessing, among others. This study aims to evaluate various machine learning models against this new dataset. Specifically, we aim to evaluate their classification ability and runtimes when trained on fewer features. The machine learning models tested include a Neural Network, Support Vector Machine, Random Forest, Multilayer Perceptron, Logistic Regression, and K-Nearest Neighbours. Cluster-based algorithms such as the K-Nearest Neighbour and Random Forest proved to be the best performers. Linear-based algorithms such as the Multilayer Perceptron performed the worst. This suggests a good level of clustering in the top few features with little space for linear separability. The reduction of features significantly reduced training time, particularly in the better-performing models.
- Date Issued: 2023
Darknet Traffic Detection Using Histogram-Based Gradient Boosting
- Authors: Brown, Dane L , Sepula, Chikondi
- Date: 2023
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/464063 , vital:76472 , xlink:href="https://link.springer.com/chapter/10.1007/978-981-99-1624-5_59"
- Description: The network security sector has observed a rise in severe attacks emanating from the darknet or encrypted networks in recent years. Network intrusion detection systems (NIDS) capable of detecting darknet or encrypted traffic must be developed to increase system security. Machine learning algorithms can effectively detect darknet activities when trained on encrypted and conventional network data. However, the performance of the system may be influenced, among other things, by the choice of machine learning models, data preparation techniques, and feature selection methodologies. The histogram-based gradient boosting strategy known as categorical boosting (CatBoost) was tested to see how well it could detect darknet traffic. The performance of the model was examined using feature selection strategies such as correlation coefficient, variance threshold, SelectKBest, and recursive feature elimination (RFE). Following the categorization of traffic as “darknet” or “regular”, a multi-class classification was used to determine the software application associated with the traffic. Further study was carried out on well-known machine learning methods such as random forests (RF), decision trees (DT), linear support vector classifier (SVC Linear), and long short-term memory (LSTM). The proposed model achieved good results with 98.51% binary classification accuracy and 88% multi-class classification accuracy.
- Date Issued: 2023
Improving licence plate detection using generative adversarial networks
- Authors: Boby, Alden , Brown, Dane L
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/464145 , vital:76480 , xlink:href="https://link.springer.com/chapter/10.1007/978-3-031-04881-4_47"
- Description: The information on a licence plate is used for traffic law enforcement, access control, surveillance and parking lot management. Existing licence plate recognition systems work with clear images taken under controlled conditions. In real-world licence plate recognition scenarios, images are not as straightforward as the ‘toy’ datasets used to benchmark existing systems. Real-world data is often noisy as it may contain occlusion and poor lighting, obscuring the information on a licence plate. Cleaning input data before using it for licence plate recognition is a complex problem, and existing literature addressing the issue is still limited. This paper uses two deep learning techniques to improve licence plate visibility towards more accurate licence plate recognition. A one-stage object detector popularly known as YOLO is implemented for locating licence plates under challenging situations. Super-resolution generative adversarial networks are considered for image upscaling and reconstruction to improve the clarity of low-quality input. The main focus involves training these systems on datasets that include difficult-to-detect licence plates, enabling better performance in unfavourable conditions and environments.
- Date Issued: 2022