Organizations such as government departments and financial institutions provide online services that are accessible from a growing number of internet-connected devices, which makes their operational environments vulnerable to cyber attacks. Consequently, mechanisms are needed to detect cyber security attacks in a timely manner. A variety of Network Intrusion Detection Systems (NIDS) have been proposed; they can be categorized as signature-based or anomaly-based. Signature-based NIDS, which identify misuse by matching activity signatures against a list of known attacks, are criticized for their inability to identify new (never-before-seen) attacks. Anomaly-based NIDS declare a connection anomalous if it deviates from a trained model; among them, those built on unsupervised learning algorithms circumvent this limitation because they can identify new attacks. In this study, we use an unsupervised learning algorithm based on principal component analysis to detect cyber attacks. In the training phase, our approach has the added advantage of identifying outliers in the training dataset. In the monitoring phase, our approach first identifies the affected dimensions and then calculates an anomaly score by aggregating across only those components that are affected by the anomalies. We explore the performance of the algorithm via simulations and through two applications: the UNSW-NB15 dataset recently released by the Australian Centre for Cyber Security, and the well-known KDD'99 dataset. The algorithm scales to large datasets in both the training and monitoring phases, and the results from both the simulated and real datasets show that the method has promise in detecting suspicious network activities.
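To make the monitoring idea concrete, the following is a minimal sketch of PCA-based scoring in which only the components flagged as affected contribute to the anomaly score. It is an illustration under stated assumptions, not the algorithm evaluated in the study: the function names (`fit_pca`, `anomaly_score`), the per-component cutoff of 3.0, and the sum-of-squares aggregation are all hypothetical choices, and the training-phase outlier identification is not shown.

```python
import numpy as np


def fit_pca(X):
    """Fit a PCA model on standardized training data (rows = connections)."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0  # avoid division by zero for constant features
    Z = (X - mu) / sigma
    # Eigen-decomposition of the covariance of the standardized data.
    eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1]  # sort components by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1e-10  # drop numerically degenerate components
    return {"mu": mu, "sigma": sigma, "vecs": eigvecs[:, keep], "vals": eigvals[keep]}


def anomaly_score(model, X, threshold=3.0):
    """Score new observations using only the components flagged as affected.

    `threshold` is an assumed cutoff on the standardized component score;
    the source does not specify how affected dimensions are selected.
    """
    Z = (X - model["mu"]) / model["sigma"]
    # Project onto the principal components and scale each to unit variance.
    S = (Z @ model["vecs"]) / np.sqrt(model["vals"])
    affected = np.abs(S) > threshold
    # Aggregate squared scores across the affected components only.
    score = np.where(affected, S ** 2, 0.0).sum(axis=1)
    return score, affected
```

Restricting the sum to flagged components keeps the many unaffected components from diluting the score with noise, which is the intuition behind the aggregation step described above. Both functions rely on a single eigen-decomposition and matrix products, so the sketch remains inexpensive on large datasets.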