Cyber-attacks continue to grow in both volume and sophistication, aided by increasing computational power, expanding attack surfaces, and a better understanding of how to make attacks undetectable. Unsurprisingly, machine learning is used to defend against these attacks. In many applications, the choice of features matters more than the choice of model. A range of studies have, with varying degrees of success, attempted to discriminate between benign traffic and well-known cyber-attacks. The features used in these studies are broadly similar and have proven effective when cyber-attacks do not imitate benign behaviour; they are less suited to attacks that do. To overcome this barrier, in this manuscript, we introduce new features based on a higher level of abstraction of network traffic. Specifically, we perform flow aggregation by grouping similar flows. This additional level of feature abstraction draws on cumulative information, enabling models to classify cyber-attacks that mimic benign traffic. The performance of the new features is evaluated on the benchmark CICIDS2017 dataset, and the results demonstrate their validity and effectiveness. This proposal aims to improve the detection accuracy of cyber-attacks and points towards a new direction of feature extraction for complex attacks.
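The flow aggregation idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity criteria or aggregate features the authors use, so everything specific here is an assumption made for illustration: the grouping key (source IP, destination IP, destination port, and a fixed time bucket), the field names of each flow record, and the three summary features. The function name `aggregate_flows` is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def aggregate_flows(flows, window=60.0):
    """Group flows that share endpoints within a time window and summarise each group.

    Assumed (hypothetical) flow record fields: src_ip, dst_ip, dst_port,
    start (seconds), duration (seconds), bytes.
    """
    groups = defaultdict(list)
    for flow in flows:
        # Flows starting in the same fixed-size window fall into the same bucket.
        bucket = int(flow["start"] // window)
        key = (flow["src_ip"], flow["dst_ip"], flow["dst_port"], bucket)
        groups[key].append(flow)

    # Cumulative features computed over each group rather than per flow.
    features = {}
    for key, members in groups.items():
        features[key] = {
            "flow_count": len(members),
            "total_bytes": sum(m["bytes"] for m in members),
            "mean_duration": mean(m["duration"] for m in members),
        }
    return features
```

A per-flow classifier sees each record in isolation, whereas group-level values such as `flow_count` summarise behaviour accumulated across many flows. This is one plausible way for cumulative information to help separate attacks whose individual flows resemble benign traffic. The resulting feature dictionaries could be joined back onto the per-flow features before training.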