Managing the threat posed by malware requires accurate detection and classification techniques. Traditional detection strategies, such as signature scanning, rely on manual analysis of malware to extract relevant features, which is labor-intensive and requires expert knowledge. A function call graph consists of a program's functions and the inter-procedural calls between them, providing a rich source of information that can be leveraged to classify malware without the labor-intensive feature extraction step of traditional techniques. In this research, we treat malware classification as a graph classification problem. Using Local Degree Profile features, we train a wide range of Graph Neural Network (GNN) architectures to generate embeddings, which we then classify. We find that our best GNN models outperform previously reported comparable results on the well-known MalNet-Tiny Android malware dataset. In addition, our GNN models do not suffer from the overfitting issues that commonly afflict non-GNN techniques, although they require longer training times.
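As an illustration of the Local Degree Profile features mentioned above, the following is a minimal sketch, not the implementation used in this research. It assumes the commonly used five-value form of the profile (a node's degree, plus the minimum, maximum, mean, and standard deviation of its neighbors' degrees), treats the function call graph as undirected, and uses the population standard deviation. The function name `local_degree_profile` and the edge-list input format are hypothetical choices made for this example.

```python
from statistics import mean, pstdev


def local_degree_profile(num_nodes, edges):
    """Return one 5-value feature vector per node.

    Assumed layout (not confirmed by the source):
    [degree, min, max, mean, std] where the last four are taken
    over the degrees of the node's neighbors.
    """
    # Build undirected adjacency sets; self-loops and duplicate
    # edges are ignored so each neighbor counts once.
    neighbors = [set() for _ in range(num_nodes)]
    for u, v in edges:
        if u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)

    degree = [len(n) for n in neighbors]

    features = []
    for node in range(num_nodes):
        nbr_deg = [degree[n] for n in neighbors[node]]
        if nbr_deg:
            stats = [min(nbr_deg), max(nbr_deg), mean(nbr_deg), pstdev(nbr_deg)]
        else:
            # Isolated node: no neighbors, so all statistics default to zero.
            stats = [0, 0, 0, 0]
        features.append([float(degree[node])] + [float(s) for s in stats])
    return features


# Tiny hypothetical call graph: function 0 calls 1 and 2, function 2 calls 3.
example = local_degree_profile(4, [(0, 1), (0, 2), (2, 3)])
```

Each row of `example` could then serve as the initial feature vector of the corresponding node when the graph is passed to a GNN. Because the features depend only on graph structure, no manual inspection of the function bodies is needed.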