As Android malware grows and evolves, deep learning has been introduced into malware detection with considerable success. Recent work considers hybrid models and multi-view learning. However, these approaches use only simple features, which limits their accuracy in practice. In this paper, we propose DeepCatra, a multi-view learning approach for Android malware detection whose model consists of two subnets: a bidirectional LSTM (BiLSTM) and a graph neural network (GNN). Both subnets rely on features extracted from statically computed call traces that lead to critical APIs, which are derived from public vulnerabilities. For each Android app, DeepCatra first constructs the call graph and computes the call traces that reach critical APIs. Then, the temporal opcode features used by the BiLSTM subnet are extracted from the call traces, while the flow graph features used by the GNN subnet are constructed from all the call traces and inter-component communications. We evaluate the effectiveness of DeepCatra by comparing it with several state-of-the-art detection approaches. Experimental results on over 18,000 real-world apps and prevalent malware show that DeepCatra achieves considerable improvement, e.g., 2.7% to 14.6% in F1-measure, which demonstrates the feasibility of DeepCatra in practice.
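The trace-computation step described above can be illustrated with a minimal sketch. It assumes the call graph is already available as an adjacency map and enumerates simple call paths from entry points to critical APIs with a depth-first search. The function name `traces_to_critical_apis`, the depth bound `max_depth`, the simple-path restriction, and the method names in the example graph are all hypothetical choices made for illustration; this is not DeepCatra's actual algorithm or API list.

```python
from typing import Dict, List, Set

# Caller -> list of callees. A real call graph would come from a static
# analysis tool; here it is a plain dictionary (an assumption).
CallGraph = Dict[str, List[str]]


def traces_to_critical_apis(
    graph: CallGraph,
    entry_points: List[str],
    critical_apis: Set[str],
    max_depth: int = 10,  # hypothetical bound to keep enumeration finite
) -> List[List[str]]:
    """Enumerate simple call paths from entry points to critical APIs."""
    traces: List[List[str]] = []

    def dfs(node: str, path: List[str]) -> None:
        if node in critical_apis:
            traces.append(path)  # the trace ends at the first critical API
            return
        if len(path) >= max_depth:
            return
        for callee in graph.get(node, []):
            if callee not in path:  # skip recursion cycles
                dfs(callee, path + [callee])

    for entry in entry_points:
        dfs(entry, [entry])
    return traces


# Toy call graph with made-up method names.
graph: CallGraph = {
    "MainActivity.onCreate": ["Helper.collect", "Ui.render"],
    "Helper.collect": ["TelephonyManager.getDeviceId"],
    "Ui.render": ["Ui.draw"],
    "SmsService.onStart": ["Helper.collect", "SmsManager.sendTextMessage"],
}
critical = {"TelephonyManager.getDeviceId", "SmsManager.sendTextMessage"}
entries = ["MainActivity.onCreate", "SmsService.onStart"]

traces = traces_to_critical_apis(graph, entries, critical)
for trace in traces:
    print(" -> ".join(trace))
```

In a pipeline of this kind, each trace would then be mapped to the opcode sequence of the methods along it for the sequence model, while the union of all traces would form the graph view. Both of those steps are outside this sketch.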