Android Malware Detection using Large-scale Network Representation Learning

With the growth of mobile devices and applications, the amount of malicious software (malware) has increased rapidly in recent years, which calls for advanced and effective malware detection approaches. Traditional methods, such as signature-based detection, cannot protect users against the growing number of new malware types or against rapid changes in malware behavior. In this paper, we propose a new Android malware detection approach based on deep learning and static analysis. Instead of relying on Application Programming Interface (API) calls alone, we further analyze the source code of Android applications and derive higher-level graph semantics from it, which makes detection harder for attackers to evade. In particular, we represent each application by a call graph built from its method invocations, and we further analyze method attributes to form a structured Program Representation Graph (PRG) with node attributes. We then use a graph convolutional network (GCN) to embed the entire graph into a dense vector, which serves as the graph representation of the application, and classify the application as malware or benign. To train the GCN efficiently, we propose a batch training scheme that allows multiple heterogeneous graphs to be input as a single batch. To the best of our knowledge, this is the first work to use graph representation learning for malware detection. We conduct extensive experiments on real-world sample collections and demonstrate that our system outperforms multiple existing malware detection techniques.
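The pipeline described above (attributed call graph, graph convolution, whole-graph embedding, batching of several graphs) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. It makes several assumptions that the abstract does not state: call edges are treated as undirected, the convolution uses simple mean aggregation with self-loops, the readout is mean pooling, and batching is done by merging graphs into one disjoint union. All function names, the toy graphs, and the weight matrix are hypothetical.

```python
from typing import List, Tuple

# A graph is (edges, node_features); node i has feature vector node_features[i].
Graph = Tuple[List[Tuple[int, int]], List[List[float]]]


def batch_graphs(graphs: List[Graph]):
    """Merge graphs of different sizes into one disjoint-union graph.

    Node indices of each graph are shifted by an offset so that no edge
    crosses between graphs. graph_ids records which graph each node came from.
    """
    edges, feats, graph_ids = [], [], []
    offset = 0
    for gid, (g_edges, g_feats) in enumerate(graphs):
        edges.extend((u + offset, v + offset) for u, v in g_edges)
        feats.extend(g_feats)
        graph_ids.extend([gid] * len(g_feats))
        offset += len(g_feats)
    return edges, feats, graph_ids


def gcn_layer(edges, feats, weight):
    """One simplified graph-convolution step (assumption: mean aggregation).

    Each node averages its own features with those of its neighbours,
    then applies a linear map followed by ReLU.
    """
    n, dim_in, dim_out = len(feats), len(feats[0]), len(weight[0])
    neigh = [{i} for i in range(n)]  # self-loops
    for u, v in edges:
        neigh[u].add(v)
        neigh[v].add(u)  # assumption: call edges treated as undirected
    out = []
    for i in range(n):
        agg = [sum(feats[j][k] for j in neigh[i]) / len(neigh[i])
               for k in range(dim_in)]
        out.append([max(0.0, sum(agg[k] * weight[k][c] for k in range(dim_in)))
                    for c in range(dim_out)])
    return out


def readout(node_vecs, graph_ids, num_graphs):
    """Mean-pool node vectors per graph (assumes every graph has a node)."""
    dim = len(node_vecs[0])
    sums = [[0.0] * dim for _ in range(num_graphs)]
    counts = [0] * num_graphs
    for vec, gid in zip(node_vecs, graph_ids):
        counts[gid] += 1
        for k in range(dim):
            sums[gid][k] += vec[k]
    return [[s / counts[g] for s in sums[g]] for g in range(num_graphs)]


def embed(graphs: List[Graph], weight):
    """Embed a batch of graphs: one dense vector per graph."""
    edges, feats, graph_ids = batch_graphs(graphs)
    hidden = gcn_layer(edges, feats, weight)
    return readout(hidden, graph_ids, len(graphs))


# Two toy "call graphs" of different sizes with 2-dimensional node attributes.
graph_a: Graph = ([(0, 1), (1, 2)], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
graph_b: Graph = ([(0, 1)], [[2.0, 0.0], [0.0, 2.0]])
weight = [[1.0, -1.0], [0.5, 1.0]]  # hypothetical, untrained 2x2 weights

embeddings = embed([graph_a, graph_b], weight)
```

Because the merged graph has no edges between its components, embedding the batch gives the same per-graph vectors as embedding each graph separately, which is what makes this style of batching safe. In a full system, each pooled vector would be passed to a classifier that labels the application as malware or benign.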
