Self-Supervised Vision Transformers for Malware Detection

Malware detection plays a crucial role in cyber-security as malware proliferates and cyber-attacks grow more advanced. These attacks often use previously unseen malware that security vendors have not yet identified, so it is becoming essential to find a solution that can self-learn from unlabeled sample data. This paper presents SHERLOCK, a self-supervised deep learning model for malware detection based on the Vision Transformer (ViT) architecture. SHERLOCK is a novel malware detection method that uses an image-based representation of binaries to learn unique features that differentiate malware from benign programs. Experimental results on 1.2 million Android applications, spanning a hierarchy of 47 types and 696 families, show that self-supervised learning can achieve an accuracy of 97% for binary classification of malware, which is higher than existing state-of-the-art techniques. The proposed model also outperforms state-of-the-art techniques for multi-class malware classification by type and by family, with macro-F1 scores of 0.497 and 0.491, respectively.
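The abstract does not say how SHERLOCK builds its image-based binary representation. As a rough illustration of the general idea only, the sketch below maps each byte of a file to one grayscale pixel and then cuts the image into the fixed-size patches a ViT-style model consumes. The function names, the image width, and the patch size are all hypothetical choices made for this example, not details taken from the paper.

```python
from typing import List


def bytes_to_grayscale(data: bytes, width: int = 16) -> List[List[int]]:
    """Map each byte (0-255) to one grayscale pixel, filling rows left to right.

    The last row is zero-padded so that every row has exactly `width` pixels.
    (Hypothetical helper; the paper's actual conversion may differ.)
    """
    if width <= 0:
        raise ValueError("width must be positive")
    padding = (-len(data)) % width
    padded = data + bytes(padding)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def split_into_patches(image: List[List[int]], patch: int) -> List[List[int]]:
    """Cut the image into non-overlapping patch x patch tiles, each flattened.

    Rows and columns that do not fill a complete tile are dropped.
    """
    if patch <= 0:
        raise ValueError("patch must be positive")
    rows = len(image) - len(image) % patch
    cols = (len(image[0]) - len(image[0]) % patch) if image else 0
    patches = []
    for r in range(0, rows, patch):
        for c in range(0, cols, patch):
            patches.append(
                [image[r + dr][c + dc] for dr in range(patch) for dc in range(patch)]
            )
    return patches


if __name__ == "__main__":
    sample = bytes(range(32))  # stand-in for the raw bytes of an application file
    image = bytes_to_grayscale(sample, width=8)
    patches = split_into_patches(image, patch=4)
    print(len(image), len(image[0]), len(patches))
```

In a real pipeline the flattened patches would be linearly projected into token embeddings before entering the transformer; that step, and the self-supervised pretraining objective, are outside the scope of this sketch.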
