Continuous Learning for Android Malware Detection

Machine learning methods can detect Android malware with very high accuracy. However, these classifiers have an Achilles heel, concept drift: they rapidly become out of date and ineffective as both malware and benign apps evolve. We find that, after training an Android malware classifier on one year's worth of data, the F1 score dropped from 0.99 to 0.76 after 6 months of deployment on new test samples. In this paper, we propose new methods to combat concept drift in Android malware classifiers. Since a machine learning classifier needs to be continuously deployed, we use active learning: we select new samples for analysts to label, and then add the labeled samples to the training set to retrain the classifier. Our key idea is that similarity-based uncertainty is more robust against concept drift. Therefore, we combine contrastive learning with active learning. We propose a new hierarchical contrastive learning scheme and a new sample selection technique to continuously train the Android malware classifier. Our evaluation shows that this leads to significant improvements over previously published methods for active learning. Our approach reduces the false negative rate from 16% (for the best baseline) to 10%, while maintaining the same false positive rate (0.6%). Our approach also maintains more consistent performance across a seven-year time period than past methods.
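The active learning loop described in the abstract (select samples, have analysts label them, add them to the training set) can be illustrated with a minimal sketch. This is not the paper's hierarchical contrastive scheme or its sample selector. It assumes samples are already embedded as numeric vectors and uses a hypothetical uncertainty score, the distance to the nearest labeled embedding, as a stand-in for similarity-based uncertainty. All function names, and the `oracle` callback standing in for the analyst, are assumptions made for illustration. Retraining the classifier on the enlarged training set is left out.

```python
import math

def distance(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def uncertainty(sample, labeled):
    """Hypothetical similarity-based score: distance to the nearest labeled embedding.

    A sample far from everything already labeled is treated as uncertain.
    """
    return min(distance(sample, emb) for emb, _ in labeled)

def select_for_labeling(unlabeled, labeled, budget):
    """Return indices of the `budget` most uncertain unlabeled samples."""
    ranked = sorted(
        range(len(unlabeled)),
        key=lambda i: uncertainty(unlabeled[i], labeled),
        reverse=True,
    )
    return ranked[:budget]

def active_learning_round(unlabeled, labeled, budget, oracle):
    """One round: pick samples, ask the analyst (oracle) for labels, grow the training set."""
    chosen = select_for_labeling(unlabeled, labeled, budget)
    chosen_set = set(chosen)
    new_labeled = labeled + [(unlabeled[i], oracle(unlabeled[i])) for i in chosen]
    remaining = [s for i, s in enumerate(unlabeled) if i not in chosen_set]
    return new_labeled, remaining

# Toy data: label 0 = benign, 1 = malware (made-up embeddings).
labeled = [((0.0, 0.0), 0), ((1.0, 1.0), 1)]
unlabeled = [(0.1, 0.0), (5.0, 5.0), (0.9, 1.0)]
labeled, unlabeled = active_learning_round(
    unlabeled, labeled, budget=1, oracle=lambda s: 1
)
```

In this toy run the sample at `(5.0, 5.0)` is chosen because it lies farthest from any labeled embedding. In a real deployment the labeling budget per round and the embedding model would both need to be chosen for the data at hand.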

