Training pipelines for machine learning (ML)-based malware classification often rely on crowdsourced threat feeds, exposing a natural attack injection point. In this paper, we study the susceptibility of feature-based ML malware classifiers to backdoor poisoning attacks, specifically focusing on challenging "clean-label" attacks where attackers do not control the sample labeling process. We propose the use of techniques from explainable machine learning to guide the selection of relevant features and values to create effective backdoor triggers in a model-agnostic fashion. Using multiple reference datasets for malware classification, including Windows PE files, PDFs, and Android applications, we demonstrate effective attacks against a diverse set of machine learning models and evaluate the effect of various constraints imposed on the attacker. To demonstrate the feasibility of our backdoor attacks in practice, we create a watermarking utility for Windows PE files that preserves the binary's functionality, and we leverage similar behavior-preserving alteration methodologies for Android and PDF files. Finally, we experiment with potential defensive strategies and show the difficulty of fully defending against these attacks, especially when the trigger blends in with the legitimate sample distribution.
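The explanation-guided trigger construction described above can be illustrated with a minimal sketch. This is not the paper's implementation: it uses absolute feature-label correlation on synthetic tabular data as a stand-in for the explainability-based attribution scores, and the trigger values are set to medians of the benign class so the watermark blends into the legitimate (goodware) distribution, matching the clean-label setting.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy tabular "malware" dataset: 6 features, label 1 = malicious.
# Features 0 and 1 are informative; the rest are noise.
n = 400
y = rng.integers(0, 2, n)
X = rng.normal(size=(n, 6))
X[:, 0] += 2.0 * y
X[:, 1] -= 1.5 * y

def feature_importance(X, y):
    # Stand-in for an explanation-based attribution (e.g. SHAP-style
    # global importance): absolute correlation of each feature with
    # the label. Higher score = more influential feature.
    return np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])

def build_trigger(X, y, k=2):
    # Select the k most important features and assign each the median
    # value observed in the benign class, so the trigger lies inside
    # the legitimate sample distribution (clean-label constraint).
    idx = np.argsort(feature_importance(X, y))[::-1][:k]
    vals = np.median(X[y == 0][:, idx], axis=0)
    return idx, vals

def poison(X, idx, vals):
    # Stamp the watermark onto a batch of (benign-labeled) samples.
    Xp = X.copy()
    Xp[:, idx] = vals
    return Xp

idx, vals = build_trigger(X, y)
poisoned = poison(X[y == 0][:10], idx, vals)
```

In the full attack, the poisoned benign samples are contributed to the crowdsourced training feed with their correct (benign) labels; a model trained on them learns to associate the watermark with the benign class, so a malicious file carrying the same trigger evades detection at test time.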