Deep neural networks usually require large labeled datasets to achieve state-of-the-art performance in tasks such as image classification and natural language processing. Although active Internet users create vast amounts of data every day, most of it is unlabeled and vulnerable to data poisoning attacks. In this paper, we develop an efficient active learning method that requires fewer labeled instances and incorporates adversarial retraining, in which additional labeled artificial data are generated without increasing the labeling budget. The generated adversarial examples also provide a way to measure the vulnerability of the model. To evaluate the proposed method under an adversarial setting, i.e., malicious mislabeling and data poisoning attacks, we perform an extensive evaluation on a reduced CIFAR-10 dataset containing only two classes: airplane and frog. Our experimental results demonstrate that the proposed active learning method is effective in defending against malicious mislabeling and data poisoning attacks. Specifically, whereas a baseline active learning method based on a random sampling strategy performs poorly (about 50% accuracy) under a malicious mislabeling attack, the proposed method achieves the desired accuracy of 89% using, on average, only one-third of the dataset.