In supervised learning, it has been shown that label noise in the training data can be interpolated without penalty to test accuracy. We show that interpolating label noise induces adversarial vulnerability, and we prove the first theorem relating label noise and adversarial risk for any data distribution. Our results are almost tight if one makes no assumptions on the inductive bias of the learning algorithm. We then investigate how different components of this problem, including properties of the data distribution, affect this result. We also discuss non-uniform label noise distributions, and we prove a new theorem showing that uniform label noise induces an adversarial risk nearly as large as that of the worst-case poisoning with the same noise rate. We then provide theoretical and empirical evidence that uniform label noise is more harmful than typical real-world label noise. Finally, we show how inductive biases amplify the effect of label noise and argue the need for future work in this direction.
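The core claim admits a simple illustration: any classifier that interpolates noisy labels must output the flipped (wrong) label exactly at each mislabelled training point, so every clean test point within distance eps of such a point with the opposite label can be attacked with a perturbation of size at most eps. The following minimal numpy sketch estimates this lower bound on adversarial risk; the 2-D toy distribution, noise rate `eta`, and budget `eps` are illustrative choices of ours, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_test, d, eta, eps = 2000, 5000, 2, 0.1, 0.1

# Toy distribution on [-1, 1]^2: the true label is the sign of the first
# coordinate. Flip each training label independently with probability eta
# (uniform label noise).
X = rng.uniform(-1.0, 1.0, size=(n, d))
y = (X[:, 0] > 0).astype(int)
flip = rng.random(n) < eta
y_noisy = np.where(flip, 1 - y, y)

# An interpolating classifier predicts the flipped label at each mislabelled
# training point, so a clean test point is attackable within budget eps
# whenever some mislabelled point within distance eps carries the opposite
# label. Count the fraction of test points for which this holds.
X_test = rng.uniform(-1.0, 1.0, size=(n_test, d))
y_test = (X_test[:, 0] > 0).astype(int)

X_bad, y_bad = X[flip], y_noisy[flip]          # mislabelled training points
dists = np.linalg.norm(X_test[:, None, :] - X_bad[None, :, :], axis=-1)
attackable = ((dists <= eps) & (y_bad[None, :] != y_test[:, None])).any(axis=1)

print(f"noise rate eta = {eta}, perturbation budget eps = {eps}")
print(f"adversarial-risk lower bound for any interpolator: {attackable.mean():.3f}")
```

This bound holds for every interpolating learner regardless of its inductive bias, which is the distribution-free flavor of the theorem stated above.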