We study the problem of robust learning under clean-label data-poisoning attacks, where the attacker injects (an arbitrary set of) correctly-labeled examples into the training set to fool the algorithm into making mistakes on specific test instances. The learning goal is to minimize the attackable rate (the probability mass of attackable test instances), which is more difficult than optimal PAC learning. As we show, any robust algorithm with diminishing attackable rate can achieve the optimal dependence on $\epsilon$ in its PAC sample complexity, i.e., $O(1/\epsilon)$. On the other hand, the attackable rate might be large even for some optimal PAC learners, e.g., SVM for linear classifiers. Furthermore, we show that the class of linear hypotheses is not robustly learnable when the data distribution has zero margin, and is robustly learnable in the case of positive margin but requires sample complexity exponential in the dimension. For a general hypothesis class with bounded VC dimension, if the attacker is limited to adding at most $t>0$ poison examples, the optimal robust learning sample complexity grows almost linearly with $t$.
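The vulnerability of max-margin learners to clean-label attacks can be illustrated with a toy one-dimensional sketch (not from the paper; the learner, data values, and function names below are illustrative assumptions). A hard-margin SVM in one dimension places its threshold at the midpoint between the closest oppositely-labeled points, so a single correctly-labeled point injected near the boundary can shift the threshold enough to misclassify a specific test instance:

```python
import numpy as np

def max_margin_threshold(xs, ys):
    """1-D analogue of a hard-margin linear SVM: put the decision
    threshold at the midpoint between the rightmost negative and the
    leftmost positive training point."""
    neg = xs[ys == -1]
    pos = xs[ys == +1]
    return (neg.max() + pos.min()) / 2

def predict(threshold, x):
    """Classify x by which side of the learned threshold it lies on."""
    return +1 if x >= threshold else -1

# Clean training data: the true label of x is sign(x), with margin 1.
xs = np.array([-1.0, 1.0])
ys = np.array([-1, +1])
thr = max_margin_threshold(xs, ys)        # midpoint = 0.0
x_test = -0.4                             # true label: -1
assert predict(thr, x_test) == -1         # correct before the attack

# Clean-label poisoning: inject a *correctly labeled* positive point
# close to the boundary; the max-margin threshold shifts left.
xs_p = np.append(xs, 0.01)
ys_p = np.append(ys, +1)
thr_p = max_margin_threshold(xs_p, ys_p)  # midpoint shifts to about -0.495
assert predict(thr_p, x_test) == +1       # the test point is now misclassified
```

The injected point is consistent with the true labeling, yet the learner's mistake on `x_test` was induced entirely by the attacker, which is the sense in which `x_test` is an attackable test instance.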