Despite a great deal of research, it is still unclear why neural networks are so susceptible to adversarial examples. In this work, we identify natural settings where depth-$2$ ReLU networks trained with gradient flow are provably non-robust (susceptible to small adversarial $\ell_2$-perturbations), even when robust networks that classify the training dataset correctly exist. Perhaps surprisingly, we show that the well-known implicit bias towards margin maximization induces bias towards non-robust networks, by proving that every network which satisfies the KKT conditions of the max-margin problem is non-robust.
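To make the claim concrete, the margin-maximization problem referenced above can be written in its standard form for a network $N_\theta$ with parameters $\theta$ and a binary-labeled dataset $\{(x_i, y_i)\}_{i=1}^n$; this is a hedged sketch of the usual formulation (the paper's exact normalization may differ):

```latex
% Max-margin problem (KKT points of which the abstract shows are non-robust):
%   minimize the parameter norm subject to all points being classified
%   with margin at least 1.
\min_{\theta} \; \tfrac{1}{2}\|\theta\|_2^2
\quad \text{s.t.} \quad y_i \, N_\theta(x_i) \ge 1 \;\; \forall i \in [n].

% Non-robustness at a point x with label y: a small perturbation flips the sign.
\exists\, \delta \;\; \text{with} \;\; \|\delta\|_2 \le \epsilon
\quad \text{and} \quad \operatorname{sign}\!\big(N_\theta(x + \delta)\big) \ne y.
```

Here $\epsilon$ denotes the (small) adversarial perturbation budget; the abstract's result says every $\theta$ satisfying the KKT conditions of the first problem admits such a $\delta$ in the identified settings.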