Neural networks work remarkably well in practice and, theoretically, they can be universal approximators. However, they still make mistakes, and a specific type of mistake, the so-called adversarial error, seems inexcusable to humans. In this work, we analyze both test errors and adversarial errors on a well-controlled but highly non-linear visual classification problem. We find that, when we approximate training on infinite data, test errors tend to lie close to the ground-truth decision boundary; qualitatively, these examples are also more difficult for a human. By contrast, adversarial examples can be found almost everywhere and are often obvious mistakes. However, when we constrain adversarial examples to the manifold, we observe a 90\% reduction in adversarial errors. If we instead inflate the manifold by training with Gaussian noise, we observe a similar effect. In both cases, the remaining adversarial errors tend to lie close to the ground-truth decision boundary. Qualitatively, the remaining adversarial errors resemble test errors on difficult examples; they no longer have the customary quality of being inexcusable mistakes.
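The "inflating the manifold" intervention mentioned above amounts to training on inputs perturbed with additive Gaussian noise. The sketch below illustrates that kind of noise-augmented training loop under placeholder assumptions; the model, data, and noise scale `sigma` are hypothetical stand-ins and not the authors' setup.

```python
# Minimal sketch (not the authors' code): training a classifier on inputs
# perturbed with additive Gaussian noise, i.e. noise-augmented training.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the paper's classifier and data.
model = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
sigma = 0.1  # assumed noise scale, not taken from the paper

for step in range(1000):
    x = torch.randn(128, 2)                    # placeholder inputs
    y = (x[:, 0] * x[:, 1] > 0).long()         # placeholder non-linear labels
    x_noisy = x + sigma * torch.randn_like(x)  # Gaussian-noise augmentation
    loss = loss_fn(model(x_noisy), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```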