Adversarially trained models exhibit a large generalization gap: they can interpolate the training set even for large perturbation radii, but at the cost of large test error on clean samples. To investigate this gap, we decompose the test risk into its bias and variance components. We find that the bias increases monotonically with perturbation size and is the dominant term in the risk. Meanwhile, the variance is unimodal, peaking near the interpolation threshold for the training set. In contrast, we show that popular explanations for the generalization gap instead predict the variance to be monotonic, which leaves an unresolved mystery. We show that the same unimodal variance appears in a simple high-dimensional logistic regression problem, as well as for randomized smoothing. Overall, our results highlight the power of bias-variance decompositions in modern settings: by providing two measurements instead of one, they can rule out some theories and clarify others.
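The decomposition referenced above can be made concrete with a small simulation. The following is a minimal, hypothetical sketch (not the paper's experimental protocol): it estimates the squared-loss bias and variance of a predictor by retraining on independently sampled training sets. The target function, estimator, and all parameter choices here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_f(x):
    # Illustrative ground-truth regression function.
    return np.sin(x)

def bias_variance(degree, n_trials=200, n_train=30, noise=0.3):
    """Estimate squared bias and variance of a degree-`degree`
    polynomial least-squares fit, averaged over test points."""
    x_test = np.linspace(0.0, np.pi, 50)
    preds = np.empty((n_trials, x_test.size))
    for t in range(n_trials):
        # Draw an independent training set each trial.
        x = rng.uniform(0.0, np.pi, n_train)
        y = true_f(x) + rng.normal(0.0, noise, n_train)
        coefs = np.polyfit(x, y, degree)
        preds[t] = np.polyval(coefs, x_test)
    mean_pred = preds.mean(axis=0)
    bias2 = np.mean((mean_pred - true_f(x_test)) ** 2)
    variance = np.mean(preds.var(axis=0))
    return bias2, variance

for d in (1, 3, 9):
    b2, v = bias_variance(d)
    print(f"degree={d}: bias^2={b2:.4f}, variance={v:.4f}")
```

As model capacity grows, the estimated squared bias falls while the variance rises, which is the kind of two-component measurement the abstract argues for: a single risk number cannot distinguish these two regimes.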