For linear classifiers, the relationship between (normalized) output margin and generalization is captured in a clear and simple bound -- a large output margin implies good generalization. Unfortunately, for deep models, this relationship is less clear: existing analyses of the output margin give complicated bounds which sometimes depend exponentially on depth. In this work, we propose to instead analyze a new notion of margin, which we call the "all-layer margin." Our analysis reveals that the all-layer margin has a clear and direct relationship with generalization for deep models. This enables the following concrete applications of the all-layer margin: 1) by analyzing the all-layer margin, we obtain tighter generalization bounds for neural nets which depend on Jacobian and hidden layer norms and remove the exponential dependency on depth, 2) our neural net results easily translate to the adversarially robust setting, giving the first direct analysis of robust test error for deep networks, and 3) we present a theoretically inspired training algorithm for increasing the all-layer margin. Our algorithm improves both clean and adversarially robust test performance over strong baselines in practice.
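As an informal point of reference, here is a minimal, un-normalized sketch of the quantity the abstract refers to (the paper's formal definition additionally normalizes the per-layer perturbations, which is omitted here). For a network $F = f_r \circ \cdots \circ f_1$, the all-layer margin at an example $(x, y)$ can be thought of as the smallest simultaneous perturbation of every layer's output that changes the prediction:

$$
m_F(x, y) \;=\; \min_{\delta_1, \ldots, \delta_r} \Big( \sum_{i=1}^{r} \|\delta_i\|_2^2 \Big)^{1/2}
\quad \text{subject to} \quad \arg\max_{y'} F(x; \delta_1, \ldots, \delta_r)_{y'} \neq y,
$$

where $F(x; \delta_1, \ldots, \delta_r)$ denotes the forward pass with $\delta_i$ added to the output of layer $i$. Intuitively, the standard output margin corresponds to perturbing only the final output, whereas the all-layer margin accounts for perturbations at every layer simultaneously, which is why it can track generalization for deep models more directly.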