Understanding the generalization of deep neural networks is one of the most important tasks in deep learning. Although much progress has been made, theoretical error bounds still often behave disparately from empirical observations. In this work, we develop margin-based generalization bounds, where the margins are normalized with optimal transport costs between independent random subsets sampled from the training distribution. In particular, the optimal transport cost can be interpreted as a generalization of variance which captures the structural properties of the learned feature space. Our bounds robustly predict the generalization error, given training data and network parameters, on large-scale datasets. Theoretically, we demonstrate that the concentration and separation of features play crucial roles in generalization, supporting empirical results in the literature. The code is available at \url{https://github.com/chingyaoc/kV-Margin}.
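As a rough illustration of the quantity described above, the sketch below estimates an optimal transport cost between two independent random subsets of learned features and uses it to normalize classification margins. This is a minimal sketch under simplifying assumptions, not the released kV-Margin implementation: the function names (`subset_transport_cost`, `normalized_margins`), the subset size `k`, and the Euclidean ground cost are illustrative choices, and the transport cost is computed exactly via a linear assignment, which is valid for equal-size subsets with uniform weights.

```python
# Hypothetical sketch (not the authors' released code): normalize margins by an
# optimal transport cost between two independent random subsets of features.
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist


def subset_transport_cost(features, k, rng):
    """Optimal transport cost between two disjoint random subsets of size k.

    With uniform weights and equal subset sizes, optimal transport reduces
    to a linear assignment problem over the pairwise distance matrix.
    """
    idx = rng.choice(len(features), size=2 * k, replace=False)
    a, b = features[idx[:k]], features[idx[k:]]
    cost = cdist(a, b)                        # pairwise Euclidean distances
    rows, cols = linear_sum_assignment(cost)  # optimal one-to-one matching
    return cost[rows, cols].mean()


def normalized_margins(logits, labels, features, k=100, seed=0):
    """Margins f(x)_y - max_{y' != y} f(x)_{y'}, divided by the transport cost."""
    rng = np.random.default_rng(seed)
    true_score = logits[np.arange(len(labels)), labels]
    # Mask out the true class before taking the runner-up score.
    mask = np.eye(logits.shape[1], dtype=bool)[labels]
    runner_up = np.where(mask, -np.inf, logits).max(axis=1)
    margins = true_score - runner_up
    return margins / subset_transport_cost(features, k, rng)
```

For instance, given logits of shape (n, C), integer labels of shape (n,), and penultimate-layer features of shape (n, d), `normalized_margins` returns one normalized margin per training example; the resulting margin distribution is the kind of data- and parameter-dependent statistic the bounds use to predict the generalization error.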